Abstract
Purpose
Quantitative quality control procedures were sought to evaluate technical variability in multi-center measurements of the diffusion coefficient of water as a prerequisite to use of the biomarker apparent diffusion coefficient (ADC) in multi-center clinical trials.
Materials and Methods
A uniform data acquisition protocol was developed and shared with 18 participating test sites along with a temperature-controlled diffusion phantom delivered to each site. Usable diffusion weighted imaging data of ice water at 5 b-values were collected on 35 clinical MRI systems from 3 vendors at 2 field strengths (1.5 and 3T) and analyzed at a central processing site.
Results
Standard deviation of bore-center ADCs measured across 35 scanners was <2%; error range: −2% to +5% from literature value. Day-to-day repeatability of the measurements was within 4.5%. Intra-exam repeatability at the phantom center was within 1%. Excluding one outlier, inter-site reproducibility of ADC at magnet isocenter was within 3%, though variability increased for off-center measurements. Significant (>10%) vendor-specific and system-specific spatial non-uniformity ADC bias was detected for the off-center measurement that was consistent with gradient non-linearity.
Conclusion
Standardization of DWI protocol has improved reproducibility of ADC measurements and allowed identifying spatial ADC non-uniformity as a source of error in multi-site clinical studies.
Keywords: diffusion, MRI, phantom, ice-water, quality control, gradient non-linearity
INTRODUCTION
Diffusion weighted imaging (DWI) is widely used in magnetic resonance (MR) examinations for diagnostics (1–4) and is considered a potential biomarker for therapy response assessment. The desire to utilize apparent diffusion coefficient (ADC) as a viable quantitative biomarker for cancer diagnosis and treatment response monitoring (4) requires determination of measurement uncertainty and systematic dependencies. The complexity of water diffusion measurement in living systems containing anisotropic tissues, perfusion and other sources of motion (5–8), combined with limited instrumental sensitivity (9–12), confounds quantitative ADC measurements. Furthermore, synchronization and standardization of DWI acquisition techniques across multiple MRI platforms (4, 8, 9, 13) are necessary for multicenter clinical trials.
This study focused on MRI system-related factors affecting ADC measurements using a standardized acquisition protocol and a single known diffusing medium as an estimate of baseline repeatability and reproducibility. To detect clinically-significant changes in diffusion measurements, the sources of technical variability have to be characterized and controlled relative to anticipated biologic/therapeutic diffusion changes (7, 11, 12, 14). Since molecular mobility is dependent on temperature, knowledge and control of temperature of the diffusing media are essential in the course of measurements used for technical quality evaluation. It was recently shown (15) that water maintained in an ice-water bath can serve as a universal temperature-controlled fluid in ADC phantoms to test instrumental variability. However, in the previously described study (15), the DWI acquisition protocol was somewhat variable and defined by local site preferences. The impact of protocol variability on the measurement was unknown. In contrast, this study uses a common acquisition protocol adopted by all sites to achieve an unmixed reproducibility estimate. In addition, a more complex phantom geometry as well as off-center measurements allowed detection and characterization of spatial error contribution. Other compounds (e.g., high concentration sucrose in the present study) can be added to phantoms to provide a range in diffusion properties as desired (16, 17).
The primary objective of this study was to determine the extent of quantitative agreement for ADC values of a known, temperature-controlled fluid measured on multiple platforms, sites, and field-strength MR systems using a common acquisition protocol. This approach characterizes baseline technical performance of a given system for diffusion measurements (12, 14, 15) without interference from complex temperature or concentration dependence (13, 17, 18). The described procedures may be used for site certification/quality control and may serve as a basis for standardization of diffusion imaging protocols across multi-vendor/multi-site for clinical trials that utilize ADC as a potential biomarker.
MATERIALS AND METHODS
Temperature-Controlled Phantom
A variation of the ice-water-based diffusion weighted imaging (DWI) phantom (15) was devised to measure ADC over a clinically-relevant spatial range. As shown in Figure 1, the phantom consisted of an ice water-filled container holding five 29 mm diameter tubes filled with distilled water (4 at phantom corners and 1 in the center) as the diffusion medium standard, as well as 1 tube filled with a sucrose solution (9 g sucrose/30 ml water) to provide some diffusion contrast (see axial MR images of phantom in Figure 1). Since sucrose solution is not a “universal” fluid, it was not used for absolute ADC measurements. The six tubes were filled and sealed at phantom fabrication, whereas the interstitial space was filled with ice water at each site and each use per detailed phantom preparation instructions. The recommended ice fill procedure included a two-step preparation: (step 1) prime the phantom with ice water for 10 minutes for an initial rapid cool-down of the water tubes from room temperature, and (step 2) add ice to replenish the relatively large volume of ice melted during step 1, then let the phantom equilibrate for an hour before scanning. Once filled and equilibrated, the ice-water bath surrounded the measurement tubes with a temperature-controlled, high thermal capacity environment. At thermal equilibrium just above 0°C the diffusion coefficient of water is known to be 1.1×10−3 mm2/s (20). Thermal mass and an additional foam insulation sleeve allowed use of the phantom over several hours (Fig. 1). If temperature was controlled to within 0–0.5°C, the measured diffusion coefficient should be determined within 1–2% of literature values (15, 20, 21). To illustrate temporal stability, the central tube of the ice-water phantom was monitored on one system over the course of 8 hours. The ice melted faster closer to the phantom surface, thus exposing corner tubes to higher temperature gradients than the center tube.
The phantom container was semi-flexible to avoid creating a vacuum as the ice melted. As such, the geometrical aspects of the phantom were not considered crucial, which kept fabrication costs low and allowed production of 25 phantoms for delivery to 18 institutions and three MRI manufacturers.
Figure 1.
Temporal stability (~8 hours) of measured ADC (central tube) of the ice-water phantom in head coil at 3T (b = 1000) is within +/−2.5% (dashed lines) of nominal ADC value at 0°C (Ref.(22), solid line). The inserts on top show axial T2wt MR images corresponding to different ADC measurement time points (as indicated by labels) relative to initial phantom filling with ice. Blue circles in the middle of the center tube mark the locations of the 1 cm ROI from which the ADC measurements were performed. Data (blue squares) are presented as mean and standard deviation (error bars) of ADC measurements across an ROI (typically, ~90 pixels). The vertical dimension of the open-rectangle represents the standard deviation of the mean ADC values measured for four consecutive passes of the same exam (~2 min/pass), while the dashed line inside is the corresponding four-pass average ADC. ADC color-map in the right pane corresponds to the first SOP-recommended measurement point (open-rectangle). The labels on the ADC map indicate the positions of water (W, center and four corners) and sucrose (S, top) tubes.
MR DWI Data Acquisition Protocol
A core DWI data acquisition protocol was designed for compatibility across several clinical MR imaging platforms (GE, Philips, Siemens) and field strengths (1.5 and 3T). Key elements of the core protocol were: single spin-echo, single-shot EPI; b-values = 0, 500, 800, 1000, 2000 s/mm2; DW axes = X, Y, Z in the lab frame; TR = 8000ms; TE = 100±10ms; Acquisition matrix = 128×128; Phase = Anterior/Posterior (A/P), Freq. = Right/Left (R/L); FOV = 240 × 240mm2; 25 slices, 6mm thick, 1mm gap; Bandwidth = 1.5–2.6kHz/pixel; NEX = 1; and no parallel imaging to avoid variability in vendor-specific parallel imaging algorithm implementation. To permit compliance across all systems, the appropriate ranges of protocol parameters were established via communication with clinical scientists from each of the vendors. The listed parameters were available from DICOM headers of submitted series to ensure protocol compliance. Both head and torso phased array coils were used for signal reception. To estimate intra-exam repeatability and confirm thermal equilibrium, sites were instructed to acquire multiple DWI “passes”, where all b-values were acquired in each pass (approximately 2 min/pass). Four passes were first acquired in the head coil. Then the phantom was repositioned in the torso coil for two passes near R/L = 0; two passes offset to the left ≈110 mm; and two passes offset to the right ≈110 mm. Axial measurements along the central tube provided a spatial extent of approximately 140 mm (limited by tube length) in the superior/inferior (S/I) direction.
To independently assess relative gradient amplitude over the range of measured R/L and S/I offsets, local gradient strength was measured using a distortion phantom comprising a 3D array of point signal sources spaced 15mm apart. Standard spin-echo MRI was performed without and with the spatial geometric correction routine normally applied on the scanner. Local distances between adjacent point sources were measured on both uncorrected and geometrically corrected images as proxies for “actual” and “nominal” local gradient strength, respectively. The square of the ratio of these distances was used as a proxy for the ratio [b-actual/b-nominal] for DWI experiments, since b-value scales with the square of gradient amplitude.
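The gradient proxy described above can be illustrated with a minimal sketch. This is not the authors' analysis code: the function name and the example spacings are hypothetical, and the pairing of uncorrected spacing with "actual" gradient and corrected spacing with "nominal" gradient is taken from the description in the text.

```python
def b_ratio_proxy(d_uncorrected_mm: float, d_corrected_mm: float) -> float:
    """Proxy for b_actual / b_nominal from local point-source spacings.

    d_uncorrected_mm: spacing measured without geometric correction
                      (stands in for the "actual" local gradient).
    d_corrected_mm:   spacing measured with geometric correction
                      (stands in for the "nominal" local gradient).
    """
    if d_uncorrected_mm <= 0 or d_corrected_mm <= 0:
        raise ValueError("spacings must be positive")
    gradient_ratio = d_uncorrected_mm / d_corrected_mm
    # b-value scales with the square of gradient amplitude.
    return gradient_ratio ** 2


# Hypothetical example: a 15 mm spacing appears as 14.25 mm when uncorrected,
# i.e. the local gradient is 5% weaker than nominal.
example_ratio = b_ratio_proxy(14.25, 15.0)
print(f"b_actual / b_nominal proxy: {example_ratio:.4f}")
```

A ratio below 1 at a given offset would correspond to an underestimated ADC at that location, and a ratio above 1 to an overestimated ADC.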
System day-to-day repeatability (including phantom preparation) was assessed by comparing complete runs at the center and with right-to-left offset on two different days. Spatial offsets along the center tube, as well as a whole-phantom displacement from the isocenter, were used to sample spatial uniformity of ADC. Sites were instructed to provide a DICOM screenshot of a region-of-interest defined on an ADC map generated using scanner software to corroborate with ADC maps produced by centralized processing.
ADC measurements and data analysis
A single Matlab-based package (MathWorks Inc., Natick MA) was developed for centralized diffusion data analysis to mitigate variability introduced by different software packages. While all images were provided in DICOM format, seven distinct source-dependent image order scenarios were discovered, which were addressed by a custom DICOM import and sort routine. The output of this routine contained relevant system information, acquisition settings, and images in a uniform data structure format regardless of image source. DICOM headers of submitted series from forty scanners were checked for protocol compliance. For each nonzero b-value, apparent diffusion coefficient (ADC) maps were computed by:
$$\mathrm{ADC}_b = \frac{1}{b}\,\ln\!\left(\frac{S_0}{S_b}\right) \qquad [1]$$
where S0 is the b = 0 image and Sb is the isotropically-weighted DWI at the given b-value. That is, multiple ADC maps (i.e. ADC500, ADC800, ADC1000, ADC2000) were derived analytically using Eq.[1] as opposed to a single ADC derived by numerical fitting of multiple b-value DWI. Since the diffusing medium is known to be mono-exponential, the array of ADC values should not exhibit any b-value dependence. ADC maps provided by the sites were used to confirm that the centralized processing of raw DWI-data resulted in the same ADC values, as would be measured by system-specific software. This ensured that no software-dependent bias was introduced by preprocessing steps. Signal-to-noise ratio (SNR) measurements for head coil DWI-images were performed on central-tube ROIs for separate b-values to ensure that detection level was sufficient for unbiased ADC determination. An estimate of noise for SNR measurements was obtained by the temporal standard deviation for each pixel from consecutive passes of the core DWI protocol.
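As a hedged illustration of the two-point calculation, the sketch below assumes the standard form ADC_b = ln(S0/Sb)/b for Eq. [1] and applies it to synthetic mono-exponential signals. The function name, the synthetic S0 value, and the use of 1.1×10−3 mm2/s as the simulated diffusion coefficient are illustrative assumptions, not the study's Matlab implementation.

```python
import math


def adc_two_point(s0: float, sb: float, b: float) -> float:
    """Two-point ADC in mm^2/s for a b-value given in s/mm^2."""
    if b <= 0:
        raise ValueError("b must be a nonzero (positive) b-value")
    if s0 <= 0 or sb <= 0:
        raise ValueError("signals must be positive")
    return math.log(s0 / sb) / b


# Synthetic mono-exponential decay for ice water (assumed D = 1.1e-3 mm^2/s).
D_ICE_WATER = 1.1e-3
B_VALUES = [500, 800, 1000, 2000]  # nonzero b-values of the core protocol
S0 = 1000.0                        # arbitrary b = 0 signal

adc_by_b = {}
for b in B_VALUES:
    sb = S0 * math.exp(-b * D_ICE_WATER)
    adc_by_b[b] = adc_two_point(S0, sb, b)

for b, adc in adc_by_b.items():
    print(f"ADC{b} = {adc:.3e} mm^2/s")
```

Because the simulated medium is mono-exponential, each per-b-value ADC recovers the same coefficient, which is the b-value independence the text expects for ice water.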
In this study, reproducibility (and repeatability) of a given measure is defined as 2 times the standard deviation (STD) of that measure expressed as percentage of the mean. That is, reproducibility across systems represents the range within which 95% of measurements are expected to fall. The fundamental measurement was the mean of each ADC metric within a 10mm diameter circular region-of-interest (ROI) of approximately 90 pixels. ROIs were defined on ADC maps at the center of the middle tube on all usable slices. Typically 20 ROIs were defined along the middle tube (on axial slices center-to-center 7 mm apart) for one phantom position yielding approximately 80 ROIs from all four phantom positions on each exam date. Spatial coordinates of all ROI centers were stored so that subsets of ROIs could be retrospectively graphically selected for analyses detailed below, as well as for automatic application of the ROIs to all ADC maps across all like passes. Additional ROIs were defined in the four corner tubes for estimation of intra-exam repeatability for six representative systems (2 field strengths for each of 3 vendors). Corner tubes were not used for derivation of “baseline” reproducibility statistics for water ADC due to mixed systematic effects of non-linearity bias and different phantom positioning/rotation by participating sites. Likewise, the off-center position of the sucrose tube and its proximity to air resulted in more artifacts, limiting the number of useful (homogeneous) ROIs for ADC measurements and precluding generation of sufficient statistics for sucrose ADC across sites. Statistical analyses for multi-system repeatability and reproducibility were performed only on measurements obtained from the central water tube to minimize mixed spatial effects (X-Y-Z offsets) and ROI non-uniformity artifacts, and to ensure best temperature control.
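The 2·STD percent-of-mean metric defined above can be sketched as follows. The text does not state whether the sample or population standard deviation was used; the sample form (`statistics.stdev`) is an assumption here, and the function name and input values are hypothetical.

```python
import statistics


def two_std_percent(values):
    """Return 2 * STD expressed as a percentage of the mean.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    mean_value = statistics.mean(values)
    return 100.0 * 2.0 * statistics.stdev(values) / mean_value


# Hypothetical ROI-mean ADCs (mm^2/s) from three systems at isocenter.
roi_means = [1.08e-3, 1.10e-3, 1.12e-3]
print(f"reproducibility: {two_std_percent(roi_means):.2f}%")
```

The same function applies to intra-exam repeatability when the inputs are ROI means from consecutive passes on one system.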
ADC mean and standard deviation statistics derived from central slice ROIs using the head coil and torso coil with the phantom at isocenter were compiled for a fundamental comparison of ADC accuracy and reproducibility across all systems. Statistics from central-slice ROIs of the middle tube with the phantom offset in the torso coil by a nominal ±110mm in R/L direction were combined to estimate accuracy and reproducibility for R/L off-center ADC measurements. To determine ADC accuracy and reproducibility for offsets in S/I direction, head coil ROIs of the middle tube on peripheral slices at a nominal ±70mm S/I offset were combined. Intra-exam repeatability was derived from the relative difference in ROI means measured over 2 to 4 passes acquired in immediate succession. Inter-exam repeatability was derived from the relative difference in ROI means measured on a given system over two scan dates.
Intra-exam repeatability was defined as percent ratio of 2·STD to mean ADC value measured for 4 consecutive passes of the protocol for the same (single) ROI location at the middle or peripheral tubes. Day-to-day repeatability for each system was measured by comparing %difference from the two-day average ADC. Similar to repeatability, reproducibility of center and off-center measurements between scanners was estimated from 2·STD percent ratio deviation of mean ADC values from the corresponding ROIs for individual b-values. The scanners were grouped by vendor and field strength. The bias error was measured as a percent deviation of system-wise average ADC from the expected value (20–22). Multi-vendor/site/b-value/field-strength comparisons and reproducibility analysis were performed using SPSS (version 19) package to generate statistics and graphical output. Data dispersion was characterized by mean, median, standard deviation, range and outlier analysis.
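A minimal sketch of the day-to-day and bias calculations follows. It reads "%difference from the two-day average" as the between-day difference normalized by the two-day mean, which is one plausible interpretation rather than a confirmed detail of the analysis; the function names and example values are hypothetical, and the expected value of 1.1×10−3 mm2/s is the literature value quoted in the text.

```python
EXPECTED_ADC_ICE_WATER = 1.1e-3  # mm^2/s, literature value quoted in the text


def day_to_day_percent_difference(adc_day1: float, adc_day2: float) -> float:
    """Between-day ADC difference as a percentage of the two-day average."""
    two_day_mean = 0.5 * (adc_day1 + adc_day2)
    if two_day_mean <= 0:
        raise ValueError("ADC values must be positive")
    return 100.0 * (adc_day2 - adc_day1) / two_day_mean


def bias_percent(mean_adc: float, expected_adc: float = EXPECTED_ADC_ICE_WATER) -> float:
    """Percent deviation of a system-wise mean ADC from the expected value."""
    return 100.0 * (mean_adc - expected_adc) / expected_adc


# Hypothetical central-tube ROI means (mm^2/s) on two scan dates.
print(f"day-to-day: {day_to_day_percent_difference(1.09e-3, 1.11e-3):.2f}%")
print(f"bias:       {bias_percent(1.122e-3):.2f}%")
```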
RESULTS
Quality assurance inspection of DICOM parameters allowed identification of 5 systems (out of the original 40) that did not comply with the acquisition protocol, and the corresponding data were discarded from the subsequent analysis. The remaining 35 (protocol compliant) systems represented three vendors, labeled A, B and C (at 1.5T and 3T respectively) with the following scanner “population”: A – 10 (5 and 5), B – 17 (6 and 11), and C – 8 scanners (4 and 4). Twenty-eight out of 35 systems provided two-day acquisition data to evaluate day-to-day repeatability. For the worst condition at b = 2000, the average SNR of the trace DWI image was found to be 110 at 3T (range 60 to 175) and 40 at 1.5T (range 30 to 65). Based on simulation, ADC bias error due to Rician noise at the lowest SNR = 30 would be negligible (10). Thus SNR analysis across specific field and vendor subgroups was not considered informative within the context of ADC measurement.
Figure 1 illustrates temporal stability of the ice-water diffusion phantom as well as within-ROI confidence intervals (error bars defined as ±1 standard deviation of ROI pixels) for one 3T system central tube ADC1000 calculated by Eq.[1]. Image inserts illustrate the phantom morphology at different time points along the time-line of the experiment (13, 120, 300, and 480 minutes relative to initial ice filling). The color-map of ADC1000 measurement (right pane) corresponds to the protocol-recommended time-point (65 min, un-filled black symbol) for the first pass of the core DWI protocol using the head coil. After initial cool-down (points 1–4, < 1 hour), thermal equilibrium was achieved and persisted for over 4 hours as is evident from lower deviation from nominal value (22) for ADC measurements in this time range. The mean ADC measurements during thermal equilibrium were within 0.1% of the literature value of 1.1×10−3 mm2/s (22) (marked as a solid line in Fig. 1). The standard deviation of ROI means for 4 passes (~2 min/pass) of the core DWI protocol for the central tube ROI is illustrated by the vertical extent of the un-filled black symbol on the “65min” time-point (STD for this system = 0.005×10−3mm2/s). The observed average intra-exam repeatability estimated from 4 consecutive passes of the same measurement for 70 ROI locations in all 5 water tubes was within 1%. All measurements (over 8 hour period) for this system were within 2.5% of the nominal value (dashed lines in Fig. 1).
Day-to-day repeatability for each system was measured by comparing %difference from two-day average ADC of the central tube ROI for four b-values of 500, 800, 1000, and 2000 s/mm2. The summary of measurements pooled over field strength and grouped by vendor is plotted in Figure 2 for (a) head-coil at the center and (b) torso-coil at 110 mm right and left offset (average ADC). For all systems this day-to-day repeatability was within 4.5%. Across all scanners and all b-values the average of median day-to-day difference in ADC values (as illustrated by solid lines in the box-plot of Fig. 2) was less than 1.5%, and the overall standard deviation for all measurements was below 2.3%. A greater degree of variability was observed in vendors A and B for right/left-offset ADC measurements. Several outlier systems (>5% difference between days) were detectable for head coil data at 3T (vendor B) and 1.5T (vendor C).
Figure 2.
Box-plot summary of day-to-day repeatability for multi-site results pooled over field strength plotted as a percent difference in ADC measurements taken on 2 different days (~ 30 hours apart). On each box, the central mark is the median, the vertical boundaries mark the 25th and 75th percentiles, the whiskers extend to the most extreme data-points excluding outliers, and the outliers are plotted individually. Data for different b-values of 500, 800, 1000, and 2000 s/mm2 are coded by gray shades left-to-right and grouped by vendor: (a) head coil, center; (b) torso coil, right-to-left offset 110 mm.
Figure 3 summarizes results for ADC measurements at the central ROI grouped by vendor (horizontal axis), field strength (left-to-right) and receiver coil type (a, b). The expected ADC for water at 0°C is marked as a solid line with ±5% of this value shown as dashed lines. One clear outlier 3T system generated ADC values as high as 1.8×10−3mm2/s (~70% deviation from literature value) for 3 of four b-values. Excluding this one outlier, multi-system reproducibility for the center ADC is within 2.8% (Fig. 3a) and 3.1% (Fig. 3b) for head and torso coils respectively. The bias error of the mean measured ADC with respect to the literature value is within 2.5% for head and 3.5% for torso coil at 3T, and approximately half of these values at 1.5T. There is no significant dependence on b-value relative to the typical ROI measurement error (bar size in Fig. 1).
Figure 3.

Box-plot summary for between-system reproducibility of ADC measured from central ROI grouped by b-values (500, 800, 1000, and 2000 s/mm2, coded by shades of gray left-to-right) and by vendor (A,B,C) at two fields (1.5T - left column, 3T – right column) for head coil (a) and torso coil (b). Dashed lines delineate 5% deviations from reported ADC value of water (22) at 0 °C (solid line).
To evaluate system-to-system reproducibility as a function of ROI location, the average of ADC measurements of the central tube ROI at two extreme superior-inferior (S/I) offsets and, separately, the average at right-left (R/L) offsets were used. Off-center ADC measurements were less reproducible across scanners. As depicted in Figure 4, measurement variability was significantly higher with ±70 mm offset from center in S/I direction for head coil and ±110 mm R/L offset in torso coil, respectively. As is evident by comparison of data box sizes in Fig. 4 versus Fig. 3, for each vendor, inclusion of spatial offsets doubled or, in the case of the vendor A torso coil, tripled the dispersion (standard deviation) of measured values compared to center ROI measurements (Fig. 3). The apparent errors were higher at 3T and showed a distinct vendor-specific pattern. Vendor-specific standard deviation as well as bias error (with respect to expected ADC value) exceeded 3% at 1.5T and 5% at 3T, and exceeded 15% for vendor A at 3T. Measured ADCs were proportionally underestimated for S/I offsets (Fig. 4a), and overestimated for R/L offsets (Fig. 4b). Similar to Fig. 3, no significant dependence on b-value is observed.
Figure 4.

Effect of off-center shift on between-system reproducibility for ADC grouped similar to Figure 3 for (a) head coil S/I-offset and (b) torso coil R/L-offset. Dashed lines mark 5% deviations from reported ADC value of water (22) at 0 °C (solid line).
More detailed spatial dependence of measured ADC is depicted in Figure 5(a,b) for one representative 3T system. These plots are consistent with observation of larger (approximately quadratic) ADC errors for larger offsets, and the opposite sign errors in S/I versus R/L direction (Fig. 4). The results of empiric measurement of local gradient strength are depicted in panes (c,d) of Figure 5. Square of the ratio of “actual” to “nominal” gradient amplitude exhibits parabolic dependence on offset due to gradient non-linearity. Both the sign and the scale of observed ADC bias (a,b) are consistent with the systematic measurements of gradient non-linearity (c,d). Combining all systems excluding the one clear outlier, reproducibility (2·STD percent ratio) of the ADC1000 measurement was 2.8% at isocenter (head coil), 6.8% for S/I offsets and 11.3% for R/L sampled offsets.
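The approximately quadratic offset dependence can be illustrated with a small least-squares sketch. It assumes a symmetric model ADC(x) = a + c·x², with no linear term, which is a simplification motivated by the symmetry about isocenter noted in the text; the function name and the synthetic data (12.1% overestimation at ±110 mm) are hypothetical and are not measurements from this study.

```python
def fit_even_quadratic(offsets_mm, adcs):
    """Least-squares fit of adc = a + c * offset**2; returns (a, c)."""
    if len(offsets_mm) != len(adcs) or len(adcs) < 2:
        raise ValueError("need two equal-length sequences with >= 2 points")
    u = [x * x for x in offsets_mm]          # regress on squared offset
    n = len(u)
    mean_u = sum(u) / n
    mean_y = sum(adcs) / n
    s_uu = sum((ui - mean_u) ** 2 for ui in u)
    if s_uu == 0:
        raise ValueError("offsets must include at least two distinct |x|")
    s_uy = sum((ui - mean_u) * (yi - mean_y) for ui, yi in zip(u, adcs))
    c = s_uy / s_uu
    a = mean_y - c * mean_u
    return a, c


# Synthetic R/L profile: ADC rises quadratically away from isocenter.
offsets = [-110.0, -55.0, 0.0, 55.0, 110.0]           # mm
profile = [1.1e-3 * (1.0 + 1.0e-5 * x * x) for x in offsets]

a_fit, c_fit = fit_even_quadratic(offsets, profile)
bias_at_110 = 100.0 * c_fit * 110.0 ** 2 / a_fit      # percent bias at 110 mm
print(f"isocenter ADC = {a_fit:.3e} mm^2/s, bias at 110 mm = {bias_at_110:.1f}%")
```

A negative fitted curvature would describe the underestimation pattern reported for S/I offsets, and a positive curvature the overestimation reported for R/L offsets.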
Figure 5.
Spatial dependence of measured ADC (a,b) and squared gradient deviation from linearity (c,d) for a single representative system at 3T for S/I-offset (left column) and R/L-offset (right column). Filled-symbols with error-bars in (a, b)-panes mark ADC measurements for b = 500, 800, 1000, and 2000 (offset left-to-right for clarity). Asterisks in (c, d)-panes represent measured squared ratio of true-to-nominal local gradient strength (b-ratio proxy) as a function of offset from isocenter. Data scatter reflects measurement error for gradient strength. Solid gray line represents a quadratic fit for b = 1000 in (a,b) and a fit for true-to-nominal b-ratio in (c,d)-panes. Dashed lines mark 5% deviations from reported ADC value of water (22) at 0 °C (black solid line) for (a, b), and deviation from nominal gradient (true-to-nominal ratio of 1 marked by solid line) for (c,d)-panes.
DISCUSSION
The purpose of our multi-center study was to assess repeatability, reproducibility and quantitative quality control of ADC measurements across vendors, field strength and b-values using a standardized acquisition protocol. In a prior study, the DWI acquisition protocol was variable and based on each site’s local “brain DWI” exam, and the issue of spatial variability was not addressed (15). By standardizing data acquisition conditions across participating test sites and using a temperature-controlled fluid phantom/procedure that allowed off-center ADC measures over a wide range of b-values envisioned for clinical trials, we sought to determine the baseline for systematic reproducibility level relevant for brain and body DWI multi-center trials (8,9,13). Our aim was to distinguish between sources of observed technical variability and quantify technical deficiencies of individual systems (12,14). Certainly one 3T system was detected as a clear outlier.
Investigation of temporal stability of the ice-water phantom (Fig. 1) confirmed that over the course of ADC measurements no significant variability (STD < 1%) was introduced by temperature fluctuations within phantom and that this design allowed for over 4 hours of use. Since all subsequent experimental measurements were performed during thermal equilibrium, ADC variability measurements were not substantially influenced by fluctuations in temperature within the diffusing medium. Note that the intra-exam repeatability error for all ROIs was consistently less than one third of the spatial standard deviation “noise”-error within ROI (<1.5%, Fig. 1). This relation was observed for average repeatability versus average ROI-noise of the ADC measurements in other studied systems, indicating ROI “inhomogeneity” noise (across ROI pixels) as a dominant source of ADC measurement error. This is consistent with observation of slightly larger reproducibility errors and bias (with respect to true value) across multiple systems with greater error in torso coil versus head coil ADC measurements (Figs. 2–4). By inspection of Figure 3 versus Figure 4, higher variability was apparent for off-center measurements at higher field (Figure 4). The pattern of ADC overestimation for R/L offsets and underestimation for S/I offsets was commonly observed for individual 1.5T and 3T systems, but was more prominent on 3T systems. The spatial pattern appeared “saddle shaped” with greatest ADC uniformity near magnet isocenter and steeper non-uniformity that increases with distance off-center. Therefore, variability in positioning the phantom across sites and days translates to greater variability for offset ADC measurements, particularly for 3T systems that tend to exhibit steeper spatial dependence.
Multiple runs for the same exam were highly repeatable (< 1%), and day-to-day differences did not exceed 3% (Fig. 2) for the majority of systems. Closer examination of images for several outlier systems (> 5% variation) revealed larger areas of melted ice around phantom tubes (see, e.g., morphology around top outer tube in the last insert of Fig. 1). This indicated that the observed higher day-to-day variability may be due to not fully achieving thermal equilibrium on one or both exam days at some sites that presumably did not follow the recommended ice fill procedure. Note that day-to-day variations for each system included components both from phantom preparation and repositioning. The lack of apparent dependence of day-to-day repeatability on phantom position (Fig. 2) confirmed the absence of temporal bias in system performance.
By first principles, self-diffusion of water is independent of timescale thus should appear mono-exponential as a function of b-value. This is consistent with our observation of no systematic dependence on b-value for all studied systems (Figs.3 and 4), except the clear 3T outlier by vendor A. Further analysis of this outlier revealed that ADC maps from two of three orthogonal gradient directions produced exceptionally high ADC values for b-values 500, 800 and 1000; although the b=2000 directional and trace ADC maps were essentially equivalent with other vendor A systems. For this system, the large bias error was repeated for both measurement days. The source of this error is unknown.
The observed higher variability and bias for off-center ADC measurements (Fig. 4) are likely due to spatially dependent error for individual systems (Fig. 5) coupled with slight differences in phantom positioning. Several instrumental imperfections (or combinations thereof) may account for the detected spatial nonuniformity of the measured ADC values for individual systems: phantom vibration, concomitant fields, poor shim, and gradient nonlinearity. By exploring observed trends in spatial variations of measured ADC we are able to determine the dominant source of observed spatial nonuniformity (Fig. 5). Note that spatial offsets from isocenter were nominal offsets (±110 mm R/L, and ±70 mm S/I) defined by protocol. There was a range of actual offsets which contributed to the increased/decreased range of measurable shifts from isocenter combined in Fig. 5. If vibrations were major contributors to the ADC error, they would systematically increase apparent ADC with increasing b-value, which does not agree with our observations (Fig. 5). Furthermore, an apparent decrease in ADC is observed in the superior-inferior direction in contrast to an increase in the right-left direction. The effect of concomitant fields (23) from pulsed gradients would manifest itself in decreasing ADC with increasing magnetic field (3T versus 1.5T). However, the opposite change is observed in Fig. 3 and 4, and its absolute value could account for 2–3%, at most, of the observed spatial ADC variation. Poor shim would be anticipated to produce an asymmetric effect at the opposite edges (decrease vs. increase), which would be enhanced by de-shimming. Both of these were ruled out by our experimental observations of symmetric spatial dependence for ADC with respect to isocenter (Fig. 5).
As illustrated in Figure 5, ADC bias scales with the ratio of actual to nominal b-value (measured by the ratio of squared gradient strength). Thus empirically, the observed (approximately) quadratic offset-dependence of ADC is consistent with measured gradient non-linearity errors. Gradient non-linearity correction (19) is fixed for a given coil design (independent of a patient). Consequently, the observed spatial nonuniformity of ADC could be corrected by constructing and applying instrument-dependent gradient maps (19). The specifics of the ADC-correction procedure will be described in a future article.
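The scaling relation stated above suggests a simple first-order illustration: if the measured ADC is inflated or deflated by the local ratio b-actual/b-nominal, dividing by that ratio recovers the unbiased value. The sketch below is only a scalar simplification under that assumption. It is not the authors' correction procedure, which the text defers to a future article, and the function name and numbers are hypothetical.

```python
def rescale_adc_by_b_ratio(adc_measured: float, b_ratio: float) -> float:
    """First-order rescaling of ADC by a local b_actual / b_nominal ratio.

    Assumption: bias is a pure multiplicative factor equal to b_ratio,
    ignoring any direction-dependent (cross-term) gradient effects.
    """
    if b_ratio <= 0:
        raise ValueError("b_ratio must be positive")
    return adc_measured / b_ratio


# Hypothetical off-center ROI: ADC overestimated by 10% where the
# squared gradient ratio (b-ratio proxy) is 1.10.
adc_off_center = 1.21e-3  # mm^2/s
adc_rescaled = rescale_adc_by_b_ratio(adc_off_center, 1.10)
print(f"rescaled ADC = {adc_rescaled:.3e} mm^2/s")
```

In practice the ratio would come from a position-dependent gradient map for the specific gradient coil design rather than a single number.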
In conclusion, standardization of the DWI protocol for multi-site studies has improved reproducibility of ADC measurements compared to “local-standard” (site-specific) protocol (15). Measured ADC of the ice-water phantom near magnet isocenter was within 3% of literature value for 95% of the systems. Reproducibility between vendors was invariant to b-value and field strength (except for one outlier due to direction dependence discussed above), and 2·STD percent ratio errors ranged from 2.8% to >10% dependent on spatial location of the ADC measurement. Intra-site day-to-day repeatability across all sites was better than 5%, and average intra-exam repeatability (evaluated from several passes through the same exam protocol) was within 1%. Observed b-value dependent variability at the phantom center was likewise less than 1%. Large ADC non-uniformity errors (5–15% bias) were present for off-center measurements consistent with gradient non-linearity. Vendor cooperation is needed to develop viable instrumental correction procedures to control spatial ADC errors in clinical studies. The proposed protocol for standardization and quantitative quality control of ADC measurements is generally applicable for future multi-center clinical trials (8, 9, 13).
Acknowledgments
Contract grant sponsor: National Institutes of Health (NIH); contract grant number: P01-CA85878, U01-CA166104, P50-CA93990, R01CA136892, P01CA087634 and SAIC 29XS161, as well as training support for MD from T32 EB005172.
We are grateful for assistance with protocol standardization and data collection by participating organizations: Barrow Neurological Institute; University of California, San Francisco; National Institutes of Health; University Medical Center Utrecht, NL; Kennedy Krieger Institute, Johns Hopkins University; Hopital Beaujon, Paris FR; Mount Vernon Hospital, Middlesex UK; Royal Marsden Hospital, Sutton UK; William Beaumont Hospital, Royal Oak MI; University of Nijmegen, Nijmegen NL; University of Oxford, Oxford UK; Henry Ford Hospital, Detroit MI; University of Manchester, Manchester UK; MD Anderson Cancer Center, Houston TX; General Electric Medical Systems; Siemens Healthcare; and Philips Healthcare.
References
- 1. Hamstra DA, Lee KC, Moffat BA, Chenevert TL, Rehemtulla A, Ross BD. Diffusion magnetic resonance imaging: an imaging treatment response biomarker to chemoradiotherapy in a mouse model of squamous cell cancer of the head and neck. Transl Oncol. 2008;1:187–194. doi: 10.1593/tlo.08166.
- 2. Galban CJ, Mukherji SK, Chenevert TL, et al. A feasibility study of parametric response map analysis of diffusion-weighted magnetic resonance imaging scans of head and neck cancer patients for providing early detection of therapeutic efficacy. Transl Oncol. 2009;2:184–190. doi: 10.1593/tlo.09175.
- 3. Chenevert TL, Ross BD. Diffusion imaging for therapy response assessment of brain tumor. Neuroimaging Clin N Am. 2009;19:559–571. doi: 10.1016/j.nic.2009.08.009.
- 4. Padhani AR, Liu G, Koh DM, et al. Diffusion-weighted magnetic resonance imaging as a cancer biomarker: consensus and recommendations. Neoplasia. 2009;11:102–125. doi: 10.1593/neo.81328.
- 5. Mulkern RV, Gudbjartsson H, Westin CF, et al. Multi-component apparent diffusion coefficients in human brain. NMR Biomed. 1999;12:51–62. doi: 10.1002/(sici)1099-1492(199902)12:1<51::aid-nbm546>3.0.co;2-e.
- 6. Riches SF, Hawtin K, Charles-Edwards EM, de Souza NM. Diffusion-weighted imaging of the prostate and rectal wall: comparison of biexponential and monoexponential modelled diffusion and associated perfusion coefficients. NMR Biomed. 2009;22:318–325. doi: 10.1002/nbm.1328.
- 7. Colagrande S, Pasquinelli F, Mazzoni LN, et al. MR-Diffusion Weighted Imaging of Healthy Liver Parenchyma: Repeatability and reproducibility of apparent diffusion coefficient measurement. JMRI. 2010;31:912–920. doi: 10.1002/jmri.22117.
- 8. Tiepel SJ, Reuter S, Stieltjes B, Acosta-Cabronero J, et al. Multicenter stability of diffusion tensor imaging measures: A European clinical and physical phantom study. Psych Res: Neuroimaging. 2011;194:363–371. doi: 10.1016/j.pscychresns.2011.05.012.
- 9. Sasaki M, Yamada K, Watanabe Y, et al. Variability in absolute apparent diffusion coefficient values across different platforms may be substantial: a multivendor, multi-institutional comparison study. Radiology. 2008;249:624–630. doi: 10.1148/radiol.2492071681.
- 10. Collins DJ, Blackledge M. Techniques and optimization. In: Koh DM, Thoeny HC, editors. Diffusion-Weighted MR Imaging: Applications in the Body. New York: Springer-Verlag; 2010. pp. 19–32.
- 11. Saritas EU, Lee JH, Nishimura DG. SNR Dependence of Optimal Parameters for Apparent Diffusion Coefficient Measurements. IEEE Trans Med Imag. 2011;30:424–437. doi: 10.1109/TMI.2010.2084583.
- 12. Ogura A, Hayakawa K, Miyuati T, Maeda F. Imaging parameter effects in apparent diffusion coefficient determination of magnetic resonance imaging. Europ J Radiology. 2011;77:185–188. doi: 10.1016/j.ejrad.2009.06.031.
- 13. Zhu T, Hu R, Qiu X, Taylor M, et al. Quantification of accuracy and precision of multi-center DTI measurements: A diffusion phantom and human brain study. Neuroimage. 2011;56:1398–1411. doi: 10.1016/j.neuroimage.2011.02.010.
- 14. Delakis I, Moore EM, Leach MO, De Wilde JP. Developing a quality control protocol for diffusion imaging on a clinical MRI system. Phys Med Biol. 2004;49:1409–1422. doi: 10.1088/0031-9155/49/8/003.
- 15. Chenevert TL, Craig JG, Ivancevic MK, et al. Diffusion coefficient measurement using temperature controlled fluid for quality control in multi-center studies. JMRI. 2011;34:983–987. doi: 10.1002/jmri.22363.
- 16. Tofts PS, Lloyd D, Clark CA, et al. Test liquids for quantitative MRI measurements of self-diffusion coefficient in vivo. Magn Reson Med. 2000;43:368–374. doi: 10.1002/(sici)1522-2594(200003)43:3<368::aid-mrm8>3.0.co;2-b.
- 17. Spees WM, Song S-K, Garbow JR, Neil JJ, Ackerman JH. Use of ethylene glycol to evaluate gradient performance in gradient-intensive diffusion MR sequences. Magn Res Med. 2011 Nov;:1–6. doi: 10.1002/mrm.23201. on-line.
- 18. Matsuya R, Kuroda M, Matsumoto Y, et al. A new phantom using polyethylene glycol as an apparent diffusion coefficient standard for MR imaging. Int J Oncology. 2009;35:893–900. doi: 10.3892/ijo_00000404.
- 19. Bammer R, Markl M, Barnett A, Acar B, Alley MT, Pelc NJ, Glover GH, Moseley ME. Analysis and generalized correction of the effect of spatial gradient field distortions in diffusion weighted imaging. Magn Reson Med. 2003;50:560–569. doi: 10.1002/mrm.10545.
- 20. Krynicki K, Green CD, Sawyer DW. Pressure and temperature dependence of self-diffusion in water. Faraday Discuss Chem Soc. 1978;66:199–208.
- 21. Simpson JH, Carr HY. Diffusion and nuclear spin relaxation in water. Phys Rev. 1958;111:1201–1202.
- 22. Holz M, Heil SR, Sacco A. Temperature-dependent self-diffusion coefficients of water and six selected molecular liquids for calibration in accurate 1H NMR PFG measurements. Phys Chem Chem Phys. 2000;2:4740–4742.
- 23. Meier C, Zwanger M, Feiweier T, Porter D. Concomitant field terms for asymmetric gradient coils: consequences for diffusion, flow, and echo-planar imaging. MRM. 2008;60:128–134. doi: 10.1002/mrm.21615.



