Abstract
We employed a novel diffusion tensor imaging phantom to study intra- and interscanner reproducibility on two 3T magnetic resonance (MR) scanners. Using a phantom containing thousands of hollow micron-size tubes in complex arrays, we performed two experiments using a b value of 1000 s/mm² on two Siemens 3T Trio scanners. First, we performed 12-direction scans on both scanners. Second, on one scanner, we performed two 64-direction protocols with different repetition times (TRs). We used a one-way analysis of variance to calculate differences between scanners and the Mann-Whitney U test to assess differences between 12-direction and 64-direction data. We calculated the coefficient of variation (CoV) for intrascanner and interscanner data. For 12-direction protocols, mean fractional anisotropy (FA) was 0.3003 for Scanner 1 (four scans) and 0.3097 for Scanner 2 (three scans). The lowest FA value on Scanner 1 was 2.56 standard deviations below the mean of Scanner 2. For 64-direction scans, mean FA was 0.2640 for 4000 ms TR and 0.2582 for 13,200 ms TR scans. For 12-direction scans, within-scanner CoV was 0.0326 for Scanner 1 and 0.0242 for Scanner 2; between-scanner CoV was 0.0315. For 64-direction scans, CoV was 0.0559 for TR 4000 ms and 0.0533 for TR 13,200 ms. The difference between median FA values of 12-direction and 64-direction scans was statistically significant (p < 0.001). We found relatively good reproducibility on any single MR scanner. FA values from one scanner were sometimes significantly below the mean FA of another scanner, which has important implications for clinical use of DTI.
Keywords: Brain, diffusion, fractional anisotropy, imaging, phantom, quantitative
Introduction
A major issue facing users of diffusion tensor imaging (DTI) is that the range of normal fractional anisotropy (FA) values, the most commonly used DTI metric, can differ substantially from one magnetic resonance (MR) scanner to another. For instance, in one study, the mean FA value of the genu of the corpus callosum in normal adults was 0.836 when measured with a circular region of interest (ROI) but 0.785 when measured with a freehand ROI.1 In another study, the mean FA value of the genu of the corpus callosum in normal individuals was approximately 0.560.2 Previous investigators have noted that a number of factors can account for these discrepancies, including differences between studies in image acquisition parameters, signal-to-noise ratios, partial volume effects, and spatial resolution.1 In addition, an effect of scanner upgrades on FA values has been demonstrated.3 A significant effect of the number of diffusion-encoding directions on FA has also been noted.4 Finally, a significant effect of the head coil used in a DTI scan on FA values has been shown.5 These facts raise the question of interscanner repeatability, i.e. whether differences in mean values between study populations imaged on different MR scanners are truly due to differences in the groups of individuals studied or, instead, are due to differences in performance characteristics of various MR scanners.
A number of studies have assessed interscanner, as well as intrascanner, reproducibility of DTI images using human participants. In some studies, the authors found evidence of relatively high reproducibility for some, but not all, brain regions.6 In another study of nine human volunteers, the investigators found consistent interscanner bias that slightly improved with a global scaling factor.7 A number of studies have documented poor interscanner reproducibility. One group of investigators has noted that interscanner variability significantly affects the results of longitudinal DTI studies.3 The same investigators also reported an effect of scanner upgrade on DTI values. In another study, the investigators compared data from four 3T scanners (two different vendors) using tract-based spatial statistics for FA values for each of the sites and found widespread group differences between sites.8
Because numerous investigators are using DTI to study various diseases in research trials, it is important that the degree of reproducibility on a single MR scanner, as well as on multiple MR scanners, become known. In this study, we employed a novel DTI phantom to study intra- and interscanner reproducibility of FA values.
Methods
Design of the phantom
The outer shell of the phantom is a dome-like structure, 15 cm in diameter, that contains many thousands of hollow micron-size tubes composed of multifilament polypropylene yarn (Psychology Software Tools Inc).9 The tubes have an outer diameter of 32 μm and an inner diameter of 12 μm (Figure 1). The tubes are arranged in complex arrays, each having one of the following degrees of tube density: 12.5%, 25%, 50% or 100%. The 100% fiber density region corresponds to 1024 tubes per mm². The variation in fiber densities allows for simulation of disease processes in which axonal loss is characteristic. For each of these four fiber densities, the tubes are configured into five individual tracts of the following sizes: 2 mm², 4 mm², 6 mm², 8 mm², and 10 mm². Thus, a total of 20 test regions exist in the phantom. In addition, these arrays consist of crossing patterns similar to those of axons within brain tissue. Like axons, the tubes both contain water and are separated by water from one another, allowing differing degrees of mobility of microscopic water motion. The phantom also contains reference fluid test regions, which allow quantification of isotropic measures (e.g. T1, T2, proton density, and the apparent diffusion coefficient). In various locations within the phantom, FA values have been shown to increase, and radial diffusivity values to decrease, almost linearly with packing density (Schneider, unpublished data).
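The arithmetic behind the 20 test regions can be tallied in a few lines. The sketch below enumerates the four densities and five tract sizes and estimates a tube count per tract. It assumes, purely for illustration, that tube count scales linearly with the stated density fraction and tract area; that scaling is not stated in the text, and the names `DENSITIES`, `TRACT_AREAS_MM2`, and `tubes_in_tract` are hypothetical.

```python
from itertools import product

TUBES_PER_MM2_AT_FULL_DENSITY = 1024      # stated for the 100% density region
DENSITIES = (0.125, 0.25, 0.50, 1.00)     # the four stated tube densities
TRACT_AREAS_MM2 = (2, 4, 6, 8, 10)        # the five stated tract sizes


def tubes_in_tract(density, area_mm2):
    """Estimated tube count for one tract.

    Assumption (not from the source): count scales linearly with both
    the density fraction and the tract cross-sectional area.
    """
    return round(density * area_mm2 * TUBES_PER_MM2_AT_FULL_DENSITY)


# One entry per (density, tract size) combination.
regions = [(d, a, tubes_in_tract(d, a)) for d, a in product(DENSITIES, TRACT_AREAS_MM2)]

if __name__ == "__main__":
    print(f"{len(regions)} test regions")
    for density, area, count in regions:
        print(f"density {density:6.1%}  area {area:2d} mm^2  ~{count} tubes")
```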
Scanning the phantom
As a first means of assessing interscanner variability using the phantom, we performed two experiments using two Siemens 3T Trio MR scanners. Both scanners ran syngo MR B19 software and used a 12-channel head coil. All scans were performed using a b value of 1000 s/mm². First, we performed a total of seven DTI scans using a 12-direction protocol across the two 3T MR scanners, four on one scanner (hereafter designated as Scanner 1) and three on the other scanner (hereafter designated as Scanner 2). We performed the DTI scans on Scanner 1 over a period of one month and those on Scanner 2 over a period of three months.
We performed a second experiment solely on Scanner 1 using two 64-direction protocols, one using a minimum allowable repetition time (TR) of 4000 ms and the other with a maximum allowable TR of 13,200 ms. We obtained both sequences during the same imaging session for each scan.
For each scanner and/or protocol dataset, we used the Kolmogorov-Smirnov (KS) test to determine whether the data were normally distributed. This test failed to reject the null hypothesis that the data were normally distributed at the 5% significance level, i.e. the data were consistent with a normal distribution. On that basis, we assessed differences in 12-direction data between scanners using a one-way analysis of variance and Tukey’s Honestly Significant Difference procedure (using p < 0.05 to show statistical significance). When looking at differences between the 12-direction and 64-direction scans, the KS test showed that these aggregate data were not normally distributed (p < 0.001). Hence, we used the Mann-Whitney U test (using p < 0.05 to show statistical significance) to assess differences. Because these aggregate data were not normally distributed, we report measurements between these groups as medians instead of means.
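As an illustration of the between-scanner comparison, the following sketch computes the one-way analysis of variance F statistic for the 12-direction FA values listed in Table 1, using only the Python standard library. It is not the authors' analysis code. Because the standard library has no F distribution, the statistic is compared against the tabulated upper 5% critical value of F(1, 5), approximately 6.61; the function name `one_way_anova_f` is hypothetical.

```python
from statistics import mean

# 12-direction FA values as listed in Table 1.
scanner_1 = [0.3138, 0.2973, 0.2994, 0.2906]
scanner_2 = [0.3098, 0.3172, 0.3022]


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scans around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


F_CRIT_1_5 = 6.61  # tabulated upper 5% critical value of F(1, 5)

f_stat, df_b, df_w = one_way_anova_f(scanner_1, scanner_2)

if __name__ == "__main__":
    verdict = "significant" if f_stat > F_CRIT_1_5 else "not significant"
    print(f"F({df_b}, {df_w}) = {f_stat:.3f}: {verdict} at the 5% level")
```

With only two groups, this F test is equivalent to a two-sample t test with pooled variance.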
Phantom data processing
We evaluated FA in each fiber compartment using a diffusion tensor model and automated ROI placement. We ran FSL (Jenkinson et al., 2012) dtifit reconstruction on each diffusion-weighted imaging (DWI) series to fit a diffusion tensor model at each voxel and produce FA maps. We constructed a template using a b0 image from a previous phantom scan. ROIs were then hand-drawn as rectilinear volumes on the template for each of the 20 anisotropic fiber compartments with the help of phantom manufacturing diagrams and specifications. We aligned each phantom scan to the template by applying the nonlinear registration tool FNIRT11 to a b0 image to account for differences in distortion. We then transferred template regions to native DWI space and used them to extract voxels for each fiber compartment from the FA maps. Finally, we computed summary statistics for the FA (i.e. the minimum value, maximum value, mean value, and standard deviation (SD)) across the voxels for each region.
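For readers unfamiliar with the metric, the sketch below shows the standard definition of FA in terms of the three eigenvalues of the fitted diffusion tensor, followed by the per-region summary statistics named above. It is a minimal illustration, not the FSL implementation; the function names and the example eigenvalues are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def fractional_anisotropy(l1, l2, l3):
    """FA from the three diffusion tensor eigenvalues (standard definition)."""
    md = (l1 + l2 + l3) / 3.0                      # mean diffusivity
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    if den == 0:
        return 0.0                                 # no diffusion signal
    return sqrt(1.5 * num / den)


def summarize_region(fa_voxels):
    """Minimum, maximum, mean, and sample SD of FA across a region's voxels."""
    return {
        "min": min(fa_voxels),
        "max": max(fa_voxels),
        "mean": mean(fa_voxels),
        "sd": stdev(fa_voxels),
    }


if __name__ == "__main__":
    # Hypothetical eigenvalues (mm^2/s) for one anisotropic voxel.
    print(fractional_anisotropy(1.7e-3, 0.3e-3, 0.3e-3))
    # Hypothetical FA values for the voxels of one region.
    print(summarize_region([0.29, 0.31, 0.30, 0.32]))
```

FA is 0 when all three eigenvalues are equal (isotropic diffusion) and approaches 1 when diffusion occurs along a single axis.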
Results
The individual FA values for all 12-direction and 64-direction scans, the mean value for each protocol on each scanner, and the coefficient of variation (CoV) for groups of scans are listed in Table 1.
Table 1.
| Twelve-direction scans (n = 7) | FA value | Mean FA value | Standard deviation | CoV within scanners | CoV between scanners |
|---|---|---|---|---|---|
| Scanner 1 (n = 4) | 0.3138 | 0.3003 | 0.0098 | 0.0326 | 0.0315 |
| | 0.2973 | | | | |
| | 0.2994 | | | | |
| | 0.2906 | | | | |
| Scanner 2 (n = 3) | 0.3098 | 0.3097 | 0.0075 | 0.0242 | |
| | 0.3172 | | | | |
| | 0.3022 | | | | |

| Sixty-four-direction data (Scanner 1) (n = 8) | FA value | Mean FA value | Standard deviation | CoV within same TR | CoV between both TRs |
|---|---|---|---|---|---|
| TR 4000 ms (n = 4) | 0.2441 | 0.2640 | 0.0148 | 0.0559 | 0.0520 |
| | 0.2616 | | | | |
| | 0.2769 | | | | |
| | 0.2732 | | | | |
| TR 13,200 ms (n = 4) | 0.2401 | 0.2582 | 0.0137 | 0.0533 | |
| | 0.2549 | | | | |
| | 0.2681 | | | | |
| | 0.2696 | | | | |
CoV: coefficient of variation; FA: fractional anisotropy; MR: magnetic resonance; TR: repetition time.
The four 12-direction scans on Scanner 1 had a mean FA of 0.3003 ± 0.0098 (range: 0.2906–0.3138). The three 12-direction scans on Scanner 2 had a mean of 0.3097 ± 0.0075 (range: 0.3022–0.3172) (Figure 2). The mean FA values of the 12-direction scans on the two MR scanners did not differ significantly. Notably, the lowest FA value on Scanner 1 (i.e. 0.2906) was 2.56 SD (or 6%) below the mean for Scanner 2. These facts are notable given the relatively small number of scans. Figure 2 shows that the FA values on Scanner 2 were generally higher than those obtained on Scanner 1. It also shows that the degree of overlap of the data between Scanner 1 and Scanner 2 is relatively low, as evidenced by the fact that the upper 25th percentile of Scanner 1 values does not overlap with the lower 75th percentile of Scanner 2 values. The figure shows a relatively large overlap of 64-direction data from the TR of 4000 ms scans and those of the TR of 13,200 ms scans. No overlap, however, is seen between the 12-direction data and the 64-direction data.
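The comparison of the lowest Scanner 1 value with the Scanner 2 distribution can be reproduced from the values in Table 1. The sketch below assumes the sample (n − 1) standard deviation, which is our assumption about the convention used; recomputing from the four-decimal table values gives roughly 2.55 SD and 6.2%, in line with the reported 2.56 SD and 6% once rounding is taken into account.

```python
from statistics import mean, stdev

# 12-direction FA values as listed in Table 1.
scanner_1 = [0.3138, 0.2973, 0.2994, 0.2906]
scanner_2 = [0.3098, 0.3172, 0.3022]

lowest_s1 = min(scanner_1)
ref_mean = mean(scanner_2)
ref_sd = stdev(scanner_2)  # sample SD (n - 1); assumed convention

# Distance of the lowest Scanner 1 scan below the Scanner 2 mean,
# in Scanner 2 standard deviations and as a percentage of that mean.
sd_units_below = (ref_mean - lowest_s1) / ref_sd
percent_below = 100 * (ref_mean - lowest_s1) / ref_mean

if __name__ == "__main__":
    print(f"{sd_units_below:.2f} SD below, {percent_below:.1f}% below")
```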
The mean FA value for the 64-direction, minimum TR scans on Scanner 1 was 0.2640 ± 0.0148 (range: 0.2441–0.2769) and that for the 64-direction, maximum TR scans on the same scanner was 0.2582 ± 0.0137 (range: 0.2401–0.2696). The mean FA values of these two imaging protocols did not differ significantly.
For the 12-direction scans, the CoV within scanners was 0.0326 for Scanner 1 and 0.0242 for Scanner 2. The between-scanner CoV for the 12-direction scans was 0.0315. For the 64-direction scans, the CoV for the TR of 4000 ms scans was 0.0559 and that of the TR of 13,200 ms scans was 0.0533. The CoV for all 64-direction scans was 0.0520.
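The CoV is the standard deviation divided by the mean. The sketch below recomputes it from the individual FA values in Table 1, assuming the sample (n − 1) standard deviation and assuming that the between-scanner and between-TR values pool all scans of a protocol; both are our assumptions about the authors' conventions. Under those assumptions the results agree with the tabulated values to about three decimal places, with small last-digit differences attributable to rounding.

```python
from statistics import mean, stdev


def cov(values):
    """Coefficient of variation: sample SD divided by the mean."""
    return stdev(values) / mean(values)


# Individual FA values as listed in Table 1.
fa = {
    "12dir_scanner1": [0.3138, 0.2973, 0.2994, 0.2906],
    "12dir_scanner2": [0.3098, 0.3172, 0.3022],
    "64dir_tr4000": [0.2441, 0.2616, 0.2769, 0.2732],
    "64dir_tr13200": [0.2401, 0.2549, 0.2681, 0.2696],
}

results = {name: cov(values) for name, values in fa.items()}
# Pooled groups (assumed basis of the "between" columns in Table 1).
results["12dir_all"] = cov(fa["12dir_scanner1"] + fa["12dir_scanner2"])
results["64dir_all"] = cov(fa["64dir_tr4000"] + fa["64dir_tr13200"])

if __name__ == "__main__":
    for name, value in results.items():
        print(f"{name:16s} CoV = {value:.4f}")
```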
The median FA value of the 12-direction sequences was 0.3022 and the median value of the 64-direction sequences was 0.2649. Based on the non-normal distribution of all data obtained using all imaging protocols from both MR scanners, we calculated the difference between 12-direction data and 64-direction data using the Mann-Whitney U test. We found a statistically significant difference (p < 0.001). Furthermore, the CoV for the 64-direction TR of 4000 ms pulse sequence (i.e. 0.0559) and the 64-direction TR of 13,200 ms pulse sequence (i.e. 0.0533) were both higher than the CoV for the 12-direction pulse sequence on Scanner 1 (i.e. 0.0326) and Scanner 2 (i.e. 0.0242). Finally, the CoV between the two 64-direction pulse sequences (i.e. 0.0520) was higher than the CoV between the 12-direction pulse sequences on Scanner 1 and Scanner 2 (i.e. 0.0315).
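To illustrate why the 12-direction versus 64-direction comparison is significant with so few scans, the sketch below runs an exact two-sided Mann-Whitney U test on the FA values in Table 1 by enumerating every way of splitting the 15 pooled values into groups of 7 and 8. The text does not state whether an exact or asymptotic p value was used, so the exact enumeration is our choice; the function names are hypothetical.

```python
from itertools import combinations
from statistics import median

# Individual FA values as listed in Table 1.
fa_12 = [0.3138, 0.2973, 0.2994, 0.2906, 0.3098, 0.3172, 0.3022]
fa_64 = [0.2441, 0.2616, 0.2769, 0.2732, 0.2401, 0.2549, 0.2681, 0.2696]


def u_statistic(x, y):
    """Number of (x, y) pairs with x > y, counting ties as one half."""
    return sum((xi > yi) + 0.5 * (xi == yi) for xi in x for yi in y)


def exact_mann_whitney(x, y):
    """Return (U, two-sided exact p) by enumerating all group relabelings."""
    pooled = x + y
    n_x = len(x)
    center = n_x * len(y) / 2            # expected U under the null hypothesis
    u_obs = u_statistic(x, y)
    observed = abs(u_obs - center)
    extreme = total = 0
    for idx in combinations(range(len(pooled)), n_x):
        chosen = set(idx)
        gx = [pooled[i] for i in chosen]
        gy = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Count relabelings at least as far from the null expectation.
        if abs(u_statistic(gx, gy) - center) >= observed - 1e-12:
            extreme += 1
    return u_obs, extreme / total


u_value, p_value = exact_mann_whitney(fa_12, fa_64)

if __name__ == "__main__":
    print(f"median 12-direction = {median(fa_12):.4f}")
    print(f"median 64-direction = {median(fa_64):.4f}")
    print(f"U = {u_value}, exact two-sided p = {p_value:.6f}")
```

Because every 12-direction value exceeds every 64-direction value, U takes its maximum possible value, and only two of the 6435 possible splits are as extreme as the observed one.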
Discussion
In this study, we set out to assess interscanner variability in FA values between two MR scanners using 12 diffusion-encoding directions and, on one scanner, variability between two protocols using 64 diffusion-encoding directions. When evaluating any single 12-direction scan performed on Scanner 1 relative to the mean 12-direction scan FA value obtained on Scanner 2, we found some substantial differences. For instance, the lowest FA value obtained using a 12-direction imaging session on Scanner 1 was approximately 2.6 SD below the mean FA value obtained on 12-direction scans on Scanner 2. This fact is especially important for two reasons. First, the two MR scanners were essentially the same instruments, even down to the employment of the same software and very similar head coils. Hence, very similar imaging environments can produce different FA values that clinicians might incorrectly determine to be meaningful on individual patient assessments. Second, in many imaging studies that use DTI as a biomarker to distinguish normal brain tissue from diseased tissue (e.g. traumatic brain injury (TBI)), a threshold of 2 SD has been used.12 Thus, a finding of an FA value on one MR scanner that is 2.6 SD below the mean FA value of a second scanner of the same vendor and model and using a similar head coil is important. To place this finding in perspective, a large percentage of the abnormal FA values in a study of blast-related TBI patients were within 2.6 SD of the normal mean.12 Similarly, the 6% difference of the lowest FA value on Scanner 1 relative to the mean FA value of Scanner 2 is similar to the degree of difference in mean FA values between normal individuals and mild TBI (mTBI) patients in a number of studies.
For instance, in one article comparing individuals diagnosed with mTBI against normal controls, the decreases in FA values were 7% in the splenium of the corpus callosum, 6% in the genu of the corpus callosum, and 4% in the posterior limb of the internal capsule.13 A number of other studies have reported a similar degree of difference between normal individuals and mTBI patients.14,15
Our study found significantly higher FA values on the 12-direction scans than on the 64-direction scans. A study in humans has also found a significant difference in FA values as a function of the number of diffusion-encoding directions.4 Interestingly, that study found that the direction of change in FA values when increasing the number of diffusion-encoding gradients depended on the FA value already seen on scans with a relatively low number of diffusion-encoding gradients. Specifically, in regions with high FA values on scans obtained with relatively few diffusion-encoding gradients, FA values increased with an increasing number of diffusion-encoding gradients. This increase was associated with an increase in the contrast-to-signal variance ratio between the region of white matter studied and adjacent regions. In regions with low FA values on scans using relatively few diffusion-encoding gradients, however, the FA value decreased on scans with increasing numbers of diffusion-encoding gradients. It is possible that the decrease in FA values on 64-direction scans in our study is due to a difference in the contrast-to-noise ratios between 12-direction and 64-direction scans. Nonetheless, the exact cause is not yet determined and will be a matter of future study.
In summary, our initial findings from the use of the DTI phantom indicate relatively good reproducibility both for 12-direction scans and 64-direction scans on any single MR scanner. In some instances, however, FA values from an individual scan were significantly below the mean value of another scanner. Importantly, such a decrease was of a degree similar to that noted in studies of mTBI patients. This fact has important implications for clinical use of DTI, for it indicates that low FA values on one MR scanner could be significantly higher on another MR scanner. Finally, we found that FA values from 64-direction scans were significantly different from those obtained from 12-direction scans, even on the same MR scanner. This fact emphasizes the need to avoid combining findings from different imaging protocols in a common database.
Funding
This work was supported by the Radiological Society of North America/Quantitative Imaging Biomarkers Alliance, and Round 6 QIBA Project Subcontract Award HHSN268201500021C (Provenzale, principal investigator). This material is also based in part upon work supported by the U.S. Army Medical Research and Materiel Command and the U.S. Department of Veterans Affairs Chronic Effects of Neurotrauma Consortium under Award No. W81XWH-13-2-0095. The U.S. Army Medical Research Acquisition Activity, 820 Chandler Street, Fort Detrick MD 21702-5014 is the awarding and administering acquisition office and the Chronic Effects of Neurotrauma Consortium/Veterans Affairs Rehabilitation Research & Development project F1880, US Army 12342013 (W81XWH-12-2-0139), and Naval Health Research Center (W911QY-15-C-0043). Any opinions, findings, conclusions or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the views of the U.S. Government, or the U.S. Department of Veterans Affairs, and no official endorsement should be inferred.
Conflict of interest
Dr James M Provenzale, Dr Bryan A Taylor, Dr Elizabeth A Wilde and Dr Michael Boss have nothing to disclose. Dr Walter Schneider is associated with the Phantom Metrics Division of Psychology Software Tools Inc., the company that manufactures and distributes the phantom described in this study.
References
- 1. Hakulinen U, Brander A, Ryymin P, et al. Repeatability and variation of region-of-interest methods using quantitative diffusion tensor MR imaging of the brain. BMC Med Imaging 2012; 12: 30.
- 2. Hasan KM, Gupta RK, Santos RM, et al. Diffusion tensor fractional anisotropy of the normal-appearing seven segments of the corpus callosum in healthy adults and relapsing-remitting multiple sclerosis patients. J Magn Reson Imaging 2005; 21: 735–743.
- 3. Takao H, Hayashi N, Kabasawa H, et al. Effect of scanner in longitudinal diffusion tensor imaging studies. Hum Brain Mapp 2012; 33: 466–477.
- 4. Giannelli M, Cosottini M, Michelassi MC, et al. Dependence of brain DTI maps of fractional anisotropy and mean diffusivity on the number of diffusion weighting directions. J Appl Clin Med Phys 2009; 11: 2927.
- 5. Giannelli M, Belmonte G, Toschi N, et al. DTI measurements of fractional anisotropy and mean diffusivity at 1.5T: Comparison of two radiofrequency head coils with different functional designs and sensitivities. Med Phys 2011; 38: 3205–3211.
- 6. Marenco S, Rawlings R, Rohde GK, et al. Regional distribution of measurement error in diffusion tensor imaging. Psychiatry Res 2006; 147: 69–78.
- 7. Vollmar C, O’Muircheartaigh J, Barker GJ, et al. Identical, but not the same: Intra-site and inter-site reproducibility of fractional anisotropy measures on two 3.0 T scanners. Neuroimage 2010; 51: 1384–1394.
- 8. Mirzaalian H, de Pierrefeu A, Savadjiev P, et al. Harmonizing diffusion MRI data across multiple sites and scanners. Med Image Comput Comput Assist Interv 2015; 9349: 12–19.
- 9. Guise C, Fernandes MM, Nobrega JM, et al. Hollow polypropylene yarns as a biomimetic brain phantom for the validation of high-definition fiber tractography imaging. Appl Mater Interface 2016; 9: 29960–29967.
- 10. Jenkinson M, Beckmann CF, Behrens TE, et al. FSL. Neuroimage 2012; 62: 782–790.
- 11. Andersson J, Jenkinson M and Smith S. Non-linear registration, aka spatial normalisation. Technical Report TR07JA2, Oxford Centre for Functional Magnetic Resonance Imaging of the Brain, Department of Clinical Neurology, Oxford University, Oxford, UK, http://www.fmrib.ox.ac.uk/analysis/techrep (2007, accessed 11 March 2018).
- 12. MacDonald CL, Johnson AM, Cooper D, et al. Detection of blast-related traumatic brain injury in U.S. military personnel. N Engl J Med 2011; 364: 2091–2100.
- 13. Matsushita M, Hosoda K, Naitoh Y, et al. Utility of diffusion tensor imaging in the acute stage of mild to moderate traumatic brain injury for detecting white matter lesions and predicting long-term cognitive function in adults. J Neurosurg 2011; 115: 130–139.
- 14. Kraus ME, Susmaras T, Caughlin BP, et al. White matter integrity and cognition in chronic traumatic brain injury: A diffusion tensor imaging study. Brain 2007; 130: 2508–2519.
- 15. Lo C, Shifteh K, Gold T, et al. Diffusion tensor imaging abnormalities in patients with mild traumatic brain injury and neurocognitive impairment. J Comput Assist Tomogr 2009; 33: 293–297.