Abstract
Diffusion tensor imaging is used to measure the diffusion of water in tissue. The diffusion properties carry information about the relative organization and structure of the underlying tissue. In the case of a single voxel containing both tissue and a fast diffusing component such as free water, a single diffusion tensor is no longer appropriate. A two-tensor free water elimination model has previously been proposed to correct for the case of volume mixing. Here, this model was implemented in a straightforward but novel manner without the use of spatial constraints. The optimal acquisition parameters were investigated through Monte Carlo simulations and human brain imaging studies. At a signal-to-noise ratio of 40 with 64 diffusion-weighted encoding images, the most accurate estimates of fast diffusion signal were obtained with two diffusion-weighted shells (b-value in s/mm^2 × number of directions) of 500×32 and 1500×32. The potential bias in fractional anisotropy induced by this two-compartment model was more than an order of magnitude less than the error of using the single diffusion tensor model in the presence of partial volume effects with free water. This strategy may be useful for characterizing the diffusion of tissues adjacent to cerebrospinal fluid (CSF) and of tissues affected by edema, and for removing artifacts from blurring and ghosting of the CSF signal.
Keywords: Diffusion Tensor Imaging, DTI, Diffusion Modeling, Free Water
1. Introduction
Diffusion weighted imaging (DWI) is a non-invasive magnetic resonance imaging (MRI) technique capable of measuring properties that describe the molecular displacements of water in biological tissues. Diffusion tensor imaging (DTI), an application of DWI, is used to quantify the three-dimensional movement of water with the assumption that simple Gaussian diffusion is a good descriptor of the water diffusion within a voxel (Basser et al., 1994). The most common application of DTI is brain imaging. In this application, the diffusion information is used to draw conclusions about brain architecture and microstructure. DTI has shown a great deal of utility in routine clinical use, as well as in brain research (Alexander et al., 2007).
The relative ease or resistance to diffusion along any single direction yields information about tissue structure and organization. Free water, which is characterized by uninhibited movement, displays isotropic diffusion and an apparent diffusion coefficient (ADC) of roughly 3 × 10−3 mm2/s (Alexander et al., 2001). Meanwhile, more structured tissue such as white matter (WM) displays distinctly anisotropic diffusion. In the brain, free water exists as cerebrospinal fluid (CSF) in the ventricles and bordering the parenchyma of the brain. Grey matter (GM) is characterized by a lower degree of anisotropy than white matter, as well as more hindered diffusion than CSF. GM and WM both have an ADC of approximately 0.8 × 10−3 mm2/s (Sener, 2001).
The Free Water Elimination (FWE) model seeks to remove the deleterious effect of CSF partial volume effects on diffusion measurements. While the initial description of this two-compartment diffusion model described using multiple b-values (Pierpaoli and Jones, 2004), more recent implementations estimated the fast diffusing component using only a single b-value acquisition with local spatial constraints on the model (Pasternak et al., 2009). This approach is ill-posed without these constraints and assumptions. The FWE model is solvable using multiple b-value measurements, yet few studies have applied these schemes for DTI (Pasternak et al., 2012a). None of these implementations have undertaken a rigorous assessment of the accuracy of the FWE fitting. Likewise, a determination of the best acquisition parameters has not been investigated. This work sets out to determine the accuracy and reliability of the fitting as well as what acquisition is best suited to fitting the FWE model.
Recent work with more advanced diffusion models such as diffusion basis spectrum imaging (DBSI) (Wang et al., 2011a), neurite orientation distribution diffusion imaging (NODDI) (Zhang et al., 2012), multiple fascicle models (Scherrer and Warfield, 2012), and combined hindered and restricted diffusion (CHARMED) imaging (Assaf et al., 2004a) have included an isotropic free water component in their models. While there is growing interest in these complex models of diffusion for characterizing brain tissue microstructure, the acquisition times are long and computational demands for these models are high relative to clinical DTI protocols. However, the simple DTI model with an additional fast diffusion compartment may provide a rapid and simple model for estimating and removing fast diffusion effects in many DWI studies.
This work sets out to develop a relatively simple, yet novel, method for estimating the fast diffusing component and the underlying tissue parameters. The DWI protocol for the FWE DTI model was optimized through successive simulations that took into account different experimentally realistic factors in the optimization. All the while, clinical feasibility, defined as a maximum of 70 DWI measurements, was maintained. Thus, the acquisition was fixed at 64 diffusion-weighted gradient directions plus 6 b = 0 images, which corresponded to a minimum whole-brain imaging time of 6 minutes and 30 seconds at 2.5 mm isotropic resolution on our MRI system.
2. Materials and Methods
In this section we introduce our two-compartment FWE DTI model and describe two independent, but complementary, methods to solve the FWE DTI model. The first method is a weighted linear least squares method using a brute force region contraction approach to solve for the free water component. Although not as robust or accurate as our second approach, this method has the advantage of being computationally efficient and serves well as either a stand-alone estimate or the initial guess for non-linear search methods. The second method is a modified Newton’s method approach (Koay et al., 2006) with a dynamically adjusted dampening parameter to control step size and direction.
2.0 Free Water Elimination Model
Errors arise in DTI when the tissue within a single voxel is a mixture of multiple tissue types resulting in partial volume effects (Alexander et al., 2001). A two-compartment model has been used previously to estimate the diffusion characteristics of brain tissue in the presence of the partial volume effect with a fast diffusing component such as free water (Pierpaoli and Jones, 2004). The tissue compartment, which could be white or grey matter, is modeled as a tensor just as in DTI. The fast diffusing compartment is modeled as having isotropic diffusion with a fixed diffusivity equal to the theoretical expected diffusivity of unhindered water at body temperature. The relative signal contribution of the fast diffusing component is described by f, a scalar volume fraction. The free water elimination DTI signal model is described by
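$$S_i = S_0\left[f\,e^{-b_i D_{iso}} + (1-f)\,e^{-b_i\,\mathbf{g}_i^{T}\mathbf{D}_{tissue}\,\mathbf{g}_i}\right] \tag{1}$$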
where Si and S0 are the signal from the i-th diffusion and non-diffusion weighted measurements, respectively, Diso = 3 × 10−3 mm2/s is the free water diffusivity, Dtissue is the tissue diffusion tensor, and bi and gi are the diffusion-weighting amplitude (in s/mm2) and unit gradient encoding vector, respectively. This two-compartment model is attractive because of its similarity to DTI. The tissue signal compartment results in the same scalar metrics of DTI. The addition of the isotropic compartment is intended to compensate for confounding partial volume effects from CSF and also edema. This will improve the ability to characterize tissue parenchyma microstructure in voxels with partial volume averaging and multiple diffusion components.
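As an illustration only, the two-compartment signal can be sketched in Python. The sketch assumes the usual form of the model, Si = S0 [f exp(−bi Diso) + (1 − f) exp(−bi giᵀ Dtissue gi)]; the function and argument names are hypothetical and are not taken from the authors' MATLAB routines.

```python
import math

D_ISO = 3.0e-3  # free water diffusivity in mm^2/s, the fixed value used in the text


def fwe_signal(s0, f, b, g, d_tissue, d_iso=D_ISO):
    """Two-compartment signal for one measurement (assumed form of Equation 1).

    s0       : non-diffusion-weighted signal
    f        : free water volume fraction, 0 <= f <= 1
    b        : b-value in s/mm^2
    g        : unit gradient vector (gx, gy, gz)
    d_tissue : 3x3 symmetric tissue tensor in mm^2/s (nested lists)
    """
    # Apparent tissue diffusivity along g, i.e. g^T D g
    adc = sum(g[r] * d_tissue[r][c] * g[c] for r in range(3) for c in range(3))
    free_water = f * math.exp(-b * d_iso)
    tissue = (1.0 - f) * math.exp(-b * adc)
    return s0 * (free_water + tissue)
```

With f = 0 the expression reduces to the single-tensor DTI signal, and with f = 1 it reduces to pure free water decay, which is a convenient sanity check on any implementation.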
2.1 Fitting procedures
Initially the tissue compartment tensor was fit using a weighted linear least squares (WLLS) region contraction approach. This was accomplished by recasting Equation 1 as
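$$\frac{S_i - S_0 f\,e^{-b_i D_{iso}}}{1-f} = S_0\,e^{-b_i\,\mathbf{g}_i^{T}\mathbf{D}_{tissue}\,\mathbf{g}_i} \tag{2}$$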
and solving for Dtissue using fixed f values. A weighted linear least squares (WLLS) estimation was carried out to estimate the diffusion tensor, Dtissue, (Koay et al., 2006) for each fixed value of f and Diso = 3 ×10−3 mm2/s was set as a constant. The WLLS result was a diffusion tensor that best fit the measured data for a corresponding volume fraction. This procedure was carried out for a range of f between 0 and 1.
Once a (f, Dtissue) pair was calculated, the WLLS objective function
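$$f_{WLLS}(\gamma) = \frac{1}{2}\sum_{i=1}^{m}\omega_i^{2}\left(y_i - \sum_{j=1}^{7} W_{ij}\gamma_j\right)^{2} \tag{3}$$
was used to judge which (f, Dtissue) best fit the measured data. Here i = 1,…, m, where m is the number of images obtained, si is the measured signal including noise, ωi is the weight for each si, W is the m × 7 diffusion encoding matrix, and yi and γj are defined differently for DTI and FWE as described below. The weights were set equal to the signal magnitude (ωi = si). This technique emphasizes acquisitions with higher signal that result from lower b-value shells and gradient directions aligned with lower diffusion distances. The diffusion encoding matrix is
$$\mathbf{W} = \begin{pmatrix} 1 & -b_1 g_{1x}^{2} & -b_1 g_{1y}^{2} & -b_1 g_{1z}^{2} & -2b_1 g_{1x}g_{1y} & -2b_1 g_{1y}g_{1z} & -2b_1 g_{1x}g_{1z} \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & -b_m g_{mx}^{2} & -b_m g_{my}^{2} & -b_m g_{mz}^{2} & -2b_m g_{mx}g_{my} & -2b_m g_{my}g_{mz} & -2b_m g_{mx}g_{mz} \end{pmatrix}$$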
Smaller objective function values indicate a better fit. Thus, by systematically fitting diffusion tensors to many volume fractions and then evaluating which (f, Dtissue) minimizes the objective function, it is possible to determine which isotropic volume fraction and tissue compartment tensor best represents the measured data.
To identify the optimum f would require many small steps in f from zero to one, which would be cumbersome and time consuming. However, the WLLS routine was further modified so that multiple volume fractions could be fit simultaneously. Furthermore, the total number of fittings was greatly reduced by systematically refining the size of the Δf step.
This was implemented by simultaneously solving for the diffusion tensor with multiple f-values. Initially, a coarse Δf step size of 0.1 was used over the range from zero to one. The objective function for each of these initial eleven (f, Dtissue) pairs was evaluated, with the lowest value being passed on as the best estimate. The next iteration used steps of 0.01 over the range of the previous best estimate ± 0.05. A third step reduced the step size by another order of magnitude.
The method did not display any drop off in estimation accuracy compared to the use of an initial step size of 0.001. However, the series of refined steps reduced the number of WLLS estimations from 1001 to 31 per voxel.
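The region contraction over f can be sketched as follows. This is a minimal illustration, not the authors' routine: `objective` is a hypothetical callable that returns the WLLS objective value of the best-fitting tensor for a given f, and the sketch assumes that the best f from the previous level is not re-evaluated, which is one way to arrive at 11 + 10 + 10 = 31 evaluations.

```python
def refine_f(objective, steps=(0.1, 0.01, 0.001)):
    """Coarse-to-fine search for the f in [0, 1] that minimizes objective(f).

    Each level evaluates the current center +/- 5 steps, then recenters on
    the best f found so far and shrinks the step by a factor of ten.
    """
    center, best_f, best_val, n_evals = 0.5, None, float("inf"), 0
    for level, step in enumerate(steps):
        for k in range(-5, 6):
            f = round(center + k * step, 6)
            if f < 0.0 or f > 1.0 or (level > 0 and k == 0):
                continue  # outside [0, 1], or already evaluated at the previous level
            val = objective(f)
            n_evals += 1
            if val < best_val:
                best_f, best_val = f, val
        center = best_f
    return best_f, n_evals
```

For example, `refine_f(lambda f: (f - 0.237) ** 2)` contracts from 0.2 to 0.24 to 0.237. The search is only as reliable as the assumption that the objective has a single minimum within each contracted window.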
We investigated the search space of f by comparing the value of the objective function across the entire range of f values from f=0 to f=1 using an increment of 0.001. This was performed independently for 50 voxels from an in vivo data set with an SNR of 36. The voxels were chosen to represent a wide range of different f values, as determined by the WLLS procedure described above, as well as widespread anatomical locations within the brain. The objective function was manually inspected for position of the global minimum compared to local minima.
The WLLS routines for DTI and FWE-DTI are similar, as both share the same objective function with some modification. There are two conceptual differences in the WLLS routines and both are manifest in the definition of y. As discussed earlier, equation 1 is recast to account for the isotropic compartment, which leads to a change in the definition of y.
DTI
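For WLLS, the natural log transformed signal y is an m x 1 vector
$$\mathbf{y} = \left[\ln(s_1),\ \ln(s_2),\ \ldots,\ \ln(s_m)\right]^{T}$$
and the model solution is defined by the parameter vector γ
$$\gamma = \left[\ln(S_0),\ D_{xx},\ D_{yy},\ D_{zz},\ D_{xy},\ D_{yz},\ D_{xz}\right]^{T}$$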
FWE-DTI
The WLLS method works on the natural log transformed free-water adjusted diffusion-weighted signal on the left side of Equation 2:
where k = 1,…, n and n = the number of f-values fitted simultaneously
The model solution
is an m x n matrix where n is the number of f-values to be fit simultaneously. This parameter matrix is estimated by
where S is a diagonal matrix with the measured diffusion signal as the nonzero elements:
The volume fraction and fitted tensor pair, which minimized the WLLS objective function, were passed on as the initial estimate for a nonlinear fitting. The nonlinear optimization used a modified Newton’s method approach (Koay et al., 2006) that explicitly calculated the Hessian at each iteration. In the context of diffusion tensor estimation, the inclusion of the full Hessian matrix has been shown to be superior (Koay et al., 2006). The definition of the dual compartment nonlinear least squares (dcNLS) objective function for the FWE model, its gradient vector, Hessian matrix, and search step vector are shown below.
Here λ is the Levenberg-Marquardt parameter and I is an identity matrix of the same dimensions as the Hessian. This parameter dampens the search step, with larger λ values corresponding to greater dampening and λ = 0 simply being Newton’s method. The dampening allows for an increased likelihood of convergence over Newton’s method when the initial parameter estimates are far from the true minimum. When the initial estimate is near the minimum, a smaller λ and, thus, less dampening will result in faster convergence. The exact definition of small and large λ is relative to the values in the Hessian. Further details on the values used and other implementation details can be found in the Appendix.
After each step that reduced the nonlinear objective function, λ was reduced by dividing it by some incremental factor (λinc), which was a positive number greater than one. If the step increased the objective function, then it was rejected and λ was increased by multiplying by λinc.
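The accept/reject rule for λ can be sketched on a generic objective. This is a minimal illustration under stated assumptions: the search step is taken to solve (H + λI) δ = −∇f, the callables `objective`, `gradient`, and `hessian` are hypothetical placeholders for the dcNLS quantities, and the default values of `lam` and `lam_inc` are arbitrary rather than those given in the Appendix.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def damped_newton(objective, gradient, hessian, x0, lam=1.0, lam_inc=10.0,
                  tol=1e-10, max_iter=200):
    """Newton's method with a Levenberg-Marquardt style damping parameter."""
    x, fx = list(x0), objective(x0)
    n = len(x)
    for _ in range(max_iter):
        g, h = gradient(x), hessian(x)
        if max(abs(gi) for gi in g) < tol:
            break  # gradient is numerically zero
        # Damped system: (H + lam * I) step = -g
        damped = [[h[r][c] + (lam if r == c else 0.0) for c in range(n)]
                  for r in range(n)]
        step = solve(damped, [-gi for gi in g])
        trial = [xi + si for xi, si in zip(x, step)]
        f_trial = objective(trial)
        if f_trial < fx:
            x, fx = trial, f_trial  # accept the step and relax the damping
            lam /= lam_inc
        else:
            lam *= lam_inc          # reject the step and damp more strongly
    return x
```

Note that a starting value of λ = 0 could never be increased by multiplication, so the sketch assumes a strictly positive initial λ.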
Based on an estimate of the goodness of fit of the initial parameters determined by the WLLS fitting, a certain λ and λinc were set for the nonlinear fitting. It has been shown (Koay et al., 2006) that an unbiased estimate of the diffusion signal variance is formed by $\hat{\sigma}^{2} = \frac{1}{\nu}\sum_{i=1}^{m}\left(s_i - \hat{s}_i\right)^{2}$, where $\hat{s}_i$ is the fitted signal; ν = m − p is the number of degrees of freedom; m is the number of samples; p is the number of parameters being fit. Thus, an estimate of the signal to noise ratio, here called the pseudo signal to noise ratio (SNRp), can be defined as $SNR_p = S_0/\hat{\sigma}$, where S0 is the average signal of the b = 0 acquisitions. This SNRp measure was used to determine the goodness of fit of each initial estimate.
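A short sketch of the SNRp calculation follows. It assumes that the variance estimate is the residual sum of squares divided by the degrees of freedom ν = m − p, and that p = 8 for the FWE model (f, S0, and six tensor elements); the function name and arguments are hypothetical.

```python
import math


def pseudo_snr(measured, fitted, s0, n_params=8):
    """Pseudo SNR: mean b = 0 signal divided by the residual standard deviation.

    measured, fitted : sequences of measured and model-predicted signals
    s0               : average signal of the b = 0 acquisitions
    n_params         : number of fitted parameters p (assumed 8 for FWE)
    """
    dof = len(measured) - n_params  # nu = m - p
    if dof <= 0:
        raise ValueError("need more measurements than fitted parameters")
    rss = sum((s - s_hat) ** 2 for s, s_hat in zip(measured, fitted))
    sigma = math.sqrt(rss / dof)
    return s0 / sigma if sigma > 0 else float("inf")
```

A poor initial estimate leaves large residuals and therefore a low SNRp, which is what makes the measure usable for choosing the starting λ and λinc.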
It was also necessary to reinitialize some voxels prior to the nonlinear fitting. It was observed that high isotropic volume fractions (f > 0.7) occasionally were fit by WLLS with an estimated f < 0.05 and a nearly isotropic tensor with a mean diffusivity slightly below or above the fixed value of 3 ×10−3 mm2/s. Therefore, any voxel with a tensor-compartment mean diffusivity greater than 1.5 ×10−3 mm2/s was reset as having an f-value of 0.5 and its tensor components divided by two. Though this also represented a crude initial estimate, it was an improvement in the case of the WLLS fit procedure failing. This relative weakness in the WLLS fitting is remedied by the NLS fitting, as evidenced by the small estimation bias and standard deviation reported in the Results section.
Both the WLLS and NLS routines constrained the volume fraction to be 0 ≤ f ≤ 1. However, neither method constrained the tensor to be positive definite. Computational times for WLLS and NLS routines for a single data set were roughly 1.5 hours and 20 hours, respectively. These routines were written in MATLAB (R2013a) by a group member with no formal programming training and without optimization for speed.
2.2 Acquisition Optimization
Simulations were initially conducted to explore the range of b-values for designs with 2, 3, 4, 6, 8, and 16 different shells that produced the best (f, Dtissue) pair. The total number of nonzero directions was fixed at 64 and the number of directions in each shell was kept as even as possible. Any additional directions were assigned to the highest b-value. Thus, the two-shell design featured 32 directions per shell, while the 3-shell design had a shell structure of 21/21/22 directions. The b-values were linearly spaced between the minimum and maximum value for each scenario. The distribution of directions for each shell was determined using the sparse and optimal acquisition method (Koay et al., 2012).
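The shell layout rules described above (linearly spaced b-values, directions split as evenly as possible, and any remainder assigned to the outer shell) can be sketched as follows. The function name is hypothetical, b-values are rounded to the nearest integer for display, and the placement of directions within each shell, which used the sparse and optimal acquisition method, is not reproduced here.

```python
def shell_design(n_shells, b_min, b_max, n_dirs=64):
    """Return (b_values, directions_per_shell) for a multi-shell design."""
    if n_shells < 2:
        raise ValueError("at least two shells are required")
    step = (b_max - b_min) / (n_shells - 1)
    b_values = [round(b_min + k * step) for k in range(n_shells)]
    base, extra = divmod(n_dirs, n_shells)
    dirs = [base] * n_shells
    dirs[-1] += extra  # leftover directions go to the highest b-value
    return b_values, dirs
```

For instance, `shell_design(3, 500, 1500)` gives b-values of 500/1000/1500 s/mm2 with 21/21/22 directions.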
The minimum b-values investigated varied from 200 to 800 s/mm2 in increments of 100 s/mm2. A value of 200 s/mm2 was established as the minimum b-value to ensure that perfusion effects were avoided (Turner et al., 1990). The maximum b-value varied from 300 to 1500 s/mm2, once again in steps of 100 s/mm2. The maximum b-value was always at least 100 s/mm2 greater than the minimum value. Also, the outer shell was limited to no higher than 1500 s/mm2 to minimize non-Gaussian diffusion effects in the signal (Assaf et al., 2004b).
Data was simulated utilizing the two-compartment model with f fixed at 0.5. The choice of this f-value was to create a scenario that balances the effect of parenchymal tensor and CSF isotropic compartments. At any extended interface between CSF and tissue, the distribution of f values will have a mean of 0.5. Thus, we did not want to bias the optimization towards high or low f-values.
Two separate tensor shapes were considered independently. First, a prolate tissue compartment tensor shape was considered. The eigenvalues were λ1 = 1.6 ×10−3, λ2 = 0.5 ×10−3, and λ3 = 0.3 ×10−3 mm2/sec. The tensor had a trace of 2.4 ×10−3 mm2/sec and a fractional anisotropy (FA) value of 0.712. These values were chosen as representative of voxels in a major white matter pathway such as the corpus callosum.
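The quoted trace and FA follow directly from the eigenvalues. The sketch below uses the standard definition of fractional anisotropy; the function name is hypothetical.

```python
import math


def fractional_anisotropy(l1, l2, l3):
    """Fractional anisotropy from the three tensor eigenvalues."""
    mean = (l1 + l2 + l3) / 3.0
    num = (l1 - mean) ** 2 + (l2 - mean) ** 2 + (l3 - mean) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    return math.sqrt(1.5 * num / den) if den > 0 else 0.0


prolate = (1.6e-3, 0.5e-3, 0.3e-3)           # eigenvalues in mm^2/s
trace = sum(prolate)                         # 2.4e-3 mm^2/s
fa = fractional_anisotropy(*prolate)         # approximately 0.712
```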
Additionally, an isotropic tensor was examined. Like the prolate tensor, this tensor had a trace of 2.4 ×10−3 mm2/sec, however, its eigenvalues were λ1 = λ2 = λ3 = 0.8 ×10−3 mm2/sec. While these two tensors were used for determination of the optimal acquisition, additional tensors were simulated to characterize the performance of the reconstruction. These tensors maintained a trace of 2.4 ×10−3 mm2/sec with FA increments of approximately 0.1 from 0 to 0.8.
The metrics used to judge estimation accuracy were the mean squared errors (MSE) of FA, trace, and f. Mean squared error was a convenient metric because it incorporates both the bias and variance of the estimator: $MSE = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{\theta}_i - \theta\right)^{2}$, where n is the number of measurements, $\theta$ is the true parameter value and $\hat{\theta}_i$ is the estimated parameter. For an estimator, MSE is equivalent to the sum of the variance and squared bias of the estimator (Montgomery et al., 1987).
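The decomposition of MSE into variance and squared bias can be checked numerically. In this sketch the function name is hypothetical and the variance is the population variance (divided by n), which is the form for which the identity holds exactly.

```python
def mse_decomposition(estimates, truth):
    """Return (mse, bias, variance) of a list of estimates of a known value."""
    n = len(estimates)
    mean_est = sum(estimates) / n
    mse = sum((e - truth) ** 2 for e in estimates) / n
    bias = mean_est - truth
    variance = sum((e - mean_est) ** 2 for e in estimates) / n
    return mse, bias, variance
```

Calling `mse, bias, var = mse_decomposition(fa_estimates, 0.712)` on any list of simulated FA estimates should give `mse` equal to `var + bias ** 2` up to rounding error.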
Monte Carlo simulations were conducted where the ideal signal was corrupted with Rician noise across a plausible range of signal to noise ratios (SNR) as established by Pierpaoli and Basser (Pierpaoli and Basser, 1996). The SNR was set relative to the b = 0 image. To ensure rotational invariance, each synthetic tensor was simulated in 120 unique orientations evenly spaced about a hemisphere. At each tensor orientation, 100 iterations were performed.
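One common way to corrupt a noise-free signal with Rician noise is to add independent Gaussian noise to the real and imaginary channels and take the magnitude. The sketch below follows that approach as an assumption about the simulation, with σ = S0/SNR because the SNR was set relative to the b = 0 image; the function name is hypothetical.

```python
import math
import random


def add_rician_noise(signal, s0, snr, rng):
    """Return one Rician-distributed sample around a noise-free signal.

    signal : noise-free magnitude signal
    s0     : noise-free b = 0 signal, which defines the noise level
    snr    : signal-to-noise ratio relative to the b = 0 image
    rng    : a random.Random instance (seed it for reproducibility)
    """
    sigma = s0 / snr
    real = signal + rng.gauss(0.0, sigma)
    imag = rng.gauss(0.0, sigma)
    return math.hypot(real, imag)
```

A Monte Carlo run would then call this once per measurement, for each of the 100 iterations at each of the 120 tensor orientations.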
Once the optimal b-values were determined for each design, a more thorough characterization of estimation performance was carried out. This was accomplished by investigating a range of f-values from 0 to 1 in increments of 0.1 and SNRs from 20 to 60 in increments of 10. Once again, 100 iterations for each of the 120 orientations were performed for each SNR and f-value.
The largest b-value determines the minimum echo time (TE), which induces T2 signal attenuation. In a single acquisition or simulation, the TE remained constant to mitigate any confounding signal variations. However, when comparing two separate acquisitions, it is imperative to consider the drop in overall SNR associated with higher b-values. Though a rough estimate may be achieved with reasonable assumptions, the exact change in signal depends on the T2 of the tissue and a number of factors that determine the TE. Here, a single T2 is assumed. Using TE values recorded from our 3T scanner (MR750, GE Healthcare, Waukesha, WI) and an approximate white matter T2 of ≈ 70 ms at 3T (Stanisz et al., 2005), a relative SNR scaling factor was established to correspond to the largest b-value. While the nominal SNR may have been 40, the simulated SNR for max-b = 500 was larger than that of the simulation with max-b = 1500. The nominal SNR was set such that max-b = 1000 had a scaling factor of unity.
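Under the single-T2 assumption, the relative SNR scaling follows a mono-exponential decay in TE. The sketch below is illustrative only: the function name is hypothetical, and the TE values passed to it must come from the scanner, since the TE for each maximum b-value is not listed here.

```python
import math


def snr_scale(te_ms, te_ref_ms, t2_ms=70.0):
    """SNR scaling relative to a reference acquisition, assuming a single T2.

    te_ms     : echo time of the acquisition being scaled (ms)
    te_ref_ms : echo time of the reference acquisition, here max-b = 1000 (ms)
    t2_ms     : assumed white matter T2 at 3T (ms)
    """
    return math.exp(-(te_ms - te_ref_ms) / t2_ms)
```

A shorter TE than the reference gives a factor above one and a longer TE gives a factor below one, consistent with the simulated SNR for max-b = 500 exceeding that for max-b = 1500.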
To arrive at a “best” acquisition, the MSE for each f-value was weighted by the frequency of occurrence of that f-value in vivo. An in vivo data set was used to determine the frequency of each f-value in bins of 0.1 to match the simulated f-values. This frequency was normalized such that the sum of all bins was one. Table 1 contains the weights obtained from a single healthy volunteer. These values were averaged from the two-shell and eight-shell acquisitions. The relative frequency was then used as a weighting for each MSE to allow for a single weighted mean squared error (wMSE) for each acquisition that reflects the expected global performance of the acquisition.
Table 1.
The relative frequency of f-values within the brain of a healthy volunteer. These values were used to weight the MSE values to calculate a single wMSE for each acquisition.
| f-value bins | <0.05 | 0.05–0.15 | 0.15–0.25 | 0.25–0.35 | 0.35–0.45 | 0.45–0.55 | 0.55–0.65 | 0.65–0.75 | 0.75–0.85 | 0.85–0.95 | >0.95 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Relative frequency | 0.14 | 0.26 | 0.27 | 0.10 | 0.05 | 0.04 | 0.03 | 0.03 | 0.02 | 0.01 | 0.04 |
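The weighting in Table 1 can be applied as in the following sketch. The tabulated frequencies sum to 0.99 after rounding, so the sketch renormalizes them; whether the original calculation did the same is an assumption, and the function name is hypothetical.

```python
# Relative frequencies from Table 1, for simulated f = 0.0, 0.1, ..., 1.0
WEIGHTS = [0.14, 0.26, 0.27, 0.10, 0.05, 0.04, 0.03, 0.03, 0.02, 0.01, 0.04]


def weighted_mse(mse_per_f, weights=WEIGHTS):
    """Frequency-weighted mean of the MSE values obtained at each simulated f."""
    if len(mse_per_f) != len(weights):
        raise ValueError("one MSE value is required per f-value bin")
    total = sum(weights)  # 0.99 for the rounded values in Table 1
    return sum(w * e for w, e in zip(weights, mse_per_f)) / total
```

Because the in vivo frequencies are concentrated below f = 0.25, the wMSE is dominated by estimation performance at low free water fractions.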
To investigate the effect of directional organization, i.e. how many independent directions are in each shell, the “best” design (i.e. optimal number of b-values) was simulated letting the distribution of directions in each shell vary along with the minimum b-value while keeping the total number of directions 64.
A comparison to a “standard” Levenberg-Marquardt (LM) fitting procedure is included. In this implementation only the “best” design is used for the LM optimization.
2.3 Model Insufficiencies / Acquisition Effects
The effects of using an incorrect underlying model and a non-standard acquisition were investigated. This included the case where a standard DTI acquisition and estimation was used when the underlying data was actually the two-compartment model. The effect of using the free water elimination method and the optimized acquisition when the underlying signal was generated with a DTI model was also investigated.
For the standard DTI acquisition, a single non-zero b-value of 1000 s/mm2 was used for all 64 unique gradient directions along with six b = 0 images. The FWE acquisition was the optimized acquisition as was determined by the simulation steps. The simulations were carried out in the manner discussed above. The gradient directions for the DTI and FWE scans were the same.
2.4 Example Human Brain Acquisitions
A single healthy volunteer was scanned with the optimized protocol. This design was repeated twice in each of two sessions conducted a day apart. All scanning was performed with informed consent and in compliance with an approved protocol from the Institutional Review Board. Diffusion scans were performed with a 3T MR750 Discovery scanner (General Electric Medical Systems, Milwaukee, WI) and an 8-channel head coil. MR parameters for the optimized scan were TE = 68.9 ms, TR = 5600 ms, acquisition matrix = 96 × 96, number of slices = 50, voxel size = 2.5 × 2.5 × 2.5 mm3, and scan time = 6:30 min. Diffusion parameters were Δ = 24.0 ms and δ = 18.0 ms. The image was then interpolated in the axial view to a matrix of 256 × 256 and a display resolution of 0.94 × 0.94 × 2.5 mm3.
3. Results
Figure 1 shows the results of the search space investigation for two voxels whose f-value, FA, and MD, as determined by the modified Newton’s method, were 0.913, 0.84, and 0.45 ×10−3 mm2/s and 0.204, 0.71, and 0.56 ×10−3 mm2/s, respectively. The objective function value shown corresponds to the WLLS (f, Dtissue) for each fit f. A simplified version of the search space is displayed, as the full eight-dimensional space cannot be meaningfully visualized.
Figure 1.

The collapsed WLLS search space. All graphs have the objective function value on the vertical axis and the presumed f-value on the horizontal axis. The top and bottom rows correspond to in vivo voxels with fitted nonlinear f-values of 0.913 and 0.204, respectively. The first column depicts the entire space of presumed f-values from 0 to 1. The second column shows an expanded region ± 0.05 of the (f, Dtissue) pair from step one of the WLLS fitting. Included in the last column is an expanded depiction of the high f-value region, which displays many local minima.
Though the voxels differ in fitted f-value, they display a similar qualitative appearance. At very high f-values (> 0.97) there are many local minima; outside this region, however, the space is smooth and convex near the global minimum. This pattern was observed for all voxels that the authors investigated. Though this is a simplified view of the search space, it appears that a derivative based optimization approach is likely justified.
The first step of optimization was to establish the b-values for each acquisition scheme with different numbers of shells that minimized the mean square error. Table 2 contains a summary of each design, including number of shells, b-values for each shell, and the number of directions in each shell. A general trend is seen where the optimal minimum b-value decreases with increasing number of shells. For each design, the optimal outer shell was the highest one simulated.
Table 2.
The different designs that minimized MSE in the first step of optimization and were subsequently used for the second step of optimization.
| Number of Shells | b-values (s/mm2) | Directions per Shell |
|---|---|---|
| 2 | 500/1500 | 32 per shell |
| 3 | 500/1000/1500 | 21/21/22 |
| 4 | 400/767/1133/1500 | 16 per shell |
| 6 | 400/620/840/1060/1280/1500 | 10/10/10/10/10/14 |
| 8 | 300/471/643/814/986/1157/1329/1500 | 8 per shell |
| 16 | 300/380/460/540/620/700/780/860/940/1020/1100/1180/1260/1340/1420/1500 | 4 per shell |
Figure 2 illustrates the effects of varying the minimum and maximum b-values on the model mean squared error. Though a single case is shown, all designs shared several characteristics. The metric most sensitive to b-value was f-value, while FA and trace were less sensitive to b-value. However, for each metric the same set of b-values (500 and 1500 s/mm2) produced the lowest MSE.
Figure 2.

The scaled reciprocal of MSE corresponding to a certain minimum and maximum b-value (the value of 1 denotes the smallest MSE). MSE was evaluated for f-value (left), FA (middle), and trace (right). This set of images corresponds to the two b-value simulation and the prolate tensor. Though not shown, the other acquisitions produced qualitatively similar appearing MSE maps.
Figure 3 shows the effect of letting both the b-value and number of directions vary for the lower shell of the two-shell design. The lowest MSE was associated with a two-shell design with b-values of 500 and 1500 s/mm2 and 32 directions each. The results shown here correspond to the case of the prolate tensor. Though not shown, the same acquisition design yielded the lowest MSE for the isotropic tensor as well.
Figure 3.

The scaled reciprocal of MSE, as in Figure 2, corresponding to a certain lower shell b-value and number of directions for the prolate tensor. MSE is shown for the f-value (left), FA (middle), and trace (right). The simulated acquisition had two shells with the higher shell fixed at b = 1500 s/mm2. The combined number of diffusion directions was held constant at sixty-four.
The next optimization step examined the effect of varying SNR and f-value on the MSE of each metric. Here the two-shell acquisition is seen to be superior for estimation in every situation investigated (see Figure 4). This design resulted in the lowest MSE regardless of f-value or SNR simulated.
Figure 4.
Characterization of performance for estimation of FA, left column, and f-value, right column, for the prolate tensor. The first row depicts MSE with increasing f-value while SNR = 40 for the various acquisitions tested. The second row shows MSE with increasing SNR for f = 0.5, the same acquisitions as above are displayed.
Figure 5 shows the mean and standard deviation of the estimated FA and f-values for the simulation with SNR = 40. Here it is seen that, despite accurate estimation of the f-value, the FA was biased toward larger values as the simulated f-value increased. The bias was larger for lower FA values and was greatest for the isotropic tensor. Table 3 shows the results of linear regression on the fit of the estimated f-value with increasing simulated f-value.
Figure 5.
Estimation performance shown as mean ± standard deviation from simulations using the two-shell acquisition shown vs. increasing simulated f-value. The top image shows estimated FA for various tensor shapes. The points on the right indicate the true simulated FA values. The bottom row shows estimated f-value vs. increasing simulated f-value for the two cases used for optimization (red: FA= 0.71 and green: FA = 0.00).
Table 3.
Linear regression on the fit of f-value for the tensors with varying FA.
| FA | slope | intercept | R2 |
|---|---|---|---|
| 0.00 | 0.9927 | 0.0073 | 0.9986 |
| 0.11 | 0.9932 | 0.0069 | 0.9986 |
| 0.21 | 0.9927 | 0.0071 | 0.9986 |
| 0.30 | 0.9933 | 0.0067 | 0.9986 |
| 0.71 | 0.9966 | 0.0042 | 0.9998 |
Figure 6 contains the top two plots from Figure 4 with the addition of the MSE curve derived from the LM fitting procedure. It is clearly visible that using the WLLS initialization of the modified Newton’s method fitting has a much greater effect on estimation accuracy than the acquisition used. Indeed, the f-value MSE for the two-shell modified Newton’s method is an order of magnitude less for simulated f-values greater than 0.3.
Figure 6.

Relative performance of a Levenberg-Marquardt optimization vs. the modified Newton’s method approaches. The black line denotes the MSE of the LM implementation for estimation of the f-value (left) and FA (right) using the two shell ‘optimal’ acquisition. The MSE curves corresponding to Figure 4 are also presented with the same color coding.
Figure 7 shows several images from the same axial slice in a healthy volunteer. A T1-weighted anatomical image is provided for reference along with FA and MD maps from standard DTI and FWE. Additionally, a free water fraction map and FA and MD difference map are included. Here the periventricular gray matter is seen as having an artificially large FA. This result echoes the simulation findings that a high f-value can lead to inflated FA values. Voxels which fit a large isotropic volume fraction (f > 0.95) and a tensor ADC > 3 × 10−3 mm2/s after nonlinear fitting were masked out of all maps with the exception of the free water fraction map. These voxels were found to be limited to the ventricles.
Figure 7.
An axial slice from a normal volunteer is used to illustrate the different measurements of the FWE DTI technique as compared to conventional DTI. Fractional anisotropy images from DTI and FWE are shown in (a) and (b), respectively. The difference image (c) between FA from FWE and DTI shows the pattern of increasing FA with FWE. The MD from DTI and FWE are shown in (e) and (f). The difference image between MD from DTI and FWE (g) shows reduced MD in the FWE tensor compartment. A T1w image is included for anatomical reference (d). The isotropic volume fraction (h) is shown as well. The arrows highlight artificially elevated FA in periventricular grey matter.
A joint histogram of FA and MD across the whole brain, as well as WM and GM segmented regions, illustrates the removal of CSF partial volume effects (Figure 8). The MD is not only reduced, but the distribution becomes narrower, losing its tail of high diffusivity, low FA voxels as evidenced by the reduction in standard deviation in MD for both WM and GM, Table 4. The reduction of voxels with CSF partial volume averaging is shown in a sagittal image in Figure 9. Here voxels with MD intermediate to parenchyma and CSF are considered as partial volume voxels.
Figure 8.

Joint histogram of FA and MD for FWE (top row) and DTI (bottom row). Maps are shown for the whole brain (left column), GM (middle column), and WM (right column). Partial volumed voxels are identified as high MD (>0.9 × 10−3 mm2/s) with low FA (< 0.2), creating a “tail” in the DTI histograms which is not present in the FWE histograms.
Table 4.
Global metrics in WM and GM from a single healthy volunteer.
| | DTI | FWE |
|---|---|---|
| WM | | |
| FA | 0.35 ± 0.19 | 0.41 ± 0.21 |
| MD (×10−4 mm2/s) | 6.84 ± 2.47 | 5.63 ± 1.77 |
| f | ~ | 0.17 ± 0.14 |
| GM | | |
| FA | 0.15 ± 0.13 | 0.19 ± 0.16 |
| MD (×10−4 mm2/s) | 7.35 ± 4.18 | 5.28 ± 2.82 |
| f | ~ | 0.24 ± 0.23 |
Figure 9.

The background image is an FA map from a healthy volunteer approximately 7 mm from the mid-sagittal plane. The red/yellow voxels are those with a mean diffusivity between 0.9 × 10−3 and 2 × 10−3 mm2/s. These values are intermediate between WM (0.8 × 10−3 mm2/s) and CSF (3 × 10−3 mm2/s) and clearly outline the interface between the CSF-filled ventricles and the corpus callosum and fornix. The left overlay is from the DTI MD map while the right overlay is from the FWE tissue MD.
The use of the wrong model, either the standard DTI model when partial voluming is present or the proposed FWE model with no partial voluming, introduces some bias in the FA estimates. Taking a prolate tensor (FA = 0.712) as an example, if a voxel contains a free water fraction of 0.2, then the FA would be underestimated by ≈ 0.1 (see Table 5). As the free water component increases, so would the amount of underestimation. If there is no fast diffusion component (i.e., f = 0), then the FWE model should yield the same results as DTI. However, in the presence of noise, the model appears to fit a small fast diffusing component to the noise, thus introducing some small bias. For the prolate tensor with f = 0, the FWE model overestimates FA by < 0.009 (see Table 6). Although this induced bias is undesirable, it is much smaller than the partial volume error of DTI.
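The MD window used for the Figure 9 overlay amounts to a simple threshold test, sketched below. The bounds are the ones quoted in the caption; treating them as strict inequalities, and the example MD values, are assumptions made only for illustration.

```python
MD_LOW = 0.9e-3   # mm^2/s, lower bound quoted in the caption
MD_HIGH = 2.0e-3  # mm^2/s, upper bound quoted in the caption


def is_partial_volume(md):
    """True when MD (mm^2/s) lies between the parenchyma and CSF values."""
    return MD_LOW < md < MD_HIGH


def count_partial_volume(md_values):
    """Count the voxels that fall inside the intermediate MD window."""
    return sum(1 for md in md_values if is_partial_volume(md))


# Hypothetical MD values (mm^2/s) for the same four voxels under each model.
md_dti = [0.8e-3, 1.2e-3, 1.6e-3, 3.0e-3]
md_fwe = [0.7e-3, 0.8e-3, 0.85e-3, 3.0e-3]
n_dti = count_partial_volume(md_dti)
n_fwe = count_partial_volume(md_fwe)
```

With these made-up values, two voxels are flagged for DTI and none for the FWE tissue MD, which mirrors the qualitative difference between the left and right overlays.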
Table 5.
Using a standard DTI model in the case of a volume fraction of 0.2 results in a bias in fitting FA. In this case the standard DTI fit underestimates the true tissue FA which was 0.712.
| FA Bias for a Single Tensor Fit | |||||
|---|---|---|---|---|---|
| SNR | 20 | 30 | 40 | 50 | 60 |
| Bias | −0.106 | −0.102 | −0.100 | −0.100 | −0.099 |
Table 6.
Using a FWE model in the case of a volume fraction of 0 results in a bias in fitting FA. In this case the FWE fit overestimates the true tissue FA.
| FA Bias for an FWE Fit | |||||
|---|---|---|---|---|---|
| SNR | 20 | 30 | 40 | 50 | 60 |
| Bias | 8.7×10−3 | 6.3×10−3 | 4.8×10−3 | 3.6×10−3 | 2.3×10−3 |
Tables 5 and 6 display the results of the mismatch between the assumed model and the underlying simulated model. Table 5 presents the underestimation that occurs when using a single tensor DTI fit when there is actually a fast diffusing component of 20% (i.e., f = 0.2). The resulting FA is biased to a lower than true value even in the case of infinite SNR.
Table 6 shows the bias induced by using the FWE model when the underlying signal conforms to the single tensor DTI model (f = 0). The model wrongly fits a small fast diffusing component, resulting in an overestimate of FA. This is due to a fitting of the noise. As would be expected, the bias decreases as SNR increases.
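The direction of the Table 5 bias can be illustrated with a small noise-free calculation. The sketch below is not the fitting procedure used in this work: it assumes hypothetical prolate eigenvalues (giving FA ≈ 0.707 rather than 0.712), a single b-value of 1000 s/mm2, and it approximates the single-tensor fit by the apparent diffusivity measured along each principal axis of the tissue tensor.

```python
import math

D_ISO = 3.0e-3  # free water diffusivity in mm^2/s


def fractional_anisotropy(l1, l2, l3):
    """Standard FA formula from three eigenvalues."""
    mean = (l1 + l2 + l3) / 3.0
    num = (l1 - mean) ** 2 + (l2 - mean) ** 2 + (l3 - mean) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    return math.sqrt(1.5 * num / den)


def apparent_diffusivity(lam, f, b):
    """Mono-exponential ADC of a tissue + free water mixture along one axis."""
    signal = f * math.exp(-b * D_ISO) + (1.0 - f) * math.exp(-b * lam)
    return -math.log(signal) / b


# Assumed values for illustration only.
true_eigs = (1.4e-3, 0.35e-3, 0.35e-3)  # mm^2/s, hypothetical prolate tensor
b_value = 1000.0                         # s/mm^2
f_water = 0.2                            # free water fraction

fa_true = fractional_anisotropy(*true_eigs)
mixed_eigs = [apparent_diffusivity(l, f_water, b_value) for l in true_eigs]
fa_mixed = fractional_anisotropy(*mixed_eigs)
bias = fa_mixed - fa_true  # negative: free water lowers the apparent FA
```

Because no noise is added, the negative bias in this sketch comes entirely from the model mismatch, which is consistent with the observation that the single-tensor underestimation persists at infinite SNR.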
4. Discussion
In this study, we introduced a simple technique using a two-compartment diffusion model and two-shell DTI acquisition, which resolves partial volume effects due to CSF contamination. We optimized the acquisition parameters to minimize the estimation errors in FWE DTI. Simulations showed that a two-shell design was superior to designs with a greater number of shells.
Fitting diffusion data to a biexponential function is a notoriously difficult and ill-conditioned problem (Mulkern et al., 2009). We feel it is essential to stress the importance of the “ancillary” points regarding the ways that reinitialization in the search was carried out, as well as the modified Newton’s method and the setting of the damping parameter. These finer points resulted in a reduction in wMSE of more than an order of magnitude over the case without reinitialization and using a basic Levenberg-Marquardt implementation. Without the modifications presented here, the model is somewhat prone to poor performance, especially with a large f-value. It is also possible that derivative-free approaches such as BOBYQA (Powell, 2009; Scherrer and Warfield, 2012) may yield even greater speed and improved convergence properties. Additionally, the authors did not investigate any different initialization techniques such as that proposed by Pasternak and colleagues (Pasternak et al., 2012a). It is entirely possible that some other method will yield equal or better results. Though not included, a comparison of optimization techniques would make for a worthwhile and interesting extension of this work. In addition to alternative nonlinear methods, an investigation of differing WLLS methods such as iterative reweighting would be informative with potential speed and accuracy improvements.
The estimation of tensors with a substantially high f-value leads to overestimated FA values. The lower the anisotropy, the more pronounced the effect was. This matches the observation from DTI in the presence of low SNR (Pierpaoli and Basser, 1996). The low anisotropy scenario means that the proposed method contains extra parameters that are unnecessary, hence the estimation of these parameters may vary wildly. For this situation the Hessian matrix may not be well conditioned. Without some modification, periventricular grey matter will be most plagued by this error, and thus the FA in this region cannot be reliably determined. However, for high FA regions such as coherent white matter it is expected that a higher f-value (≈ 0.8) may be tolerated with reliable performance. This stands as a clear limitation of the presented method. Though spatial regularization was intentionally avoided, it may be that some spatial constraints would reduce this limitation. Nothing regarding the current implementation would preclude the inclusion of spatial constraints. It is also possible to mitigate most errors by dismissing all diffusion metrics from voxels with an f-value above a reasonable threshold such as 0.75.
Any voxel that contains some free water portion must necessarily have a lower signal arising from the tissue compartment. As such, the FWE tensor SNR will be lower than when fitting a single diffusion tensor to the recorded signal. This effect, in turn, emphasizes the known directional dependence of variance in diffusion imaging (Koay and Basser, 2006; Koay et al., 2006). We believe it is this reduced tissue compartment SNR that causes the uncertainty and bias when the isotropic component is large. It is expected that any method that attempts to resolve CSF PVE in a voxel will similarly be plagued by low apparent SNR. This would be expected even in experiments that fit a single diffusion tensor but alter the acquisition to minimize the CSF signal, such as fluid attenuation inversion recovery (Concha et al., 2005) and the recent method suggested by Baron and Beaulieu (Baron and Beaulieu, 2014).
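The scaling behind this argument can be written out explicitly. The notation below is assumed for illustration (S_0 for the non-diffusion-weighted signal, σ for the noise standard deviation, g for the unit encoding direction, D for the tissue tensor, and D_iso for the free water diffusivity) and follows the usual form of the two-compartment model.

```latex
% Two-compartment signal: free water fraction f plus tissue fraction (1 - f)
S(b,\mathbf{g}) = S_0\left[ f\,e^{-b D_{\mathrm{iso}}}
                + (1-f)\,e^{-b\,\mathbf{g}^{T}\mathbf{D}\,\mathbf{g}} \right]

% Signal-to-noise ratio of the tissue term alone
\mathrm{SNR}_{\mathrm{tissue}}(b,\mathbf{g})
  = \frac{(1-f)\,S_0\,e^{-b\,\mathbf{g}^{T}\mathbf{D}\,\mathbf{g}}}{\sigma}
  = (1-f)\,\mathrm{SNR}_0\,e^{-b\,\mathbf{g}^{T}\mathbf{D}\,\mathbf{g}},
  \qquad \mathrm{SNR}_0 = \frac{S_0}{\sigma}
```

As a rough worked example under this notation, a voxel acquired at SNR_0 = 40 with f = 0.75 leaves the tissue compartment with an effective SNR of only (1 − 0.75) × 40 = 10 before any diffusion weighting is applied.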
The difference in estimation accuracy for two-shell acquisition design as compared to any of the other designs was statistically significant with p < 0.05 via t-test. Even so, thanks to the algorithm optimization, any of the acquisitions with the “optimal” b-values would likely perform sufficiently well in vivo. A detailed analysis of directional distribution was carried out for the two-shell design only. It is possible that more complicated arrangements of directions for the other cases may lead to increased estimation accuracy. Due to its performance and relative simplicity, we recommend the two-shell acquisition for future work.
The FWE DTI model is based on the assumption that the fast diffusion compartment behaves as free water in both magnitude (D = 3 × 10−3 mm2/s) and mode (isotropic) of diffusion. This is a reasonable approximation in regions with CSF and is the basis for the previous studies that apply FWE DTI (Metzler-Baddeley et al., 2012; Pasternak et al., 2012a, 2012b, 2009). Note that if the fast diffusion component deviates significantly from the assumed value, the acquisition parameters may be suboptimal and the estimated measures from the model may be biased.
The in vivo results with the optimized scan parameters in this study yielded a mean FWE volume fraction of 0.17 ± 0.14 in white matter and 0.24 ± 0.23 in grey matter, which are consistent with previously reported values using FWE DTI (Pasternak et al., 2012a). These volume fractions are similar to the mean extracellular volume of ≈ 0.2 in brain parenchyma from non-MRI diffusion measurements of tetra-methylammonium (Nicholson and Syková, 1998). Though these results are similar, the physical interpretation of the isotropic component within the parenchyma away from CSF interfaces may be debated as being inconsistent with the expected properties of the extracellular space (Beaulieu, 2002 and references therein).
Further complicating the interpretation of the isotropic component as being indicative of the extracellular compartment is the presence of Gibbs ringing in the f map (Figure 7h). This artifact is also present in the b = 0 image due to the high contrast transition from CSF to parenchyma. The high spatial frequency of the edge at CSF and brain borders causes the ringing. This artifact is then propagated in the tensor calculation and is visible in the standard MD image (Figure 7e). With the FWE model, this ringing appears substantially reduced in the tissue compartment MD maps (Figure 7f). This artifact reduction should allow the tensor to more faithfully reflect diffusion properties of the tissue; however, it also shows that the f-value may be affected by factors beyond the extracellular volume.
Even with the limitations noted, the free water elimination model may be implemented in a simple way with a scan that is less than six and a half minutes. With a manageable scan time and additional information relative to DTI, we believe that the FWE scheme has significant potential for immediate, impactful clinical use. While more sophisticated models such as CHARMED, DBSI, and NODDI provide attractive models, they are not currently configured for routine clinical use (Assaf et al., 2004a; Wang et al., 2011b; Zhang et al., 2012).
The joint histograms of WM and GM, Figure 8, show a great reduction in partial-volume averaged voxels with MD > 0.9 × 10−3 mm2/s and FA < 0.2. Instead, these voxels are corrected to be more in line with the majority of voxels in the parenchyma. It appears that partial volume effects with CSF have successfully been corrected resulting in tissue specific diffusion measures. As a means of correcting for partial volume effects, the free water elimination model has the potential for a wide variety of applications. Any investigation into diffusion properties of tissue bordering CSF could benefit from adopting the FWE model. For example, this method may provide a means to differentiate widespread edema from infiltrating tumor by revealing tissue characteristics masked by the presence of edema. Further validation is required to truly interpret the isotropic volume fraction.
5. Conclusions
In this paper, we have described a bi-tensor free water elimination model that is useful for the removal of the confounding effects of free water and other sources of fast diffusion in the brain. These effects appear at CSF interfaces due to partial volume effects and more distally due to Gibbs ringing artifacts. The method is limited in the case of high CSF contamination and low underlying anisotropy. Otherwise, it results in DTI measures that are better representations of underlying tissue microstructure, as well as the additional contrast mechanism of the free water map. This work offers a straightforward implementation of the FWE model along with recommended acquisition parameters. The FWE model allows for a clinically acceptable acquisition time and reconstruction. For the conditions investigated, the best acquisition featured two b-value shells with an equal distribution of directions per shell. The recommended b-values are 500 and 1500 s/mm2 for each shell.
Highlights.
We implemented a diffusion model that resolves partial volume effects with CSF.
This model results in a smaller bias than DTI and better fits the measured signal.
The scan is less than seven minutes, shorter than other advanced diffusion models.
An optimized acquisition was found through Monte Carlo simulations.
Acknowledgments
The views expressed in this article are those of the authors and do not necessarily reflect the official policy or position of the Department of the Navy, Department of Defense, or the U.S. Government. This work was partially supported by a graduate fellowship through the Department of the Navy (to ARH) and NIH grants NIA 037639, NIA 043125, NIMH 10031, NIMH 097464, and NICHD 003352.
Appendix
Presented here is the algorithm employed for the nonlinear fitting. It is broken up into the steps that took place prior to the first iterations, and the body of the algorithm, which is iterated until convergence. While values are given for the Levenberg-Marquardt parameters, these may not be generally applicable. The proper values are dependent on your specific data set and will need adapting to achieve efficient and robust algorithm performance.
Before the first iteration:
1. If (MDtensor > 0.0015 mm2/s) {
f = 0.5
tensor = tensor/2
}
2. Calculate F(γk)
3. Calculate SNRp
4. If (SNRp < 20) {
λ = 1*10^8
λinc = 1.1
} Elseif (SNRp < 30) {
λ = 1*10^7
λinc = 2
} Else {
λ = 1*10^7
λinc = 5
}
The different models correspond to the special cases that occur when f = 0 or f = 1. In these cases the Hessian must be modified; for example, when f = 0, the Hessian is the same as if the DTI model were in use.
At the kth iteration
1. If (k>kmax or flag==false) exit
2. Solve (H(γk)+λI)δk = -▽F(γk) for δk
3. If (F(γk + δk) < F(γk)) {
If (|F(γk + δk) - F(γk)| < ε2 and ){
flag = false
} Else {
λ = λ/λinc
Accept δk by setting γk+1 = γk + δk
}
} Else {
λ = λ*λinc
Reject δk by setting γk+1 = γk
}
Where ε1, ε2, and ε3 are small positive numbers and kmax is the maximum number of iterations.
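The damped iteration above can be sketched as runnable Python on a deliberately simplified problem. This is a minimal illustration, not the implementation used in this work: it fits only two parameters (the free water fraction f and a scalar tissue diffusivity d) to noise-free data, uses the Gauss-Newton approximation of the Hessian, works in rescaled units (b in ms/µm2, diffusivity in µm2/ms) so that a starting λ of 1 is reasonable, omits the SNR-dependent choice of λ and λinc, and applies a single convergence test because the second condition in the listing is not given. All function and variable names are hypothetical.

```python
import math

D_ISO = 3.0  # free water diffusivity in um^2/ms (equals 3e-3 mm^2/s)


def cost_grad_hess(params, bvals, signals):
    """Return F, its gradient, and the Gauss-Newton Hessian J^T J."""
    f, d = params
    cost, grad = 0.0, [0.0, 0.0]
    hess = [[0.0, 0.0], [0.0, 0.0]]
    for b, s in zip(bvals, signals):
        e_iso, e_tis = math.exp(-b * D_ISO), math.exp(-b * d)
        resid = f * e_iso + (1.0 - f) * e_tis - s
        jac = (e_iso - e_tis, -b * (1.0 - f) * e_tis)  # d/df, d/dd of model
        cost += 0.5 * resid * resid
        for i in range(2):
            grad[i] += jac[i] * resid
            for j in range(2):
                hess[i][j] += jac[i] * jac[j]
    return cost, grad, hess


def fit(bvals, signals, start=(0.5, 0.5), lam=1.0, lam_inc=5.0,
        eps=1e-14, k_max=200):
    """Damped Newton-type search with accept/reject updates of lambda."""
    params = list(start)
    cost, grad, hess = cost_grad_hess(params, bvals, signals)
    for _ in range(k_max):
        # Solve (H + lam*I) delta = -grad; 2x2 case via Cramer's rule.
        a11, a12 = hess[0][0] + lam, hess[0][1]
        a21, a22 = hess[1][0], hess[1][1] + lam
        det = a11 * a22 - a12 * a21
        delta = [(-grad[0] * a22 + grad[1] * a12) / det,
                 (grad[0] * a21 - grad[1] * a11) / det]
        trial = [params[0] + delta[0], params[1] + delta[1]]
        t_cost, t_grad, t_hess = cost_grad_hess(trial, bvals, signals)
        if t_cost < cost:
            converged = cost - t_cost < eps
            # Accept the step (also on the final, converged iteration).
            params, cost, grad, hess = trial, t_cost, t_grad, t_hess
            if converged:
                break
            lam /= lam_inc  # success: relax the damping
        else:
            lam *= lam_inc  # failure: reject the step, increase the damping
    return params


# Noise-free synthetic data at b = 0, 500 and 1500 s/mm^2 (rescaled units).
b_shells = [0.0, 0.5, 1.5]
true_f, true_d = 0.2, 0.7
signals = [true_f * math.exp(-b * D_ISO) + (1.0 - true_f) * math.exp(-b * true_d)
           for b in b_shells]
f_hat, d_hat = fit(b_shells, signals)
```

In this toy setting the search recovers the simulated f and d; the appropriate starting λ and λinc for real data depend on the data set, as noted at the start of the Appendix.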
References
- Alexander AL, Hasan KM, Lazar M, Tsuruda JS, Parker DL. Analysis of Partial Volume Effects in Diffusion-Tensor MRI. 2001;780:770–780. doi: 10.1002/mrm.1105.
- Alexander AL, Lee JE, Lazar M, Field AS. Diffusion Tensor Imaging of the Brain. 2007;4:316–329. doi: 10.1016/j.nurt.2007.05.011.
- Assaf Y, Freidlin RZ, Rohde GK, Basser PJ. New modeling and experimental framework to characterize hindered and restricted water diffusion in brain white matter. Magn Reson Med. 2004a;52:965–978. doi: 10.1002/mrm.20274.
- Assaf Y, Freidlin RZ, Rohde GK, Basser PJ. New modeling and experimental framework to characterize hindered and restricted water diffusion in brain white matter. Magn Reson Med. 2004b;52:965–978. doi: 10.1002/mrm.20274.
- Baron CA, Beaulieu C. Acquisition strategy to reduce cerebrospinal fluid partial volume effects for improved DTI tractography. Magn Reson Med. 2014;00:1–10. doi: 10.1002/mrm.25226.
- Basser PJ, Mattiello J, LeBihan D. MR diffusion tensor spectroscopy and imaging. Biophys J. 1994;66:259–67. doi: 10.1016/S0006-3495(94)80775-1.
- Beaulieu C. The basis of anisotropic water diffusion in the nervous system - a technical review. NMR Biomed. 2002;15:435–55. doi: 10.1002/nbm.782.
- Concha L, Gross DW, Beaulieu C. Diffusion tensor tractography of the limbic system. AJNR Am J Neuroradiol. 2005;26:2267–74.
- Koay CG, Basser PJ. Analytically exact correction scheme for signal extraction from noisy magnitude MR signals. J Magn Reson. 2006;179:317–22. doi: 10.1016/j.jmr.2006.01.016.
- Koay CG, Chang LC, Carew JD, Pierpaoli C, Basser PJ. A unifying theoretical and algorithmic framework for least squares methods of estimation in diffusion tensor imaging. J Magn Reson. 2006;182:115–25. doi: 10.1016/j.jmr.2006.06.020.
- Koay CG, Ozarslan E, Johnson KM, Meyerand ME. Sparse and optimal acquisition design for diffusion MRI and beyond. Med Phys. 2012;39:2499–511. doi: 10.1118/1.3700166.
- Metzler-Baddeley C, O’Sullivan MJ, Bells S, Pasternak O, Jones DK. How and how not to correct for CSF-contamination in diffusion MRI. Neuroimage. 2012;59:1394–403. doi: 10.1016/j.neuroimage.2011.08.043.
- Montgomery DC, Runger GC, Hubele NF. Engineering Statistics, Statistics. 1987.
- Mulkern R, Haker S, Maier S. On high b diffusion imaging in the human brain: ruminations and experimental insights. Magn Reson Imaging. 2009;27:1151–1162. doi: 10.1016/j.mri.2009.05.003.
- Nicholson C, Syková E. Extracellular space structure revealed by diffusion analysis. Trends Neurosci. 1998;21:207–15. doi: 10.1016/s0166-2236(98)01261-2.
- Pasternak O, Shenton ME, Westin CF. Estimation of extracellular volume from regularized multi-shell diffusion MRI. Med Image Comput Comput Assist Interv. 2012a;15:305–12. doi: 10.1007/978-3-642-33418-4_38.
- Pasternak O, Sochen N, Gur Y, Intrator N, Assaf Y. Free water elimination and mapping from diffusion MRI. Magn Reson Med. 2009;62:717–30. doi: 10.1002/mrm.22055.
- Pasternak O, Westin CF, Bouix S, Seidman LJ, Goldstein JM, Woo TUW, Petryshen TL, Mesholam-Gately RI, McCarley RW, Kikinis R, Shenton ME, Kubicki M. Excessive extracellular volume reveals a neurodegenerative pattern in schizophrenia onset. J Neurosci. 2012b;32:17365–72. doi: 10.1523/JNEUROSCI.2904-12.2012.
- Pierpaoli C, Basser P. Toward a quantitative assessment of diffusion anisotropy. Magn Reson Med. 1996. doi: 10.1002/mrm.1910360612.
- Pierpaoli C, Jones DK. Removing CSF Contamination in Brain DT-MRIs by Using a Two-Compartment Tensor Model. International Society for Magnetic Resonance in Medicine Meeting. 2004:1215.
- Powell M. Cambridge NA Report NA2009/06, University of …. 2009. The BOBYQA algorithm for bound constrained optimization without derivatives.
- Scherrer B, Warfield SK. Parametric representation of multiple white matter fascicles from cube and sphere diffusion MRI. PLoS One. 2012;7:e48232. doi: 10.1371/journal.pone.0048232.
- Sener RN. Diffusion MRI: apparent diffusion coefficient (ADC) values in the normal brain and a classification of brain disorders based on ADC values. Comput Med Imaging Graph. 2001;25:299–326. doi: 10.1016/s0895-6111(00)00083-5.
- Stanisz GJ, Odrobina EE, Pun J, Escaravage M, Graham SJ, Bronskill MJ, Henkelman RM. T1, T2 relaxation and magnetization transfer in tissue at 3T. Magn Reson Med. 2005;54:507–12. doi: 10.1002/mrm.20605.
- Turner R, Le Bihan D, Maier J, Vavrek R, Hedges LK, Pekar J. Echo-planar imaging of intravoxel incoherent motion. Radiology. 1990;177:407–414. doi: 10.1148/radiology.177.2.2217777.
- Wang Y, Wang Q, Haldar JP, Yeh FC, Xie M, Sun P, Tu TW, Trinkaus K, Klein RS, Cross AH, Song SK. Quantification of increased cellularity during inflammatory demyelination. Brain. 2011a;134:3590–601. doi: 10.1093/brain/awr307.
- Wang Y, Wang Q, Haldar JP, Yeh FC, Xie M, Sun P, Tu TW, Trinkaus K, Klein RS, Cross AH, Song SK. Quantification of increased cellularity during inflammatory demyelination. Brain. 2011b;134:3590–601. doi: 10.1093/brain/awr307.
- Zhang H, Schneider T, Wheeler-Kingshott CA, Alexander DC. NODDI: practical in vivo neurite orientation dispersion and density imaging of the human brain. Neuroimage. 2012;61:1000–16. doi: 10.1016/j.neuroimage.2012.03.072.



