Abstract
In high-dimensional magnetic resonance imaging applications, time-consuming, sequential acquisition of data samples in the spatial frequency domain (k-space) can often be accelerated by accounting for dependencies along imaging dimensions other than space in linear reconstruction, at the cost of noise amplification that depends on the sampling pattern. Examples are support-constrained, parallel, and dynamic MRI, and k-space sampling strategies are primarily driven by image-domain metrics that are expensive to compute for arbitrary sampling patterns. It remains challenging to provide systematic and computationally efficient automatic designs of arbitrary multidimensional Cartesian sampling patterns that mitigate noise amplification, given the subspace to which the object is confined. To address this problem, this work introduces a theoretical framework that describes local geometric properties of the sampling pattern and relates these properties to a measure of the spread in the eigenvalues of the information matrix described by its first two spectral moments. This new criterion is then used for very efficient optimization of complex multidimensional sampling patterns that does not require reconstructing images or explicitly mapping noise amplification. Experiments with in vivo data show strong agreement between this criterion and traditional, comprehensive image-domain- and k-space-based metrics, indicating the potential of the approach for computationally efficient (on-the-fly), automatic, and adaptive design of sampling patterns.
Index Terms: k-space sampling, image reconstruction, and parallel MRI
I. Introduction
Magnetic resonance imaging (MRI) scanners acquire data samples in the spatial frequency domain (k-space) of an object. In Cartesian MRI, the Nyquist sampling theorem dictates that the sample spacing and extent should correspond to the field-of-view and resolution of the acquisition, necessitating time-consuming sequential scanning of k-space. In higher-dimensional MRI applications, data is acquired along additional dimensions beyond space, such as time, echoes, or receive channels. To enable acceleration relative to the nominal rate, a signal model is often incorporated in the reconstruction to account for dependencies along dimensions other than space. In many cases, the signal model is defined per voxel and consists of a predefined subspace model. Assuming Gaussian noise, an unbiased estimate is then a pseudoinverse, for which errors can be characterized by a nonuniform noise level. Noise amplification due to this linear inversion depends only on the k-space sample distribution and reconstruction subspace but not the underlying object.
Many well-known linear reconstructions confine the signal in each voxel to a predefined subspace. A simple example is support-constrained imaging, which confines voxels outside of the support to the zero-dimensional subspace. The problem has been studied in the literature on nonuniform sampling and reconstruction of multiband signals [1]–[5], with general theory developed by Landau [6]. Some specialized approaches in MRI consider support regions for which an optimal sampling pattern can be selected [7], [8]. A second example is parallel MRI, which uses multiple RF receive channels and relies on dependencies between channels to accelerate. In Sensitivity Encoding (SENSE) [9], the profile in each voxel across channels is restricted to a one-dimensional subspace spanned by the channel sensitivities and is scaled by the underlying magnetization. Further consideration has to be made when sampling can be varied along additional dimensions, which we index here by t. One example is dynamic MRI, where signal-intensity-time curves in each voxel are represented by a low-dimensional subspace [10]. Other examples arise in parametric mapping [11]–[14], and Dixon fat and water MRI [15], to name a few.
Many conventional MRI acceleration methods use uniform sampling patterns, but the same inversions can be applied to nonuniformly sampled k-t-space by using iterative reconstruction. Nonuniform sampling allows acceleration factors to be fine-tuned and can be desirable for non-Cartesian MRI, contrast enhancement, motion, or regularized reconstruction.
Although a complete characterization of reconstruction errors is possible with the use of an image-domain geometry (g)-factor metric, g-factor is computationally expensive to estimate for arbitrary sampling patterns using pseudo-multiple-replica-type methods [16], [17] and does not directly explain the sources of noise amplification in terms of correlations in k-t-space. To address the latter problem, the inverse problem can be posed as approximation in a reproducing kernel Hilbert space, which provides a characterization in k-space [18], but it still demands a computationally expensive procedure to derive error bounds. To our knowledge, a formalism that provides similar insights and is amenable to the computationally efficient generation of arbitrary k-t-space sampling patterns is still unavailable.
One commonly used nonuniform sampling strategy is Poisson-disc sampling, which guarantees that no two samples are closer than a specified distance while retaining the randomization motivated by compressed sensing [19]. Poisson-disc sampling has empirically performed well in parallel imaging, even without regularization, and has been used in hundreds of publications to date ([20]–[23], to name a few). It has been motivated by the use of a synthesizing kernel in k-space and the “fit” between the point-spread function (PSF) and the coil array geometry [20]. However, criteria for this coil-PSF matching have not been formally described.
Some optimality criteria have been utilized for sampling design and attempt to minimize a spectral moment, either the trace of the inverse [24] or the maximum reciprocal eigenvalue [25] of the information matrix. Both have prohibitive computational complexity for large problems (e.g., 3D Cartesian MRI at moderate resolution) because they probe a correspondingly large, unstructured inverse matrix. Related methods approximate the inverse matrix and exploit the local nature of reconstruction in k-space [26] and have been applied to the design of radial sampling patterns [27].
In this work, we use a moment-based spectral analysis to describe a new criterion for linear reconstruction that avoids inverting the information matrix and instead minimizes the variance in the distribution of its eigenvalues. Despite the use of eigenvalues in the criterion, this approach still provides simplified and interpretable theoretical insights into noise amplification in k-t-space. Local correlations in k-space are summarized by a weighting function derived only from system parameters. The new optimality criterion then specifies a corresponding cross-correlation or differential distribution of sampling patterns.
The new analytical relationships described in this paper are developed into fast algorithms for generating adaptive sampling that may be arbitrary or structured. Since these algorithms do not explicitly map noise amplification or reconstruct images, they have extremely short run times, often sufficient for real-time interactive applications. We make our implementation available to the community.1 Numerical experiments are performed to evaluate noise amplification for the derived sampling patterns. This is critical for constrained reconstruction, where conditioning is a major consideration in selecting sampling patterns.
II. Theory
A. Differential sampling domain
For notational simplicity, consider sampling on a finite one-dimensional Cartesian k-space grid of size N. Let kt,n, n = 1, . . . , Nt, denote the sample locations of the tth of T sampling patterns. In the subsequent notation, t and t′ arguments are omitted in cases where T = 1. Define the tth sampling pattern as the function
| st(k) = Σn δ(k − kt,n) | (1) |
where δ is the discrete Kronecker delta function, extended with period N. Define the point-spread function (PSF) for the sampling pattern as
| PSFt(r) = ℱ{st}(r) | (2) |
where ℱ is a discrete Fourier transform.
Next, define a distribution of inter-sample differences, or differential distribution, of the sample locations, which is a cross-correlation of the sampling patterns:
| p(Δk, t, t′) = Σn,m δ(Δk − (kt′,m − kt,n)) | (3) |
| p(Δk, t, t′) = Σk st(k) st′(k + Δk) | (4) |
A similar continuous-domain differential distribution was previously introduced in computer graphics [28]–[30] to generate a single sampling pattern with a user-defined power spectrum.
From (4) and (2), the correlation property of the Fourier transform relates the differential distribution to the product of PSFs (the squared magnitude of the PSF in the single-time-point case):
| ℱ{p(·, t, t′)} = PSFt* PSFt′ | (5) |
Fig. 1 shows this relationship in an example.
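The equivalence between the pairwise-difference definition and the Fourier route can be checked numerically. The sketch below is a minimal NumPy example for T = 1; the 1/N-normalized inverse DFT used for the PSF is our assumed convention, not fixed by the text above.

```python
import numpy as np

N = 64
rng = np.random.default_rng(0)
s = (rng.random(N) < 0.3).astype(float)      # binary sampling pattern s(k)

# Differential distribution from pairwise differences (periodic in N)
p_pairs = np.zeros(N)
for ka in np.flatnonzero(s):
    for kb in np.flatnonzero(s):
        p_pairs[(kb - ka) % N] += 1

# Same quantity via the Fourier route: p from the squared PSF magnitude
psf = np.fft.ifft(s)                          # PSF, 1/N-normalized
p_fft = np.real(np.fft.fft(np.abs(psf) ** 2)) * N

assert np.allclose(p_pairs, p_fft)
assert p_pairs[0] == s.sum()                  # the Δk = 0 bin counts the samples
```

For T > 1, the same construction applies per pattern pair, with the product of two PSFs replacing the squared magnitude.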
Fig. 1.
The differential distribution is a distribution of differences in sample locations, or equivalently, a cross-correlation of sampling patterns. In image space, this corresponds to a product of point-spread functions. The differential distributions show that the sampling patterns avoid acquiring nearby samples in k-t space, corresponding to low values of p(Δk, t, t′) where Δk ≈ 0 and t ≈ t′. The central peak has been removed from the point-spread functions and their products.
The differential distribution has natural properties due to the Fourier transform relationship in (5). For example, knowing the differential distribution in a region of size K near Δk = 0 summarizes the local statistics of the sampling pattern and can be used to determine products of PSFs at a resolution proportional to 1/K. Thus, to engineer a PSF with high frequency content, one has to consider large neighborhoods of k-space. The dual property is that smoothing the differential distribution is equivalent to windowing the PSF.
Examples of differential distributions are shown in Fig. 2. It is natural that uniform sampling has a uniform differential distribution, since shifting the sampling pattern by the reduction factor produces the same pattern, while all other shifts produce a complementary pattern. Uniform random (white noise) sampling has a differential distribution with a constant mean where Δk ≠ 0, since all pairwise differences are independent and uniform over the space. Poisson-disc sampling imposes a minimum distance between samples, or equivalently, requires that its differential distribution satisfy
Fig. 2.
Examples of point-spread functions and differential distributions are shown for uniform random sampling, isotropic and anisotropic Poisson-disc sampling, and separable 2 × 2 uniform sampling. Peaks in the point-spread-functions are highlighted with yellow circles, and the central peak has been scaled for visualization.
| p(Δk, t, t) = 0 for 0 < |Δk| < Δkr,max | (6) |
for some minimum distance parameter Δkr,max. Based on (5), the lack of a transition band at Δkr,max explains the oscillations seen in the PSF for Poisson-disc sampling in Fig. 2.
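A minimal dart-throwing sketch illustrates the constraint. This is not a full Poisson-disc generator (the grid discretization and candidate ordering are simplifications), but it produces a pattern whose differential distribution is empty inside the exclusion radius:

```python
import numpy as np

N, dk_min = 128, 4
rng = np.random.default_rng(1)
locs = []
for cand in rng.permutation(N):              # dart throwing over grid positions
    if all(min((cand - k) % N, (k - cand) % N) >= dk_min for k in locs):
        locs.append(cand)

# differential distribution of the accepted samples (periodic)
p = np.zeros(N)
for ka in locs:
    for kb in locs:
        p[(kb - ka) % N] += 1

# analog of (6): no occupied bin within the exclusion radius
for dk in range(1, dk_min):
    assert p[dk] == 0 and p[N - dk] == 0
assert p[0] == len(locs)
```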
B. Forward Model
We consider a general linear forward model, where measurements for channel c and “frame” t are generated by
| yt,c(k) = st(k) Σl ℱ{St,l,c ml}(k) + εt,c(k) | (7) |
where εt,c(k) is complex Gaussian white noise, ml, l = 1, 2, . . . , L, are complex-valued magnetization (coefficient) images, ℱ is a discrete Fourier transform operator, and St,l,c(r), t = 1, 2, . . . , T, c = 1, 2, . . . , C, l = 1, 2, . . . , L, are complex-valued “sensitivity” functions. The general case of correlated noise can be handled by a change of variables. Analogous to sensitivity maps in parallel imaging, St,l,c parameterizes the subspace to which the signal in each voxel is confined, allowing one to represent any linear reconstruction in Fourier-based imaging that imposes a reconstruction subspace voxel-wise. The forward model includes dimensions along which the sampling pattern is variable (indexed by t) and fixed (such as coils, indexed by c). In voxel r, a TC-dimensional signal vector is constrained to a subspace of dimension L spanned by basis functions encoded in St,l,c(r) that are indexed by l. As described in subsequent examples, L = T = 1 in SENSE and support-constrained imaging, and L > 1 and T > 1 accommodate more general linear reconstructions, including reconstructions for higher-dimensional applications such as dynamic imaging.
(7) can be written as a linear measurement model
| y = DℱSm + ε | (8) |
where D, ℱ, and S are operators corresponding to sampling, Fourier transformation, and sensitivity maps, respectively, ε is a vector of complex Gaussian white noise, m is the magnetization, and y is the k-space data. Denoting DℱS as E, a best unbiased estimate is obtained from a pseudoinverse, (EHE)−1EHy, which has a nonuniform noise term (EHE)−1EHε. It is common to normalize the noise standard deviation by √R, where R is the acceleration factor, and by the noise standard deviation of a fully-sampled acquisition; the result is referred to as the g-factor. The g-factor metric can be defined with ith element
| gi = σi,accel / (√R σi,full) | (9) |
where σi,accel and σi,full are the standard deviations of the ith element of the accelerated and fully-sampled reconstructions, respectively.
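For a toy 1D SENSE model, the g-factor in (9) can be computed in closed form from the covariance (EHE)−1 under unit-variance white noise. The sketch below uses a unitary DFT and hypothetical smooth coil profiles chosen only for illustration:

```python
import numpy as np

N, C, R = 16, 2, 2
r = np.arange(N)
F = np.exp(-2j * np.pi * np.outer(np.arange(N), r) / N) / np.sqrt(N)  # unitary DFT
# hypothetical smooth coil sensitivities
S = np.stack([np.exp(1j * 2 * np.pi * r / N) * (1.5 + np.cos(2 * np.pi * r / N)),
              (1.5 + np.sin(2 * np.pi * r / N)) * np.ones(N)])

def encoding(sample_rows):
    # stack D F diag(S_c) over coils, keeping only sampled k-space rows
    return np.vstack([F[sample_rows] * S[c] for c in range(C)])

E_full = encoding(np.arange(N))
E_acc = encoding(np.arange(0, N, R))          # uniform R = 2 undersampling

# unit-variance white noise -> Cov(m_hat) = (E^H E)^{-1}
var_full = np.real(np.diag(np.linalg.inv(E_full.conj().T @ E_full)))
var_acc = np.real(np.diag(np.linalg.inv(E_acc.conj().T @ E_acc)))

g = np.sqrt(var_acc / var_full) / np.sqrt(R)  # eq. (9)
assert np.all(g >= 1 - 1e-9)                  # g-factor is bounded below by 1
```

The final assertion reflects the classical result that the SENSE g-factor cannot fall below 1.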
C. Noise Amplification and Differential Distributions
To select sampling patterns, an optimality criterion must be chosen that makes the information matrix EHE well-conditioned. One family of classical criteria from statistics is the spectral moments, with the nth spectral moment defined as
| Mn = Σk λkⁿ | (10) |
where λk is the kth eigenvalue of EHE and n is a (possibly negative) integer. For parallel MRI, n = −1 and n = −∞ have been investigated [24], [25].
We show formulas for the first two spectral moments for (7). First, the spectral moment for n = 1 is independent of the sampling locations and is given by the following expression.
Theorem 1
| tr(EHE) = Σt PSFt(0) Σl,c,r |St,l,c(r)|² | (11) |
Proof
A proof is provided in Appendix A.
Next, the spectral moment for n = 2 is given by a single inner product with the differential distribution.
Theorem 2
| tr((EHE)²) = ⟨w, p⟩ = ΣΔk,t,t′ w(Δk, t, t′) p(Δk, t, t′) | (12) |
where
| w(Δk, t, t′) = Σc,c′ |ℱ{Σl St,l,c S*t′,l,c′}(Δk)|² | (13) |
Proof
A proof is provided in Appendix B.
One measure of the spread in the eigenvalues is variance, which is a function of the spectral moments tr(EHE) and tr((EHE)2). Minimizing the latter with the former constant minimizes the variance in the distribution of eigenvalues. In practice, PSFt(0) is often constant because the number of samples in each “frame” is specified. Although this heuristic criterion does not have a general relationship to traditional g-factor-based criteria, its primary advantages are that it does not require matrix inversion and relates noise amplification to the sampling geometry.
Tikhonov regularization has been widely used to improve conditioning in parallel MRI and requires inverting the matrix EHE + λI, where λ is the regularization parameter. The effect of the regularization is to shift the eigenvalues by a positive constant, which does not change the variance of the eigenvalue distribution. Thus, the criterion does not have to be modified if Tikhonov regularization is used. In conjunction with arbitrary sampling, a small amount of Tikhonov regularization is one strategy used to mitigate noise in the least squares solution that appears in later iterations of iterative linear solvers such as conjugate gradient [31].
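The eigenvalue-shift argument is easy to verify numerically: adding λI to a positive semidefinite matrix shifts every eigenvalue by λ and leaves the eigenvalue variance unchanged.

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((20, 8))
A = M.T @ M                                   # stand-in for E^H E (PSD)
lam = 0.1

ev = np.linalg.eigvalsh(A)                    # eigenvalues, ascending
ev_reg = np.linalg.eigvalsh(A + lam * np.eye(8))

assert np.allclose(ev_reg, ev + lam)          # all eigenvalues shift by lambda
assert np.isclose(ev_reg.var(), ev.var())     # eigenvalue variance is unchanged
```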
The weighting w encodes correlations in k-space that are expressed in terms of the encoding capability of the sensitivity functions. Naturally, the Fourier transform relationship implies that (anisotropic) scaling of the sensitivity functions results in the inverse scaling of w, and therefore of the derived sampling pattern, while cyclic shifts of the sensitivity functions have no effect. The bandwidth of the sensitivity functions limits the extent of the differential distribution and the size of the k-space neighborhoods that need to be considered for sampling. Section III-E discusses the relationship between w and the reproducing kernel associated with the model.
Theorem 2 also provides some explanation for Poisson-disc sampling. By imposing a minimum difference between samples, Poisson-disc sampling nulls the differential distribution at small Δk, removing the contribution of the dominant low-spatial-frequency region of w to ⟨w, p⟩. This supports the intuition that Poisson-disc sampling is well-conditioned due to its locally uniform sample spacing.
Due to the use of a readout in MRI, subsets of k-space samples are acquired together. For Cartesian MRI, the differential distribution is constant along the readout direction. Thus, with a Cartesian readout, tr((EHE)2) is the inner product of w, summed along the readout direction, with the differential distribution of the sampling pattern in the phase-encoding plane. In this way, minimizing tr((EHE)2) for 3D or 2D Cartesian sampling is simplified to a 2D or 1D problem.
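This reduction can be checked on a toy 2D example: for a pattern built from fully sampled readout lines, the 2D inner product collapses to a 1D inner product with w summed along the readout. The weighting here is an arbitrary nonnegative array, standing in for one computed from sensitivities.

```python
import numpy as np

rng = np.random.default_rng(4)
Nx, Ny = 8, 16                                # readout x phase encode
w2d = rng.random((Nx, Ny))                    # arbitrary weighting (T = 1 case)
s_pe = (rng.random(Ny) < 0.4).astype(float)   # 1D phase-encode pattern
s2d = np.ones((Nx, 1)) * s_pe                 # full readout lines

def ddist(s):
    # differential distribution of a (possibly multi-dimensional) pattern
    return np.real(np.fft.ifftn(np.abs(np.fft.fftn(s)) ** 2))

J2d = np.sum(w2d * ddist(s2d))                        # full 2D inner product
J1d = Nx * np.sum(w2d.sum(axis=0) * ddist(s_pe))      # readout-summed 1D version
assert np.isclose(J2d, J1d)
```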
D. Efficient Optimization of Sampling Patterns
Given a forward model parameterized by the sensitivity functions St,l,c(·), we formulate sampling design as minimization of the cost criterion
| minimize𝒮 J(𝒮) = ⟨w, p⟩ | (14) |
where 𝒮 is the set of sample locations, limited to a specified number. Only p depends on 𝒮, and only w depends on St,l,c(·), as given by (13). We refer to sampling patterns derived by minimizing J(𝒮) as min tr((EHE)2) sampling.
Since computing a global optimum requires an intractable search, we instead rely on greedy heuristics. One general strategy is a sequential selection of samples, and this procedure is fast due to several properties of the differential distribution and J(𝒮).
Starting from an arbitrary sampling pattern and sensitivity functions, J(𝒮) can be computed efficiently. The differential distribution can be computed using the FFT or from pair-wise differences, and w can be computed with the FFT. The inner product then requires only NT2 multiplications to evaluate.
To approach (14) with a greedy forward selection procedure, it is not necessary to compute the differential distribution or the cost function explicitly. It suffices to keep track of a map of the increase in J(𝒮) that occurs when inserting a candidate sample (k′, t′):
| ΔJ(k′, t′) = w(0, t′, t′) + 2 Σt (w(·, t′, t) ∗ st)(k′) | (15) |
where the convolution is with respect to the first argument of w. After inserting the sample, ΔJ is updated as
| ΔJ(k, t) ← ΔJ(k, t) + 2w(k − k′, t, t′) | (16) |
Similarly, removing a sample changes the cost by −ΔJ. ΔJ can be used to explain the sources of noise amplification in k-space that contribute to J(𝒮).
We make use of (15) and (16) in an adaptation of the so-called Mitchell’s best candidate sampling algorithm [32] to perform a forward sequential selection of samples, where at each step the best candidate, defined as the sample minimizing ΔJ, is added to the pattern. This procedure is described in Algorithm 1.
Algorithm 1.
Exact min tr((EHE)2) Best Candidate Sampling
| 1: | procedure ExactBestCandidate | |
| 2: | Initialize ΔJ(k, t) = w(0, t, t) | |
| 3: | Initialize s(k, t) ← 0 | ▷ Sampling pattern |
| 4: | for 1..number of samples do | |
| 5: | ▷ Select best candidate sample | |
| 6: | (k′, t′) = argmink,t ΔJ(k, t) | |
| 7: | s(k′, t′) = s(k′, t′) + 1 | |
| 8: | ▷ Perform update from (16). | |
| 9: | for k, t do | |
| 10: | ΔJ(k, t) ← ΔJ(k, t) + 2w(k − k′, t, t′) | |
| 11: | return s(k, t) | |
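A direct implementation of Algorithm 1 is short. The sketch below builds a weighting with the required symmetry from random sensitivities (an assumed L = 1 construction) and checks that the accumulated increments ΔJ reproduce ⟨w, p⟩ of the final pattern exactly.

```python
import numpy as np

rng = np.random.default_rng(5)
N, T, C = 32, 2, 3
S = rng.standard_normal((T, C, N)) + 1j * rng.standard_normal((T, C, N))

# weighting w(dk, t, t') built from random sensitivities (L = 1 case)
w = np.zeros((N, T, T))
for t in range(T):
    for t2 in range(T):
        for c in range(C):
            for c2 in range(C):
                h = S[t, c] * np.conj(S[t2, c2])
                w[:, t, t2] += np.abs(np.fft.fft(h) / N) ** 2

def best_candidate(w, n_samples):
    """Algorithm 1: greedy forward selection of samples."""
    N, T, _ = w.shape
    dJ = np.tile(np.diagonal(w[0]), (N, 1))    # init: dJ(k,t) = w(0,t,t)
    s = np.zeros((N, T))
    J = 0.0
    for _ in range(n_samples):
        k1, t1 = np.unravel_index(np.argmin(dJ), dJ.shape)
        s[k1, t1] += 1
        J += dJ[k1, t1]                        # dJ is the exact cost increment
        for t in range(T):                     # update (16)
            dJ[:, t] += 2 * w[(np.arange(N) - k1) % N, t, t1]
    return s, J

s, J = best_candidate(w, n_samples=24)

# check: accumulated increments equal <w, p> of the final pattern
p = np.zeros((N, T, T))
for t in range(T):
    for t2 in range(T):
        for dk in range(N):
            p[dk, t, t2] = np.sum(s[:, t] * np.roll(s[:, t2], -dk))
assert np.isclose(J, np.sum(w * p))
```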
The time complexity of best candidate sampling is relatively low, but the algorithm requires on the order of a minute for single-time-point sampling with a 256 × 256 phase encoding matrix. Two strategies greatly improve its efficiency. One strategy, described in Section III-F, is to use a local approximation of w, and by extension, the reproducing kernel, which is justifiable, for example, when the sensitivity functions are bandlimited. A second strategy is to restrict the set of sampling patterns to consider. An example is described in Section III-G.
III. Experiments
A. Support-constrained Imaging
Constraints on the object support can be represented by defining the sensitivity functions to be zero outside of the object support. Poisson-disc sampling and min tr((EHE)2) sampling were compared for two support profiles generated synthetically. Although this does not account for the benefit of randomization in Poisson-disc sampling for regularization, it demonstrates the impact of an adaptive sampling strategy on conditioning compared to one common alternative. The number of samples was arbitrarily chosen to be equal to the size of the support region. The resulting noise amplification was quantified using the g-factor metric in (9), computed with a pseudo multiple replica method [16] with 750 averages, a conjugate gradient solver, and a Gaussian white noise model. The following measure was used as a stopping criterion
| δ = ‖xk − xk−1‖ / ‖xk−1‖ | (17) |
where xk is the kth iterate. The solver is stopped when δ drops below a predefined threshold. Maps of ΔJ were used to identify regions of k-space that explain the larger value of J(𝒮) and the predicted sources of noise amplification.
The first profile was a cross-shaped profile first introduced for dynamic imaging [7]. For this profile, quincunx (checkerboard-like) sampling patterns provide optimal sampling: their g-factor is 1, and the eigenvalues of EHE are all equal. Since having all eigenvalues of EHE equal minimizes J(𝒮), min tr((EHE)2) sampling should ideally produce quincunx sampling. For this profile, Tikhonov regularization with a very small parameter (λ = 10−4) was used in reconstructions. The standard deviations of Tikhonov-regularized solutions were used in estimating σi,full and σi,accel in (9).
To demonstrate the ability to adapt to an arbitrary support region, a second support region consisting of a rotated ellipse was considered. For this profile, Tikhonov regularization with a very small parameter (λ = 10−5) was used.
g-Factor maps in Fig. 3 show that min tr((EHE)2) sampling patterns have lower g-factors than Poisson-disc sampling. For the cross-shaped profile, the former is identical to quincunx sampling, except in a few small regions due to the nature of local optimization, and the g-factor is very close to 1 (95th percentile g-factor = 1.05). This can be explained by the near-perfect matching between w and p, which have sidebands that almost null each other. The value of ΔJ is constant over k-space, which is consistent with the symmetry of quincunx sampling. In this case, it is possible to achieve the lower bound
Fig. 3.
min tr((EHE)2) and Poisson-disc sampling are compared in problems with increasing generality: support-constrained MRI (a–b), parallel MRI (c), and parallel MRI with a temporal basis of B-splines (d). Representative slices along the readout are shown. Differential distributions p for min tr((EHE)2) sampling patterns adapt to the weighting w to minimize 〈w, p〉 and describe geometric properties of the sampling pattern. Sampling patterns with sample locations indicated by black dots are shown with ΔJ, the increase in tr((EHE)2) resulting from sampling a new location, corresponding to a prediction of local noise amplification. g-Factor maps show the noise amplification in the image domain.
| tr((EHE)²) ≥ (tr(EHE))² / M | (18) |
where M is the dimension of the model (the number of unknowns).
With equality, all eigenvalues are equal, corresponding to ideal conditioning. Naturally, it can be shown that this is possible when the support profile uniformly tiles the plane.
The elliptical region in Fig. 3b has a w with ringing that results in a matching p. Maps of ΔJ show regions where local noise amplification is predicted, such as in regions far from other samples seen in Poisson-disc sampling.
B. Parallel Imaging
The SENSE parallel MRI model of acquisition with an array of C receiver coils can be described by a linear model composed of a spatial modulation of the magnetization by each channel sensitivity, Fourier encoding, and sampling. The channel sensitivities comprise the single set of sensitivity functions in (7), so that L = T = 1.
Isotropic Poisson-disc and min tr((EHE)2) sampling patterns with 6-fold acceleration were compared for SENSE reconstruction using in vivo breast data acquired with a 16-channel breast coil (Sentinelle Medical, Toronto, Canada) and a 3D spoiled gradient-echo sequence on a 3T scanner (14° flip angle, TE=2.2ms, TR=6ms, matrix size=386×192, FOV=22 × 17.6 × 16 cm3). Sensitivity maps were estimated with the ESPIRiT method [33] with one set of maps, scaled so that their sum-of-squares over coils is unity over the image support. g-Factor maps were calculated with the pseudo multiple replica method described in Section III-A with the stopping criterion (17), which ensured convergence. Tikhonov regularization with a very small parameter (λ = 2×10−4) was used. w was computed using the 3D sensitivity maps and summed along the readout, effectively defining the encoding operator in (14) to include the readout.
Reconstructed images and error maps for a representative slice along the readout are shown in Fig. 4, which shows lower reconstruction error for min tr((EHE)2) sampling (RMSE=9.6%) than for Poisson-disc sampling (RMSE=10.6%). This is confirmed in the g-factor maps in Fig. 3c. The breast coil has more coil sensitivity variation in the left-right direction, which can be seen from the extent of w. The differential distribution for min tr((EHE)2) sampling shows that it adapts by generating higher acceleration in the left-right direction. This is also reflected in k-space by ΔJ, which is low in regions not providing optimal sampling geometry for acceleration.
Fig. 4.
Reconstructed images and error maps for Poisson-disc and min tr((EHE)2) sampling patterns, from experiments with parallel imaging with a breast coil. Both sampling patterns used a reduction factor of 6.
C. Temporally Constrained Dynamic Parallel MRI
The use of min tr((EHE)2) sampling for multidimensional imaging was demonstrated for multi-time-point (dynamic) parallel MRI. For dynamic imaging to be well-posed, it is necessary to assume a certain limited temporal resolution. One generic way to incorporate this assumption is to represent the dynamics in each voxel by a predefined basis of B-splines, allowing coefficients to be recovered by linear reconstruction [34]–[37].
Assuming a discretization of the time axis into T frames indexed by t and a basis of L < T functions ul(t), (7) can be used to represent such a contrast change model with
| St,l,c(r) = ul(t) C(r, c) | (19) |
where C(r, c) are the coil sensitivity maps for channel c. The lth image of coefficients in this basis is ml(r). This can also be viewed as a subspace model [10] with a predefined temporal basis. The sampling problem is to select a k-t sampling pattern that balances spatiotemporal correlations so that inversion of the model in (19) is well-conditioned. For (19), the optimal strategy suggested by tr((EHE)2) is given by
| w(Δk, t, t′) = [ |Σl ul(t) ul(t′)|² ] [ Σc,c′ |ℱ{C(·, c) C*(·, c′)}(Δk)|² ] | (20) |
which is a product of two functions in brackets, the first of which encodes correlations in time and the second of which encodes correlations in k-space. These functions are shown in Fig. 3d.
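The factorization into temporal and k-space correlations can be confirmed numerically using random stand-ins for the temporal basis and coil maps; the general formula for w used here is the same assumed construction as in the earlier numerical checks.

```python
import numpy as np

rng = np.random.default_rng(6)
N, T, L, C = 16, 6, 3, 2
u = rng.standard_normal((L, T))               # temporal basis (stand-in for B-splines)
Cmap = rng.standard_normal((C, N)) + 1j * rng.standard_normal((C, N))
S = np.einsum('lt,cr->tlcr', u, Cmap)         # S_{t,l,c}(r) = u_l(t) C(r,c)

# general weighting, computed without using the separable structure
w = np.zeros((N, T, T))
for t in range(T):
    for t2 in range(T):
        for c in range(C):
            for c2 in range(C):
                h = np.sum(S[t, :, c] * np.conj(S[t2, :, c2]), axis=0)
                w[:, t, t2] += np.abs(np.fft.fft(h) / N) ** 2

# factored form: temporal correlation times k-space correlation
temporal = np.abs(u.T @ u) ** 2               # |sum_l u_l(t) u_l(t')|^2
kspace = np.zeros(N)
for c in range(C):
    for c2 in range(C):
        kspace += np.abs(np.fft.fft(Cmap[c] * np.conj(Cmap[c2])) / N) ** 2
w_factored = kspace[:, None, None] * temporal[None, :, :]
assert np.allclose(w, w_factored)
```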
The breast data and coil sensitivities used in the experiments with parallel imaging were used in a hypothetical dynamic MRI acquisition with a linear reconstruction of the coefficients in the spline basis. The time axis was discretized into T = 12 frames, and a basis of L = 4 third-order B-splines with periodic boundary conditions and maximum value of unity was used to represent all signal intensity-time curves. Poisson-disc sampling and min tr((EHE)2) sampling with R = 12 were compared, with an additional constraint that the number of samples in each frame be constant. Poisson-disc sampling patterns were generated independently for each frame. A temporally averaged g-factor was calculated using the pseudo multiple replica method described in Section III-A, followed by a mean over the L coefficient frames. Tikhonov regularization was used with a very small parameter (λ = 2 × 10−6).
The plot of w sliced in two planes shows a representation of correlations in k-t-space. g-Factor is lower in min tr((EHE)2) sampling due to the ability to balance these correlations with the improved match between w and p (Fig. 3d). This also corresponds with a more uniform ΔJ.
D. Empirical Comparison of Optimality Criteria
To compare the proposed and conventional optimality criteria for selecting arbitrary sampling patterns, three patterns were evaluated: one minimizing tr((EHE)2), one minimizing mean squared error (MSE, given by tr((EHE)−1)) following the approach of [24], and a uniform pattern. They were compared on accelerated 2D fast spin echo in vivo human knee data acquired from a GE 3T scanner (TE=7.3ms, TR=1s, echo train length=8, slice thickness=3mm, resolution=1.25×1.25 mm2, 8-channel knee coil). A reduction factor of 2 was used for all sampling patterns. Optimal sampling patterns for the MSE criterion were generated using the method described in [24], which reported a computation time of 5 minutes for the same matrix size. w was computed from 2D sensitivity maps and summed along the readout. g-Factor maps were calculated with the pseudo multiple replica method described in Section III-A with the stopping criterion (17), which ensured convergence. Tikhonov regularization was used with a very small parameter (λ = 10−4).
For the proposed criterion, sampling patterns were generated in 367 ms, of which 304 ms were spent computing w and 63 ms in Algorithm 1. Fig. 5 shows sampling patterns, g-factor maps, and reconstructed images for the three sampling patterns. Commonly reported metrics for the sampling patterns, and the times to generate them, are shown in Table I. MSE was calculated using images reconstructed from the original and subsampled data. The two optimality criteria yield similar results, indicating that they are nearly equivalent in practice. The small differences between the two patterns in all metrics, and the slightly lower MSE for the proposed criterion, are expected given the greedy nature of both methods. However, the time to generate the sampling pattern is much shorter for the proposed criterion, because computing MSE requires the inverse matrix.
Fig. 5.
Two optimality criteria, mean squared error and the proposed tr((EHE)2), are compared in parallel MRI reconstruction of 2D in vivo human knee data. (a) Reconstructed images and sampling patterns appear similar, and both show an improvement over uniform sampling. (b) Zoom-in images show a reduction in artifacts at the center of the image with either criterion, which is confirmed by (c) g-factor maps shown for the full image. Metrics and execution times are given in Table I.
TABLE I.
Metrics for sampling patterns compared in Fig. 5 and time to generate.
| | Uniform | min tr((EHE)−1) (MSE) | min tr((EHE)2) (Proposed) |
|---|---|---|---|
| MSE | 1 | 0.41 | 0.38 |
| tr((EHE)2) | 1 | 0.978 | 0.976 |
| Max g-factor | 69.6 | 10.1 | 9.3 |
| Median g-factor | 3.14 | 3.05 | 2.97 |
| Time to generate | 0 | 5 minutes | 0.37 seconds |
Both criteria show an improvement over uniform sampling, which results in a small region of very high g-factor at the center of the image. This region shows marked noise artifacts in the zoom-in images. Although uniform sampling is optimal in many cases, it can be suboptimal in others; similar results have previously been observed for the MSE criterion [24].
E. Reconstruction Errors in k-Space
The analysis of [18] provides a description of parallel MRI as approximation in a reproducing kernel Hilbert space and a characterization of reconstruction errors in k-space. From this analysis, it follows that the space of k-space signals in (7) is a reproducing kernel Hilbert space, for which the discrete matrix-valued reproducing kernel is
| [K(k, t; k′, t′)]c,c′ = ℱ{Σl St,l,c S*t′,l,c′}(k − k′) | (21) |
Like K, w also represents correlations in k-space and is related to K by
| w(k − k′, t, t′) = Σc,c′ |[K(k, t; k′, t′)]c,c′|² | (22) |
As described in [18], local approximation errors can be predicted by a power function, which is computed from K and the sample locations by solving a large linear system. For notational simplicity, we drop t and l indices in the following discussion. The power function is given by
| (23) |
where the first sum is over the sample locations kn and the cardinal functions (interpolation weights) are defined in [18]. A coil-combined power function can be computed by summing over coils.
One potential application of the power function is to guide the design of sampling patterns. However, the primary challenge is that the cardinal functions require solving large systems of linear equations, making computation of the power function very expensive. It is important to note that for the purpose of determining optimal sample locations, it is not necessary to compute quantitative error bounds. As a computationally inexpensive, albeit semiquantitative, metric, ΔJ may provide the same information as a coil-combined power function.
ΔJ and the power function were compared for the parallel imaging experiments described in Section III-B. To reduce the computation time for the power function, the original breast data was cropped to a lower resolution (matrix size = 96 × 48). The coil-combined power function was computed by summing over coils, and the comparison used uniform random and min tr((EᴴE)²) sampling patterns.
Fig. 6a and b show that ΔJ and the power function make similar predictions, although their relationship is nonlinear. This is confirmed in Fig. 6c, which shows that the two metrics are highly correlated (Spearman’s rank correlation = 0.994 for both sampling patterns). It is tempting to explain this by analyzing the second term in (23), which appears similar to the convolution in (15). However, the two are only proportional when the power function is summed over coils and the interpolation weights are replaced with a single convolution with the kernel. Although a general relationship between the two metrics does not exist, their high correlation suggests that ΔJ is a good surrogate for the power function for some purposes. This relationship makes the proposed method approximately equivalent to the sampling procedure suggested in [18], which selects samples at minima of the power function.
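The rank-based nature of this comparison is worth emphasizing: any strictly monotonic mapping between two metrics, however nonlinear, leaves their Spearman correlation at 1. A minimal numerical sketch (the metric values below are synthetic stand-ins, not the in vivo data):

```python
import numpy as np

def spearman(x, y):
    # Spearman rank correlation: Pearson correlation of the rank vectors.
    # (No tie handling; the inputs here have distinct values.)
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return float(rx @ ry / np.sqrt((rx @ rx) * (ry @ ry)))

# A nonlinear but monotone mapping between two metrics (stand-ins for
# Delta-J and the coil-combined power function) preserves ranks exactly.
dj = np.linspace(0.1, 2.0, 200)
power = 1.0 - np.exp(-3.0 * dj)   # hypothetical monotone relationship
print(spearman(dj, power))        # → 1.0 despite the nonlinearity
```

A monotone color-scale adjustment such as the one applied in Fig. 6a and b likewise leaves the rank correlation unchanged.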
Fig. 6.
The coil-combined power function and ΔJ for min tr((EᴴE)²) (a) and random (b) sampling patterns were compared for parallel imaging reconstruction with in vivo human breast data. Sample locations are shown as black dots with each metric overlaid. Plotting the combined power function against ΔJ for all k-space locations shows a nonlinear, monotonic trend, indicating that they provide similar information in this example (Spearman’s rank correlation = 0.994 for both patterns). Note that the color scale in (a) and (b) has been adjusted by fitting a monotonic function to the data in (c).
F. Approximate Best Candidate Sampling
One strategy for speeding up Algorithm 1 is to modify w to have limited support, which is possible when the sensitivity functions are bandlimited. This assumption has been used successfully in procedures for automatic selection of sampling patterns based on computing interpolation weights [26], [27]. Similarly, w can be replaced with a sparse surrogate ŵ by thresholding. The number of inner-loop iterations is then reduced from NT to supp(ŵ), where supp(ŵ) denotes the number of nonzeros in ŵ.
With this modification, a naive implementation still has time complexity O(NTS), where S is the number of samples, to compute the minimum of ΔJ. However, this step can be carried out efficiently using a priority queue, with the inner loop modifying only the region of k-t space corresponding to the support of ŵ. An implementation is summarized in Algorithm 2. A total of 2 supp(ŵ) elements must be incremented, and each increment requires O(log(NT)) operations, resulting in a total computational complexity of O(supp(ŵ)S log(NT)). If supp(ŵ) is small enough, this algorithm is faster than Algorithm 1.
For the experiment in Section III-B, the run times of the two algorithms are compared in Fig. 7. When the support of ŵ is near 20,000 (0.3N), the run times of Algorithms 1 and 2 are equal. In general, this crossing point depends on T and N, and for large T the support of ŵ must be smaller. A support of only 16 (0.0002N) still yields nearly optimal g-factor and objective value, showing that best candidate sampling can be dramatically simplified without sacrificing performance. At this level of approximation, the run time is near one second, which is sufficient for interactive applications.
Fig. 7.
For efficiency, best candidate sampling can be approximated by replacing w with a thresholded version ŵ. Reducing the support of ŵ results in (a) shorter run times with (b) nearly optimal values of tr((EᴴE)²) until ŵ is too small to accurately represent correlations. g-Factors and differential distributions are shown in (c) for three points on the tradeoff, where point ii achieves a similar g-factor to iii but has a much shorter run time (≈ 1 second). For point i, w is not well approximated by ŵ, resulting in suboptimal sampling. As the support of ŵ increases, the differential distribution shows variation over a larger region.
Algorithm 2.
Approximate min tr((EᴴE)²) Best Candidate Sampling
1: procedure ApproximateBestCandidate
2:   Set ŵ to a thresholded version of w
3:   Initialize ΔJ(k, t) ← ŵ(0, t, t)
4:   Initialize s(k, t) ← 0
5:   Construct priority queue with all k-t-space locations as keys and corresponding ΔJ as values ▷ O(NT)
6:   for 1..number of samples do
7:     (k′, t′) ← argmin(k,t) ΔJ(k, t) ▷ O(log(NT))
8:     s(k′, t′) ← s(k′, t′) + 1
9:     ▷ First term in (16).
10:     for (k, t) ∈ {(k, t) : ŵ(k − k′, t, t′) ≠ 0} do
11:       ΔJ(k, t) ← ΔJ(k, t) + ŵ(k − k′, t, t′)
12:     ▷ Second term in (16).
13:     for (k, t) ∈ {(k, t) : ŵ(k′ − k, t′, t) ≠ 0} do
14:       ΔJ(k, t) ← ΔJ(k, t) + ŵ(k′ − k, t′, t)
15:   return s(k, t)
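The priority-queue mechanics can be illustrated with a Python sketch for a 1D periodic k-space, using a heap with lazy invalidation: every update pushes the new ΔJ value, and stale heap entries are discarded when popped. The data layout (ŵ as a dictionary keyed by (Δk, t, t′)) and the helper name `approx_best_candidate` are our own assumptions, not the authors' implementation.

```python
import heapq
import numpy as np

def approx_best_candidate(w_hat, n_samples, N, T):
    """Greedy sampling sketch of Algorithm 2 on a 1D periodic k-space of
    N locations and T frames. w_hat maps (dk, t, t') -> w(dk, t, t'),
    a sparse, real-valued surrogate of the correlations w."""
    dJ = np.zeros((N, T))
    for t in range(T):
        dJ[:, t] = w_hat.get((0, t, t), 0.0)         # initialize Delta-J
    s = np.zeros((N, T), dtype=int)
    # Priority queue with lazy invalidation: stale entries are skipped,
    # since every update pushes the location's current value.
    heap = [(dJ[k, t], k, t) for k in range(N) for t in range(T)]
    heapq.heapify(heap)                              # O(NT)
    for _ in range(n_samples):
        while True:
            val, k1, t1 = heapq.heappop(heap)        # O(log(NT))
            if val == dJ[k1, t1]:
                break                                # entry is current
        s[k1, t1] += 1
        heapq.heappush(heap, (dJ[k1, t1], k1, t1))   # location stays eligible
        for (dk, ta, tb), v in w_hat.items():
            if tb == t1:                             # first term in (16)
                k = (k1 + dk) % N
                dJ[k, ta] += v
                heapq.heappush(heap, (dJ[k, ta], k, ta))
            if ta == t1:                             # second term in (16)
                k = (k1 - dk) % N
                dJ[k, tb] += v
                heapq.heappush(heap, (dJ[k, tb], k, tb))
    return s

# Example: short-range correlations on N = 8, T = 1 spread samples evenly.
pattern = approx_best_candidate(
    {(0, 0, 0): 1.0, (1, 0, 0): 0.4, (-1, 0, 0): 0.4}, 4, 8, 1)
```

In this toy example the greedy rule places the four samples at every other k-space location, the expected behavior for translation-invariant short-range correlations.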
G. Periodic Sampling
To enable fast selection of sampling patterns, one strategy is to restrict attention to a subset, such as periodic sampling patterns. If a sampling pattern can be expressed as a convolution of two sampling patterns, its differential distribution is the convolution of the differential distributions of the two patterns. A periodic sampling pattern can be represented as the convolution of a lattice of δ functions with period n0, which we denote III, with a cell of size n0:
| (24) |
Let p0(Δk, t, t′) be the differential distribution of the cell. Then
| (25) |
Since p is periodic, only one period of p, namely p0, needs to be computed. Because only a small number of periodic sampling patterns need to be considered, often significantly fewer than the number of unique arbitrary patterns, it is feasible to evaluate periodic sampling patterns exhaustively.
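The convolution identity (25) is straightforward to verify numerically. In the 1D sketch below (temporal indices dropped), the differential distribution is computed as the circular autocorrelation of the binary pattern via the FFT; the pattern construction and array sizes are illustrative assumptions.

```python
import numpy as np

def diff_dist(s):
    # Differential distribution over Delta-k: the histogram of pairwise
    # sample differences, i.e. the circular autocorrelation of s.
    S = np.fft.fft(s)
    return np.real(np.fft.ifft(S * np.conj(S)))

N, period = 24, 6
cell = np.zeros(N); cell[0] = 1; cell[2] = 1      # one period's samples
lattice = np.zeros(N); lattice[::period] = 1      # lattice of deltas (III)
# Periodic pattern = circular convolution of lattice and cell.
pattern = np.rint(np.real(np.fft.ifft(np.fft.fft(lattice) * np.fft.fft(cell))))

# Eq. (25): p_pattern equals the convolution of p_lattice and p_cell.
p_pat = diff_dist(pattern)
p_conv = np.real(np.fft.ifft(np.fft.fft(diff_dist(lattice)) *
                             np.fft.fft(diff_dist(cell))))
print(np.allclose(p_pat, p_conv))   # → True
```

The identity holds exactly here because convolution of the patterns multiplies their Fourier transforms, so the squared magnitudes (and hence the autocorrelations) multiply as well.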
Differential distributions can be computed efficiently using (25). 2D-CAIPIRINHA [38] sampling patterns with an acceleration factor of R are periodic with a period of size R × R. The acceleration factors in the y and z directions, Ry and Rz with RyRz = R, must be chosen in addition to one shift parameter.
All 12 2D-CAIPIRINHA sampling patterns with R = 6 were evaluated using g-factor maps and ΔJ, with the exception of 4 that resulted in extremely high g-factors. g-Factor maps were computed using the analytical expression [9]. Spearman’s rank correlation was used to quantify agreement between tr((EᴴE)²) and several g-factor metrics, including the maximum g-factor used previously [39], [40]. Fig. 8 shows that the pattern with 3 × 2 acceleration in ky and kz has the most uniform ΔJ and the lowest value of tr((EᴴE)²). The rank correlations between tr((EᴴE)²) and the mean, maximum, and root-mean-square g-factor were 0.93, 0.95, and 0.90, respectively, indicating strong agreement between tr((EᴴE)²) and g-factor-based criteria.
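For concreteness, the 12 candidates can be enumerated from one common 2D-CAIPIRINHA parameterization: a factorization R = RyRz together with a ky shift advanced per sampled kz column. The helper below is a hypothetical sketch of this parameterization, not necessarily the exact convention of [38].

```python
import numpy as np

def caipirinha_cell(Ry, Rz, shift, R=6):
    """One R x R period of a 2D-CAIPIRINHA pattern (hypothetical helper):
    every Rz-th kz column is sampled, undersampled by Ry in ky, with the
    ky offset advanced by `shift` per sampled column (modulo Ry)."""
    cell = np.zeros((R, R), dtype=int)
    for z in range(0, R, Rz):
        y0 = (shift * (z // Rz)) % Ry   # column-dependent ky offset
        cell[y0::Ry, z] = 1
    return cell

R = 6
patterns = [((Ry, R // Ry, shift), caipirinha_cell(Ry, R // Ry, shift, R))
            for Ry in (1, 2, 3, 6) for shift in range(Ry)]
print(len(patterns))   # → 12 candidate patterns for R = 6
```

Each R × R cell contains exactly R samples (acceleration R), and the candidate count is the sum of the shift choices over all factorizations, which equals 12 for R = 6, matching the exhaustive evaluation above.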
Fig. 8.
Periodicity of the sampling pattern is preserved in the differential distribution, allowing efficient evaluation of periodic sampling patterns. 2D-CAIPIRINHA sampling patterns with R = 6 are compared, with samples indicated by black dots; ΔJ, showing local contributions to tr((EᴴE)²), is shown along with the corresponding g-factor maps. Strong agreement was observed between tr((EᴴE)²) and several g-factor metrics (Spearman’s rank correlation = 0.90, 0.93, and 0.95 for mean, max, and root-mean-square g-factors).
IV. Summary and Conclusion
A novel concept, the differential distribution, was shown to be related to conditioning in linear reconstruction via a moment-based spectral analysis of the information matrix, leading to a new criterion for sampling. This formalism establishes a natural link between point spread functions, geometric features of sample distributions, and conditioning of linear reconstruction. It also has unique computational advantages that enable efficient, automatic design of arbitrary and adaptive multidimensional Cartesian sampling patterns. These advantages are threefold: 1) the differential distribution is defined close to the domain where sampling patterns are designed, allowing basic operations to be carried out efficiently; 2) matrix inversion is avoided; and 3) local dependencies are compactly represented in the differential distribution. Greedy algorithms exploiting these advantages were presented and shown to achieve very fast execution times, often sufficient for real-time interactive applications.
The present approach has several limitations. First, only linear reconstructions are considered. Incorporating arbitrary regularization requires considering both conditioning and additional criteria that may lead to variable density or incoherent aliasing. Although many reconstructions in MRI are not linear, the framework may still provide guidance for inverse problems that are bilinear, linear in a subset of variables, or can be associated with a linear inverse problem. Second, the variance in the distribution of eigenvalues has no general relationship to the g-factor or power function, which are comprehensive and quantitative measures of noise amplification but are computationally expensive. Numerical experiments with in vivo human data illustrate strong agreement between the criterion and these metrics; the criterion is not a replacement for them but rather a complementary metric with utility for sampling design. Empirically, the criterion does not select pathological eigenvalue distributions that have a small variance but are still ill-conditioned. Third, while the theory readily extends to non-Cartesian imaging, only Cartesian imaging is considered in the present work; similar algorithms for non-Cartesian sampling would be trajectory-dependent. We note that these greedy approaches do not guarantee a globally optimal solution. Finally, arbitrary sampling patterns often require some regularization that is not needed in reconstruction from uniform subsampling.
The implications of the proposed criterion have been investigated on multiple fronts. Numerical experiments in support-constrained, parallel, and dynamic MRI have demonstrated the adaptive nature of the approach and agreement between the criterion and g-factor. Further experimental validation in 2D and 3D Cartesian MRI with arbitrary and periodic sampling patterns shows agreement with existing metrics, including MSE, and the k-space-based power function metric. The formalism has been related to an approximation theory formalism [18], which offers similar descriptions of correlations in k-space and predictions about local reconstruction errors. The approach and results in this paper indicate the potential for computationally feasible, arbitrary, and adaptive sampling techniques for linear, multidimensional MRI reconstruction.
Acknowledgments
The authors would like to thank Dr. V. Taviani for help in experiments with in vivo knee data.
Appendix A: Proof of Theorem 1
Let St,c,l be the diagonal operator corresponding to point-wise multiplication by St,c,l(r), and let Dt be the diagonal sampling operator for frame t.
Block ij of the matrix EᴴE is
| (26) |
Then
| (27) |
| (28) |
| (29) |
| (30) |
Appendix B: Proof of Theorem 2
Let St,c,l be the diagonal operator defined in Appendix A. Then
| (31) |
| (32) |
| (33) |
| (34) |
| (35) |
| (36) |
| (37) |
Parseval’s theorem and the convolution theorem are applied in (35). The inner products in (34), (35)–(36), and (37) are sums over r, Δk, and (Δk, t, t′), respectively.
References
- 1.Jerri AJ. The Shannon sampling theorem-its various extensions and applications: A tutorial review. Proceedings of the IEEE. 1977;65(11):1565–1596. [Google Scholar]
- 2.Zayed AI. Advances in Shannon’s sampling theory. CRC press; 1993. [Google Scholar]
- 3.Higgins JR. Sampling theory in Fourier and signal analysis: foundations. Oxford University Press on Demand; 1996. [Google Scholar]
- 4.Venkataramani R, Bresler Y. Optimal sub-Nyquist non-uniform sampling and reconstruction for multiband signals. IEEE Transactions on Signal Processing. 2001;49(10):2301–2313. [Google Scholar]
- 5.Marks RJI. Advanced topics in Shannon sampling and interpolation theory. Springer Science & Business Media; 2012. [Google Scholar]
- 6.Landau H. Necessary density conditions for sampling and interpolation of certain entire functions. Acta Mathematica. 1967;117(1):37–52. [Google Scholar]
- 7.Madore B, Glover GH, Pelc NJ, et al. Unaliasing by fourier-encoding the overlaps using the temporal dimension (UNFOLD), applied to cardiac imaging and fMRI. Magnetic Resonance in Medicine. 1999;42(5):813–828. doi: 10.1002/(sici)1522-2594(199911)42:5<813::aid-mrm1>3.0.co;2-s. [DOI] [PubMed] [Google Scholar]
- 8.Tsao J, Boesiger P, Pruessmann KP. k-t BLAST and k-t SENSE: Dynamic MRI with high frame rate exploiting spatiotemporal correlations. Magnetic Resonance in Medicine. 2003;50(5):1031–1042. doi: 10.1002/mrm.10611. [DOI] [PubMed] [Google Scholar]
- 9.Pruessmann KP, Weiger M, Scheidegger MB, Boesiger P, et al. SENSE: sensitivity encoding for fast MRI. Magnetic Resonance in Medicine. 1999;42(5):952–962. [PubMed] [Google Scholar]
- 10.Liang Z-P. Spatiotemporal imaging with partially separable functions. 4th IEEE International Symposium on Biomedical Imaging: From Nano to Macro (ISBI 2007); IEEE; 2007. pp. 988–991. [Google Scholar]
- 11.Petzschner FH, Ponce IP, Blaimer M, Jakob PM, Breuer FA. Fast MR parameter mapping using k-t principal component analysis. Magnetic Resonance in Medicine. 2011;66(3):706–716. doi: 10.1002/mrm.22826. [DOI] [PubMed] [Google Scholar]
- 12.Huang C, Bilgin A, Barr T, Altbach MI. T2 relaxometry with indirect echo compensation from highly undersampled data. Magnetic Resonance in Medicine. 2013;70(4):1026–1037. doi: 10.1002/mrm.24540. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Zhao B, Lu W, Hitchens TK, Lam F, Ho C, Liang ZP. Accelerated MR parameter mapping with low-rank and sparsity constraints. Magnetic Resonance in Medicine. 2015;74(2):489–498. doi: 10.1002/mrm.25421. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Tamir JI, Uecker M, Chen W, Lai P, Alley MT, Vasanawala SS, Lustig M. T2 shuffling: Sharp, multicontrast, volumetric fast spinecho imaging. Magnetic Resonance in Medicine. 2016 doi: 10.1002/mrm.26102. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Dixon WT. Simple proton spectroscopic imaging. Radiology. 1984;153(1):189–194. doi: 10.1148/radiology.153.1.6089263. [DOI] [PubMed] [Google Scholar]
- 16.Robson PM, Grant AK, Madhuranthakam AJ, Lattanzi R, Sodickson DK, McKenzie CA. Comprehensive quantification of signal-to-noise ratio and g-factor for image-based and k-space-based parallel imaging reconstructions. Magnetic Resonance in Medicine. 2008;60(4):895–907. doi: 10.1002/mrm.21728. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Riffe M, Blaimer M, Barkauskas K, Duerk J, Griswold M. SNR estimation in fast dynamic imaging using bootstrapped statistics. Proc Intl Soc Mag Reson Med. 2007;1879 [Google Scholar]
- 18.Athalye V, Lustig M, Uecker M. Parallel magnetic resonance imaging as approximation in a reproducing kernel Hilbert space. Inverse Problems. 2015;31(4):45008. doi: 10.1088/0266-5611/31/4/045008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Lustig M, Pauly JM. SPIRiT: Iterative self-consistent parallel imaging reconstruction from arbitrary k-space. Magnetic Resonance in Medicine. 2010;64(2):457–471. doi: 10.1002/mrm.22428. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Vasanawala S, Murphy M, Alley MT, Lai P, Keutzer K, Pauly JM, Lustig M. Practical parallel imaging compressed sensing MRI: Summary of two years of experience in accelerating body MRI of pediatric patients. 2011 IEEE International Symposium on Biomedical Imaging: From Nano to Macro; IEEE; 2011. pp. 1039–1043. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Gdaniec N, Eggers H, Börnert P, Doneva M, Mertins A. Robust abdominal imaging with incomplete breath-holds. Magnetic Resonance in Medicine. 2014 May;71(5):1733–1742. doi: 10.1002/mrm.24829. [DOI] [PubMed] [Google Scholar]
- 22.Lebel RM, Jones J, Ferre JC, Law M, Nayak KS. Highly accelerated dynamic contrast enhanced imaging. Magnetic Resonance in Medicine. 2014;71(2):635–644. doi: 10.1002/mrm.24710. [DOI] [PubMed] [Google Scholar]
- 23.Hsiao A, Lustig M, Alley MT, Murphy M, Chan FP, Herfkens RJ, Vasanawala SS. Rapid pediatric cardiac assessment of flow and ventricular volume with compressed sensing parallel imaging volumetric cine phase-contrast MRI. American Journal of Roentgenology. 2012;198(3):W250–W259. doi: 10.2214/AJR.11.6969. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Xu D, Jacob M, Liang Z. Optimal sampling of k-space with cartesian grids for parallel MR imaging. Proc Int Soc Magn Reson Med. 2005;13:2450. [Google Scholar]
- 25.Curtis A, Anand C. Smallest singular value: a metric for assessing k-space sampling patterns. Proc Int Soc Magn Reson Med. 2015:2422. [Google Scholar]
- 26.Samsonov AA. On optimality of parallel MRI reconstruction in k-space. Magnetic Resonance in Medicine. 2008;59(1):156–164. doi: 10.1002/mrm.21466. [DOI] [PubMed] [Google Scholar]
- 27.Samsonov AA. Automatic design of radial trajectories for parallel MRI and anisotropic fields-of-view. Proceedings of the 17th Annual Meeting of ISMRM; Honolulu, Hawaii, USA. 2009; p. 765. [Google Scholar]
- 28.Wei LY, Wang R. Differential domain analysis for non-uniform sampling. ACM Transactions on Graphics (TOG) 2011;30(4):50. [Google Scholar]
- 29.Zhou Y, Huang H, Wei LY, Wang R. Point sampling with general noise spectrum. ACM Transactions on Graphics (TOG) 2012;31(4):76. [Google Scholar]
- 30.Heck D, Schlömer T, Deussen O. Blue noise sampling with controlled aliasing. ACM Transactions on Graphics (TOG) 2013;32(3):25. [Google Scholar]
- 31.Qu P, Zhong K, Zhang B, Wang J, Shen GX. Convergence behavior of iterative sense reconstruction with non-cartesian trajectories. Magnetic Resonance in Medicine. 2005;54(4):1040–1045. doi: 10.1002/mrm.20648. [DOI] [PubMed] [Google Scholar]
- 32.Mitchell DP. Spectrally optimal sampling for distribution ray tracing. ACM SIGGRAPH Computer Graphics. 1991;25(4):157–164. ACM. [Google Scholar]
- 33.Uecker M, Lai P, Murphy MJ, Virtue P, Elad M, Pauly JM, Vasanawala SS, Lustig M. ESPIRiT: an eigenvalue approach to autocalibrating parallel MRI: where SENSE meets GRAPPA. Magnetic Resonance In Medicine. 2014;71(3):990–1001. doi: 10.1002/mrm.24751. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Nichols TE, Qi J, Asma E, Leahy RM. Spatiotemporal reconstruction of list-mode PET data. IEEE Transactions on Medical Imaging. 2002;21(4):396–404. doi: 10.1109/TMI.2002.1000263. [DOI] [PubMed] [Google Scholar]
- 35.Jerosch-Herold M, Swingen C, Seethamraju RT. Myocardial blood flow quantification with MRI by model-independent deconvolution. Medical Physics. 2002;29(5):886–897. doi: 10.1118/1.1473135. [DOI] [PubMed] [Google Scholar]
- 36.Filipovic M, Vuissoz PA, Codreanu A, Claudon M, Felblinger J. Motion compensated generalized reconstruction for free-breathing dynamic contrast-enhanced MRI. Magnetic Resonance in Medicine. 2011;65(3):812–822. doi: 10.1002/mrm.22644. [DOI] [PubMed] [Google Scholar]
- 37.Le M, Fessler J. Spline temporal basis for improved pharmacokinetic parameter estimation in SENSE DCE-MRI. Proceedings of the 23rd Annual Meeting of ISMRM; Toronto, Canada. 2015; p. 3698. [Google Scholar]
- 38.Breuer FA, Blaimer M, Mueller MF, Seiberlich N, Heidemann RM, Griswold MA, Jakob PM. Controlled aliasing in volumetric parallel imaging (2D CAIPIRINHA) Magnetic Resonance in Medicine. 2006;55(3):549–556. doi: 10.1002/mrm.20787. [DOI] [PubMed] [Google Scholar]
- 39.Deshpande V, Nickel D, Kroeker R, Kannengiesser S, Laub G. Optimized CAIPIRINHA acceleration patterns for routine clinical 3D imaging. Proceedings of the 20th Annual Meeting of ISMRM; Melbourne, Australia. 2012; p. 104. [Google Scholar]
- 40.Weavers PT, Borisch EA, Riederer SJ. Selection and evaluation of optimal two-dimensional CAIPIRINHA kernels applied to time-resolved three-dimensional CE-MRA. Magnetic Resonance in Medicine. 2014;2242:2234–2242. doi: 10.1002/mrm.25366. [DOI] [PMC free article] [PubMed] [Google Scholar]