Abstract
The conventional calibration-based parallel imaging method assumes a linear relationship between the acquired multichannel k-space data and the unacquired missing data, where the linear coefficients are estimated using some auto-calibration data. In this work, we first analyze the model errors in the conventional calibration-based methods and demonstrate the nonlinear relationship. Then a much more general nonlinear framework is proposed for auto-calibrated parallel imaging. In this framework, kernel tricks are employed to represent the general nonlinear relationship between acquired and unacquired k-space data without increasing the computational complexity. Identification of the nonlinear relationship is still performed by solving linear equations. Experimental results demonstrate that the proposed method can achieve reconstruction quality superior to GRAPPA and NL-GRAPPA at high net reduction factors.
Keywords: Kernel, nonlinear model, random projection, auto-calibration, parallel imaging
I. Introduction
PARALLEL Magnetic Resonance Imaging (pMRI) is a technique that uses phased-array coils to receive the k-space RF signal simultaneously. Conventionally, Fourier reconstruction of data satisfying the Nyquist sampling criterion has been used to recover the image. Although the idea of accelerated scans using multiple coils was proposed in the 1980s [1], it was not until the late 1990s, when SiMultaneous Acquisition of Spatial Harmonics (SMASH) [2] and SENSitivity Encoding for fast MRI (SENSE) [3] were proposed, that accelerated scans using multiple receivers became a practical option. Scans with multiple receiver coils can be accelerated because the data acquired in parallel from all coils come from the desired image weighted by different spatial coil sensitivities. This sensitivity information can be used to reduce the gradient encoding required for reconstruction. A variety of methods for parallel imaging reconstruction have been developed. Traditional reconstruction methods utilize the coil sensitivity information either explicitly or implicitly. Methods like SMASH [2], SENSE [3, 4], Sensitivity Profiles from an Array of Coils for Encoding and Reconstruction In Parallel (SPACE-RIP) [5], Parallel magnetic resonance imaging with Adaptive Radius in k-space (PARS) [6], and parallel imaging reconstruction for arbitrary trajectories using k-space sparse matrices (kSPA) [7] require the coil sensitivities explicitly. These sensitivities are usually estimated from pre-scans, which are subject to errors [8].
Similarly, the coil sensitivity information can also be exploited implicitly, without the need for a pre-scan but with some additionally acquired auto-calibration signal (ACS) data. For example, in partially parallel imaging with localized sensitivities (PILS) [9], the coil sensitivity is restricted to a limited region; in JSENSE [10], the coil sensitivity is assumed to be composed of a limited number of polynomial terms. In AUTO-SMASH [11, 12], generalized auto-calibrating partially parallel acquisitions (GRAPPA) [13], and SPIRiT [14], the coil sensitivity information is inherently captured by the calibration data but is not calculated explicitly. Rather, the calibration data is used for regression to obtain a set of weights. The unacquired k-space data for each channel (known as the target channel) is thereby reconstructed as a linear combination of some acquired k-space data using these weights. Huo and Wilson [15, 16] have shown that there exist regression "outliers" in the GRAPPA calibration. A few works attempt to address this issue using nonlinear methods [17, 18] but require a large amount of ACS data, which leads to reduced net acceleration factors.
Prior information such as sparsity and low-rankness has been incorporated into parallel imaging reconstruction to achieve higher acceleration factors [19-25]. For example, sparsity constraints have been used to denoise GRAPPA reconstructions in [19, 20] and to denoise SPIRiT in L1-SPIRiT [14]. These methods require iterative nonlinear reconstruction algorithms.
In this work, we first analyze the model error of the linear regression in GRAPPA and then propose a general nonlinear framework for auto-calibrated parallel imaging. Kernel methods are exploited to represent the general nonlinear relationship between acquired and unacquired k-space data. Thanks to the kernel trick, identification of the nonlinear relationship is still performed by solving linear equations without the need for iterative methods. In addition, random projection is used to avoid overfitting and reduce the computational complexity. The rest of this paper is organized as follows: Section II reviews the traditional linear model for k-space-based parallel imaging reconstruction, and Section III describes the details of the proposed nonlinear kernel model. Section IV shows experimental results on several in vivo scans, and Section V studies parameter selection. Sections VI and VII provide the discussion and conclusions, respectively.
Symbols and Notations used in this paper are explained as follows.
R: Outer reduction factor;
c: Coil/channel index;
Nc: Total number of coils/channels;
φ: Index of a sliding block in k-space. A block includes the target data and the neighboring acquired data, termed source data. An example of one block is shown in the I-shaped polygon in Fig. 1, which includes all data from the two red boxes (source data) and all data from the green box (target data), from all Nc coils;
l: Index of the coil number and offset for the target data; l counts all Nc coils and (R − 1) offsets;
aφl: Target data from the lth coil-and-offset at block φ;
aΩ: Matrix of target data;
dx: Block size along the frequency-encoding direction;
dy: Block size along the phase-encoding direction;
Φ(φ): Source data region in block φ from all coils;
aΦ(φ): Vectorized source data from region Φ(φ);
AΩ: Calibration source matrix with columns aΦ(φ), φ ∈ Ω;
AΩc: Synthesis source matrix with columns aΦ(φ), φ ∈ Ωc;
k(·,·): Kernel function; the inputs are vectors and the output is a numerical value;
Ω: Calibration region;
Ωc: Synthesis region;
∣Ω∣: Size of Ω.
Fig. 1.
Illustration of the construction of the source data vector aΦ(φ) and the target data aφl. All sliding blocks from the ACS region construct the calibration source matrix AΩ, while all aφl (1 ≤ l ≤ L) construct the target matrix aΩ and share the same source data.
II. Background and Related Work
A. Review of Parallel Imaging and Linear Calibration
Parallel imaging reconstruction in k-space can be viewed as a 2D filterbank reconstruction problem [26]. Specifically, the desired full k-space data in a channel can be obtained by a linear combination of the acquired undersampled k-space data from all channels (without loss of generality, only 1D undersampling is considered here):
$$a_c[k_y, k_x] = \sum_{j=1}^{N_c} \sum_{b_y} \sum_{b_x} w_j[b_y, b_x]\, a_j\!\left[(k_y - R\,b_y)_{\circ n_2},\, (k_x - b_x)_{\circ n_1}\right] \tag{1}$$
where ac (or aj) represents the k-space data of channel c (or j), c = 1, …, Nc, with kx and ky being the coordinates in the frequency- and phase-encoding directions respectively, (·)∘n denotes the modulo operator with period n, n1 and n2 are the total numbers of frequency readouts and phase-encoding lines respectively, R is the reduction factor along the phase-encoding direction, and wj denotes the reconstruction filter coefficients. According to filter bank theory, for perfect reconstruction of the original full k-space data, the reconstruction filter coefficients depend on the coil sensitivity maps only, not on the acquired k-space data [26]. This suggests that Eq. (1) is a linear shift-invariant system with respect to the inputs aj.
However, in the data-driven approach GRAPPA, such linearity is not necessarily satisfied because wj is assumed to have only a finite support dx×dy. Such a truncation of Eq. (1) leads to a system model error δ[ky, kx], which can be represented as:
$$a_c[k_y, k_x] = \sum_{j=1}^{N_c} \sum_{b_y=1}^{d_y} \sum_{b_x=1}^{d_x} w_j[b_y, b_x]\, a_j\!\left[(k_y - R\,b_y)_{\circ n_2},\, (k_x - b_x)_{\circ n_1}\right] + \delta[k_y, k_x] \tag{2}$$
where the model error δ[ky, kx] depends on aj’s that are outside the dx×dy window. Because aj’s outside the window correlate with those inside the window, the system is not a linear shift-invariant system anymore with respect to the input, which is the aj’s inside the window. The proof of nonlinearity is included in the Supplementary Materials (available in the supplementary files /multimedia tab). The degree of nonlinearity depends on the size of dx×dy and the coil sensitivity maps.
When the window size is large, δ[ky, kx] becomes negligible, and thus the truncated system is approximately linear and Eq. (2) can be decomposed into computation of several small blocks (see Fig. 1):
$$a_{\varphi l} = a_{\Phi(\varphi)}\, w_l \tag{3}$$
where aφl denotes the target data from the lth coil-and-offset of block φ, aΦ(φ) is the 1 × dxdyNc vector of all source data inside block φ, and wl is the dxdyNc × 1 vector of filter coefficients for the lth offset and coil (1 ≤ l ≤ L, L = (R − 1)Nc), which is independent of block φ. GRAPPA uses Eq. (3) to perform both the auto-calibration and the synthesis. The filter coefficients wl are first estimated in the calibration phase and then exploited in the synthesis phase, both using Eq. (3). During the calibration phase, the ACS data that is fully sampled at the center of k-space is used. Therefore, the left-hand side of Eq. (3) is known at certain k-space locations that would normally be skipped but are acquired for calibration purposes. In other words, the calibration finds the set of filter coefficients that is the most consistent with the calibration data in the least-squares sense. More formally, the calibration is described by the following equation:
$$\hat{w}_l = \arg\min_{w_l} \sum_{\varphi \in \Omega} \left| a_{\varphi l} - a_{\Phi(\varphi)}\, w_l \right|^2 \tag{4}$$
Incorporating all data from the ACS region, the above equation can be written in matrix form as:
$$\hat{w}_l = \arg\min_{w_l} \left\| a_l - A_{\Omega}^{T}\, w_l \right\|_2^2, \qquad 1 \le l \le L \tag{5}$$
where aΩ is a ∣Ω∣ × L matrix whose lth column is al = [⋯, aφl, ⋯]T for 1 ≤ l ≤ L, φ ∈ Ω, and AΩ is a matrix of size dxdyNc × ∣Ω∣ whose columns are aΦ(φ). The matrix AΩ has a Hankel structure (see Fig. 1). For the different coil/offset combinations l, the equations all share the same matrix AΩ.
During the synthesis phase, the unacquired data outside the ACS region is recovered using the same filter coefficients wl obtained in Eq. (5):
$$a_{\Omega^c} = A_{\Omega^c}^{T}\, \big[\hat{w}_1, \dots, \hat{w}_L\big] \tag{6}$$
where aΩc denotes the missing data matrix and AΩc represents the Hankel-structured source matrix for the synthesis data.
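To make the calibration of Eq. (5) and the synthesis of Eq. (6) concrete, the following is a minimal NumPy sketch for a single coil-and-offset index l. The array names (A_cal, A_syn, a_tgt) and the toy dimensions are illustrative; in practice the matrices are built by sliding the block of Fig. 1 over the ACS region and over the undersampled k-space, respectively.

```python
import numpy as np

rng = np.random.default_rng(0)
n_src, n_cal, n_syn = 264, 3000, 8000   # dx*dy*Nc, |Omega|, |Omega^c| (toy sizes)

# Simulated complex-valued source/target data; in practice these come from k-space.
A_cal = rng.standard_normal((n_src, n_cal)) + 1j * rng.standard_normal((n_src, n_cal))
w_true = rng.standard_normal(n_src) + 1j * rng.standard_normal(n_src)
a_tgt = A_cal.T @ w_true                # target data a_l over the ACS region, Eq. (5) model
A_syn = rng.standard_normal((n_src, n_syn)) + 1j * rng.standard_normal((n_src, n_syn))

# Calibration (Eq. (5)): least-squares estimate of the filter coefficients w_l.
w_hat, *_ = np.linalg.lstsq(A_cal.T, a_tgt, rcond=None)

# Synthesis (Eq. (6)): apply the same coefficients to the synthesis source matrix.
a_missing = A_syn.T @ w_hat             # estimated unacquired data for offset/coil l
print(np.allclose(w_hat, w_true))       # True: the noiseless toy fit is exact
```

The nonlinear framework below keeps this same calibrate-then-synthesize structure but replaces the inner product aΦ(φ)wl with a kernel expansion.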
While GRAPPA works well with a large window size, more calibration data are then needed, which reduces the net acceleration factor. On the other hand, a small window size leads to a large model error δ[ky, kx], and thus the linearity assumption in GRAPPA is violated. This tradeoff motivates our proposed nonlinear model, which aims at accurate calibration with little calibration data.
B. Nonlinear Kernel Mapping Method
The model error in GRAPPA has been analyzed in [17] and nonlinear GRAPPA (NL-GRAPPA) has been proposed to address the issue, which allows nonlinear calibration and reconstruction. Specifically, higher order terms of the input data aj’s are used to approximate the model error δ[ky, kx],
(7)
where the weights of the first- and higher-order terms can be found in the same way as in the traditional GRAPPA calibration. Such nonlinearity provides improved reconstruction over the traditional GRAPPA. However, the method requires a large amount of ACS data for calibration due to the increased number of unknown weights, which may not be practical for highly accelerated MRI. Besides, the reconstruction time is long. Although the idea of using kernel tricks was mentioned in [17], no direct use of kernel functions for k-space calibration has been demonstrated; only explicit polynomial functions of finite order were studied in [17].
III. Proposed Method
In this work, we propose a general kernel-based framework for auto-calibrated reconstruction, termed KerNL. In contrast to NL-GRAPPA, the proposed framework does not assume a specific nonlinear function such as a 2nd-order polynomial, but rather uses a kernel function as a basis to "learn" a broader class of nonlinear functions representing the relationship between acquired and unacquired data. Such kernel methods have been used successfully in machine learning [27]. In addition, the proposed method still takes a non-iterative approach and uses the same undersampling pattern as GRAPPA and NL-GRAPPA.
A. Generalized Model
In the generalized framework, we assume a nonlinear, shift-invariant relationship between the target data and the neighboring source data. Specifically, the relationship between the target data aφl (1 ≤ l ≤ L) and the corresponding neighboring source data aΦ(φ) is described as a nonlinear function:
$$a_{\varphi l} = f_l\!\left(a_{\Phi(\varphi)}\right), \qquad 1 \le l \le L \tag{8}$$
where fl(·) denotes the nonlinear function and is assumed to be shift invariant. In GRAPPA, fl(·) corresponds to a linear function with coefficients wl. In NL-GRAPPA, fl(·) corresponds to a polynomial function. In the nonlinear calibration process, the objective is to find this nonlinear function.
Here we use a regularized formulation of nonlinear regression, posed as a variational problem in a reproducing kernel Hilbert space H. The generalized calibration process finds the set of functions {fl(·), 1 ≤ l ≤ L} that is the most consistent with the auto-calibration data in the least-squares sense:
$$\hat{f}_l = \arg\min_{f_l \in \mathcal{H}} \sum_{\varphi \in \Omega} \left| a_{\varphi l} - f_l\!\left(a_{\Phi(\varphi)}\right) \right|^2 + \xi \left\| f_l \right\|_{\mathcal{H}}^2 \tag{9}$$
where ∥·∥ represents the L2 norm, ∥·∥H defines a norm in the Hilbert space that enforces the smoothness of the estimated function, and ξ is a Lagrange parameter that adjusts the weight between data consistency and the smoothness of the estimated nonlinear function. In other words, a larger ξ leads to a more generalized model of fl(·), while a smaller ξ leads to over-fitting.
Directly finding the unknown nonlinear function in Eq. (9) is difficult. Since the regressors {aΦ(φ)}φ∈Ω form a compact set and the target data {aφl}φ∈Ω form the corresponding set of outputs, the function fl(a) can be represented as a weighted summation of a set of kernel functions, according to the Representer Theorem [27]. According to the theorem, the minimizer fl(·) of Eq. (9) always takes the form of
$$f_l(a) = \sum_{\varphi \in \Omega} \beta_{\varphi, l}\; k\!\left(a,\, a_{\Phi(\varphi)}\right) \tag{10}$$
where k(·,·) is a positive definite kernel function, and {βφ,l}φ∈Ω are the linear combination coefficients for the lth specified offset and coil number in k-space. The significance of the theorem is that, although we are searching for functions in an infinite-dimensional Hilbert space, the solution lies in the span of a set of particular kernels, namely those centered on the calibration data aΦ(φ).
As a result, the generalized nonlinear model in Eq. (8) has the kernel-based representation shown in Eq. (10). To solve for the missing k-space data in the kernel-based framework, the proposed method has three steps: kernel construction, kernel calibration, and kernel synthesis.
B. Kernel Construction
There are many different kinds of kernels, and choosing an appropriate kernel is critical for reconstruction quality. The Gaussian (radial basis function) kernel, with width σ being a constant, has been proved to be universal, which means that linear combinations of the kernel can approximate any continuous function. However, such a powerful representation may lead to overfitting. The sinc kernel is well known as an interpolation kernel. The polynomial kernel
$$k(a_i, a_j) = \left( a_i^{H} a_j + z \right)^{d} \tag{11}$$
has also been widely used, where z is a constant, (·)H denotes the Hermitian transpose, and d is the degree of the polynomial. NL-GRAPPA is a special case when d = 2 and the inner product is computed only with itself and adjacent neighbors. (It is worth noting that GRAPPA cannot be represented in this kernel framework.) Here we choose the polynomial kernel with d = 2 as the default kernel for our method. Other kernel functions are also studied in Section IV.
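As an illustration, a direct NumPy implementation of Eq. (11) between two sets of source vectors might look as follows; the function name poly_kernel and the column-wise data layout are assumptions made for this sketch.

```python
import numpy as np

def poly_kernel(A, B, z=1.0, d=2):
    """Polynomial kernel of Eq. (11) between all column pairs of A and B.

    A: (n_src, m) source vectors a_Phi(i) stacked as columns.
    B: (n_src, p) source vectors a_Phi(j) stacked as columns.
    Returns the (m, p) matrix with entries (a_i^H a_j + z) ** d.
    """
    return (A.conj().T @ B + z) ** d

# Example: kernel matrix between 5 and 3 random complex source vectors.
rng = np.random.default_rng(0)
A = rng.standard_normal((12, 5)) + 1j * rng.standard_normal((12, 5))
B = rng.standard_normal((12, 3)) + 1j * rng.standard_normal((12, 3))
print(poly_kernel(A, B).shape)   # (5, 3)
```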
C. Kernel Calibration
Our goal in the calibration process is to find the nonlinear relationship between the target-source data pairs {aΦ(φ), aφl}φ∈Ω drawn from the ACS region. It is well known that the challenge in such nonlinear regression is to balance generalization against over-fitting. Several methods such as support vector machines [28] have been studied to improve the generalization of existing nonlinear regression methods. In our case, the size of the calibration data in the ACS region, ∣Ω∣, roughly equals Nf · NACS, which is much larger than dxdyNc, where Nf is the number of frequency-encoding lines and NACS is the number of ACS lines. If all ACS data were used as target data for calibration, the number of degrees of freedom in the nonlinear regression would be significantly increased, resulting in severe overfitting.
In this work, we propose to select only a subset of data pairs {aΦ(φ), aφl}φ∈ϒ from {aΦ(φ), aφl}φ∈Ω with ϒ ⊂ Ω, as shown in Fig. 2. Random projection (bottom) is preferred to central projection (top) because for the same size ϒ, more data points contribute to the calibration. Figure S1 of Supplementary Materials (available in the supplementary files/multimedia tab) provides an example to demonstrate that random projection is better than the central one. After the reduction, the degree of freedom is significantly reduced for the nonlinear function:
$$f_l(a) = \sum_{\varphi' \in \Upsilon} \beta_{\varphi', l}\; k\!\left(a,\, a_{\Phi(\varphi')}\right) \tag{12}$$
Note that ∣ϒ∣ needs to be chosen properly. We define γ = ∣ϒ∣/∣Ω∣ to represent the rate of reduction. Small values of γ lead to severe fitting errors, while large values result in a loss of generalizability. Note that NL-GRAPPA uses a central projection with ϒ including adjacent neighbors only. The choice of γ is discussed in Section V.
Fig. 2.
Illustration of the subset projection Ω → ϒ. The black dots represent the positions of the target data {aφl}φ∈Ω in the ACS region in k-space. The dashed boxes show the corresponding source data positions in k-space (example: R = 2, dx = 5, dy = 2). The top row shows the central projection; the bottom row shows the random projection.
During calibration, plugging the nonlinear function of Eq. (12) into the model in Eq. (8), we have
$$a_{\varphi l} = k_{\varphi, \Upsilon}\; \beta_l \tag{13}$$
where kφ,ϒ = [⋯, k(aΦ(φ), aΦ(φ′)), ⋯], φ′ ∈ ϒ, is a row vector and βl = [⋯, βφ′,l, ⋯]T is the unknown kernel weight vector. This shows that the target data is actually a weighted summation of a set of kernel functions evaluated at the values of the source data.
The significance of the proposed model described in Eq. (13) is that the difficult problem of finding the unknown function fl(·) is simplified to finding a set of linear weights βl. Applying the shift-invariant property, Eq. (13) is set up for all acquired target data {aφl}φ∈Ω, and then we can solve for the linear weights by:
$$\hat{\beta}_l = \arg\min_{\beta_l} \left\| a_l - K_{\Omega, \Upsilon}\, \beta_l \right\|_2^2 \tag{14}$$
where al is a column vector of all target data aφl in the ACS region, and KΩ, ϒ is the kernel matrix (also known as a Gram matrix) given by:
$$K_{\Omega, \Upsilon} = \Big[ k\!\left(a_{\Phi(\varphi)},\, a_{\Phi(\varphi')}\right) \Big]_{\varphi \in \Omega,\; \varphi' \in \Upsilon} \tag{15}$$
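A minimal sketch of the kernel calibration of Eqs. (12)-(15) for one coil-and-offset index l is given below, assuming the calibration source vectors are the columns of A_cal and the corresponding targets are in a_tgt (illustrative names and toy sizes). The random projection and regularization of Section III.E are omitted here and added in the sketch that follows that section.

```python
import numpy as np

def poly_kernel(A, B, z=1.0, d=2):
    # Polynomial kernel of Eq. (11) between the columns of A and B.
    return (A.conj().T @ B + z) ** d

rng = np.random.default_rng(0)
n_src, n_cal = 264, 3000                 # dx*dy*Nc and |Omega| (toy sizes)
A_cal = rng.standard_normal((n_src, n_cal)) + 1j * rng.standard_normal((n_src, n_cal))
a_tgt = rng.standard_normal(n_cal) + 1j * rng.standard_normal(n_cal)   # a_l over Omega

# Random subset selection Omega -> Upsilon with rate gamma (basis of Eq. (12)).
gamma = 0.1
ups = rng.choice(n_cal, size=int(gamma * n_cal), replace=False)

# Kernel (Gram) matrix K_{Omega,Upsilon} of Eq. (15): |Omega| x |Upsilon|.
K_cal = poly_kernel(A_cal, A_cal[:, ups])

# Calibration (Eq. (14)): least-squares fit of the kernel weights beta_l.
beta_hat, *_ = np.linalg.lstsq(K_cal, a_tgt, rcond=None)
print(beta_hat.shape)                    # (|Upsilon|,)
```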
D. Kernel Synthesis
After all βl’s are calculated from the kernel calibration process, the missing data can be calculated using the same Eq. (13), but with different kernel matrices. Using Eq. (8), we obtain the missing data by
$$a_{\varphi l} = \sum_{\varphi' \in \Upsilon} \hat{\beta}_{\varphi', l}\; k\!\left(a_{\Phi(\varphi)},\, a_{\Phi(\varphi')}\right), \qquad \varphi \in \Omega^{c} \tag{16}$$
Putting all missing data at the lth offset/coil into a vector, we have
$$a_{\Omega^c, l} = K_{\Omega^c, \Upsilon}\; \hat{\beta}_l \tag{17}$$
where KΩc,ϒ is the kernel matrix for synthesis and is given by
$$K_{\Omega^c, \Upsilon} = \Big[ k\!\left(a_{\Phi(\varphi)},\, a_{\Phi(\varphi')}\right) \Big] \tag{18}$$
for φ ∈ Ωc and φ′ ∈ ϒ.
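The synthesis of Eqs. (16)-(18) then evaluates the same kernel between the source vectors of the unacquired locations and the selected calibration vectors, and applies the learned weights; a self-contained sketch under the same illustrative naming:

```python
import numpy as np

def poly_kernel(A, B, z=1.0, d=2):
    # Polynomial kernel of Eq. (11) between the columns of A and B.
    return (A.conj().T @ B + z) ** d

rng = np.random.default_rng(1)
n_src, n_ups, n_syn = 264, 300, 8000     # dx*dy*Nc, |Upsilon|, |Omega^c| (toy sizes)
A_ups = rng.standard_normal((n_src, n_ups)) + 1j * rng.standard_normal((n_src, n_ups))
A_syn = rng.standard_normal((n_src, n_syn)) + 1j * rng.standard_normal((n_src, n_syn))
beta_hat = rng.standard_normal(n_ups) + 1j * rng.standard_normal(n_ups)  # from calibration

# Synthesis kernel matrix K_{Omega^c,Upsilon} of Eq. (18): |Omega^c| x |Upsilon|.
K_syn = poly_kernel(A_syn, A_ups)

# Eq. (17): missing data at the l-th offset/coil as a weighted sum of kernel values.
a_missing = K_syn @ beta_hat
print(a_missing.shape)                   # (|Omega^c|,)
```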
E. Efficient Computation
In this section, we present an efficient method to solve Eq. (14). Instead of adopting an iterative method for this large-scale problem, we directly solve the linear equations after applying a dimension reduction to the training data. Dimensionality reduction involves projecting data from a high-dimensional space to a lower-dimensional one without a significant loss of information. Here we use random projection [29] for dimension reduction.
As introduced in the previous section, we use a dimension-reduced kernel matrix in Eq. (15) to find the weights βl. The kernel matrix is reduced only in columns, not in rows. These are therefore over-determined equations (for all l), since KΩ,ϒ has many more rows than columns (∣Ω∣ ≫ ∣ϒ∣). It has been shown that random projection [30, 31] is able to reduce the number of over-determined equations in GRAPPA and thereby reduce computation without compromising accuracy. Here, instead of calculating Eq. (14), we calculate
$$\hat{\beta}_l = \arg\min_{\beta_l} \left\| P\left( a_l - K_{\Omega, \Upsilon}\, \beta_l \right) \right\|_2^2 \tag{19}$$
where P is an n × ∣Ω∣ sparse random matrix with n ≪ ∣Ω∣. We define K̃ = PKΩ,ϒ and ãl = Pal. Due to the reduced dimensions, the coefficients βl in Eq. (19) can be found analytically by:
$$\hat{\beta}_l = \left( \tilde{K}^{H} \tilde{K} + \lambda I \right)^{-1} \tilde{K}^{H} \tilde{a}_l \tag{20}$$
where the kernel matrix K̃ is created by randomly selecting a subset of rows and columns from the full kernel matrix KΩ,Ω (as illustrated in Fig. 3), and λI is a Tikhonov regularization term.
Fig. 3.
Illustration of construction of the kernel matrix for calibration and kernel matrix for synthesis with the dimension reduction.
Similarly, we define η = n/∣Ω∣ to represent the dimension reduction rate, which equals the number of equations after random projection divided by the number of original equations. In other words, η/γ represents how over-determined the new system of equations is after dimensionality reduction. Here η is chosen to be larger than γ, since we do not want Eq. (19) to become under-determined.
Parallel computation can also be incorporated to calculate all βl (1 ≤ l ≤ L) simultaneously, since their calculations share the same term (K̃HK̃ + λI)−1K̃H in Eq. (20).
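The sketch below illustrates the dimension-reduced, regularized solve of Eqs. (19) and (20), with the sparse random projection realized as a simple random row selection and with all L coefficient vectors computed at once since they share the same left-hand factor. The names (K_cal, A_tgt) and toy sizes are illustrative, and λ follows the choice discussed in Section V.C.

```python
import numpy as np

rng = np.random.default_rng(2)
n_omega, n_ups, L = 3000, 300, 24        # |Omega|, |Upsilon|, number of offset/coil pairs
K_cal = rng.standard_normal((n_omega, n_ups)) + 1j * rng.standard_normal((n_omega, n_ups))
A_tgt = rng.standard_normal((n_omega, L)) + 1j * rng.standard_normal((n_omega, L))  # columns a_l

# Random projection (Eq. (19)): keep eta*|Omega| randomly selected rows, with eta > gamma.
eta = 0.5
rows = rng.choice(n_omega, size=int(eta * n_omega), replace=False)
K_red, A_red = K_cal[rows], A_tgt[rows]  # reduced kernel matrix and reduced targets

# Regularized normal equations (Eq. (20)); the same factor serves all l in parallel.
G = K_red.conj().T @ K_red
lam = 0.25 * np.median(np.linalg.eigvalsh(G))    # lambda: quarter of the median eigenvalue
beta_all = np.linalg.solve(G + lam * np.eye(n_ups), K_red.conj().T @ A_red)
print(beta_all.shape)                    # (|Upsilon|, L)
```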
Algorithm 1: Kernel-Based NonLinear Reconstruction (KerNL)
F. Summary of KerNL Reconstruction
The procedure of the KerNL reconstruction is summarized in Algorithm 1: kernel construction, kernel calibration with random subset selection and random projection, and kernel synthesis.
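For concreteness, a compact end-to-end sketch of this pipeline is given below; the function signature, array layouts, and defaults are illustrative rather than a faithful transcription of Algorithm 1.

```python
import numpy as np

def kernl_reconstruct(A_cal, a_targets, A_syn, gamma=0.1, eta=0.5, z=1.0, d=2, seed=0):
    """Illustrative outline of KerNL calibration and synthesis.

    A_cal     : (n_src, |Omega|)   calibration source vectors (columns)
    a_targets : (|Omega|, L)       calibration target data, one column per offset/coil l
    A_syn     : (n_src, |Omega^c|) synthesis source vectors (columns)
    Returns the (|Omega^c|, L) matrix of estimated missing k-space data.
    """
    rng = np.random.default_rng(seed)

    def kernel(A, B):
        return (A.conj().T @ B + z) ** d                     # Eq. (11)

    n_cal = A_cal.shape[1]
    # 1) Random subset selection Omega -> Upsilon (rate gamma), Section III.C.
    ups = rng.choice(n_cal, size=int(gamma * n_cal), replace=False)
    # 2) Random row projection (rate eta) of the calibration equations, Eq. (19).
    rows = rng.choice(n_cal, size=int(eta * n_cal), replace=False)
    K_red = kernel(A_cal[:, rows], A_cal[:, ups])
    # 3) Regularized solve for all kernel weights beta_l at once, Eq. (20).
    G = K_red.conj().T @ K_red
    lam = 0.25 * np.median(np.linalg.eigvalsh(G))            # Section V.C choice of lambda
    beta = np.linalg.solve(G + lam * np.eye(len(ups)), K_red.conj().T @ a_targets[rows])
    # 4) Kernel synthesis, Eqs. (17)-(18).
    return kernel(A_syn, A_cal[:, ups]) @ beta
```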
IV. Experimental Results
A. Experimental Setup
Three datasets were used to evaluate the performance of the proposed method. The first dataset, an axial brain dataset, was fully sampled on a GE 3T scanner (GE Healthcare, Waukesha, Wisconsin, USA) with an 8-channel head coil using a 2D spin echo sequence (TE/TR = 11/700 ms; matrix size = 256×256). The second dataset, a 2D multi-slice dataset, was acquired on a 1.5T scanner (United Imaging Healthcare, Shanghai, China) with a 12-channel head coil using a 2D spin echo sequence (TE/TR = 173/450 ms; matrix size = 256×224). Both of the first two datasets were retrospectively undersampled to simulate accelerated acquisition for evaluation of the proposed method. The third dataset was prospectively undersampled with 3-fold acceleration on a Siemens 3T scanner (Siemens Healthcare, Erlangen, Germany) with a 4-channel coil using an SSFP sequence (flip angle = 50°, TE/TR = 1.7/3.45 ms, matrix size = 256×216). All data were acquired using protocols approved by the local IRB.
The square root of the sum of squares of the images from all coils of the fully sampled datasets was used as the reference. We compared the proposed KerNL method with several existing k-space-based reconstruction methods to demonstrate the reconstruction improvements and computational savings of KerNL. In particular, the methods were compared using the same amount of ACS data and the same outer reduction factor R. A block size of dx = 11 (frequency encoding) and dy = 3 (phase encoding) was used in all methods.
The computational complexity was measured by CPU time. Due to the random nature of the proposed method, the NMSE and CPU time were both averaged over 50 executions. The reconstruction quality was evaluated both visually and quantitatively in terms of the normalized mean-squared error (NMSE), which was calculated by:
$$\mathrm{NMSE} = \frac{\left\| \hat{x} - x \right\|_2^2}{\left\| x \right\|_2^2} \tag{21}$$
where x represents the normalized reference image and x̂ is the normalized reconstruction.
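For reference, Eq. (21) corresponds to the following computation (the helper name is illustrative):

```python
import numpy as np

def nmse(x_hat, x):
    # Eq. (21): normalized mean-squared error of a reconstruction against the reference.
    return np.linalg.norm(x_hat - x) ** 2 / np.linalg.norm(x) ** 2
```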
All programs were implemented in MATLAB R2012a and run on a workstation with 3.4-GHz Intel Core i7 central processing unit, 16GB of memory and Windows 7 Enterprise 64-bit operating system.
B. Reconstruction Performance
We compared the reconstruction quality of the proposed KerNL method to that of GRAPPA, NL-GRAPPA, SPIRiT, and L1-SPIRiT (window size 7×7, 30 iterations). We performed reconstruction for dataset I with different undersampling rates. Fig. 4 shows the reconstruction results.
Fig. 4.
Comparison of brain image reconstructions using (a) GRAPPA, (c) NL-GRAPPA, (e) KerNL, (g) SPIRiT, and (i) L1-SPIRiT. (b)(d)(f)(h)(j) are the corresponding zoomed-in regions. Retrospective undersampling parameters (R – number of ACS lines) are shown at the bottom left of images (a) and (g).
For the reconstructions with a medium number of ACS lines (NACS = 20) and a high net reduction factor (R = 4), the KerNL reconstructions show significant improvements over GRAPPA, NL-GRAPPA, SPIRiT, and L1-SPIRiT. When the number of ACS lines increases (NACS = 30), GRAPPA and NL-GRAPPA improve, but KerNL still performs better than both. On the other hand, when the net reduction factor decreases (R = 3) but fewer ACS lines are used (NACS = 16), NL-GRAPPA fails, while the other methods perform similarly. This is because NL-GRAPPA usually requires a large amount of ACS data for calibration [17]. When more ACS lines (NACS = 20) are used at R = 3, all methods reconstruct well. The NMSEs in Table I are mostly consistent with the image quality in Fig. 4. It is worth noting that, due to the random nature of the proposed method, the NMSE for KerNL is the average of 50 trials. All SPIRiT reconstructions contain aliasing artifacts. This is because SPIRiT converges much more slowly for the 1D uniform undersampling considered in this paper than for the 2D undersampling studied in [14]. SPIRiT is therefore not included in the comparisons for the other datasets.
TABLE I.
NMSEs (× 10−3) for Different Methods
R-ACS | Net Reduction | GRAPPA | NL-GRAPPA | KerNL | SPIRiT | L1-SPIRiT
4-20 | 3.24 | 2.628 | 2.251 | 2.575 | 4.008 | 3.443
4-30 | 2.94 | 2.426 | 1.928 | 2.141 | 2.940 | 2.513
3-16 | 2.67 | 1.315 | 24.893 | 1.314 | 1.794 | 1.604
3-20 | 2.61 | 1.250 | 1.147 | 1.149 | 1.489 | 1.346
We also compared algorithm efficiency based on the CPU times in Table II. The results indicate that a smaller ACS size leads to reduced reconstruction time. The reduced computation time of KerNL relative to the existing methods mainly comes from the dimension reduction.
TABLE II.
CPU Times for Different Methods
R-ACS | GRAPPA | NL-GRAPPA | KerNL | SPIRiT | L1-SPIRiT
4-20 | 6.2s | 13.0s | 3.2s | 3.0s | 54.8s
4-30 | 11.2s | 19.9s | 7.8s | 3.1s | 54.9s
3-16 | 3.5s | 8.0s | 1.6s | 3.0s | 54.8s
3-20 | 4.9s | 11.5s | 3.1s | 3.0s | 54.8s
In the proposed method, the 2nd-order polynomial kernel was used. To evaluate the effectiveness of other kernels, Table III shows the NMSEs of reconstruction results from different kernels on dataset I. As can be seen, the polynomial kernels and the sinc kernel perform similarly, while the Gaussian kernel fails because the dimension of its feature space is infinite and the calibration therefore suffers from overfitting.
TABLE III.
Comparison of NMSEs (×10−3) Over Different Kernels
R-ACS | 1st-order Polynomial | 2nd-order Polynomial | 3rd-order Polynomial | Gaussian | Sinc
4-20 | 2.525 | 2.575 | 2.569 | 43.10 | 2.278
4-30 | 2.362 | 2.141 | 2.195 | 30.80 | 2.045
3-16 | 1.350 | 1.314 | 1.898 | 52.60 | 1.358
3-20 | 1.180 | 1.149 | 1.140 | 31.60 | 1.236
In summary, the higher the polynomial order, the higher the fitting accuracy that may be achieved on the calibration data, but the more likely the regression is to suffer from overfitting. The proposed method uses the 2nd-order polynomial kernel, which balances fitting accuracy and generalizability.
Fig. 5 shows the reconstruction results of dataset II with retrospective undersampling. The acceleration factor was 3 with 40 ACS lines. The KerNL reconstruction is seen to be cleaner than the GRAPPA and NL-GRAPPA reconstructions. The quantitative NMSE results also agree with the visual comparison.
Fig. 5.
Comparison of three reconstruction methods. Top-left: Reference image; Top-right: GRAPPA; Bottom-left: NL-GRAPPA; Bottom-right: KerNL (R=3, NACS = 40). NMSEs are 0.0095, 0.0081, 0.0072, respectively.
Fig. 6 shows the reconstructions of the third dataset using all three methods and the corresponding enlarged region of interest (ROI). The proposed method captures most of the structures and removes the noise well. The image reconstructed by the proposed method has better visual quality and a lower NMSE than those of GRAPPA and NL-GRAPPA.
Fig. 6.
Comparison of a two-chamber view of a heart image reconstructed using (a) GRAPPA, (b) NL-GRAPPA, and (c) KerNL, for R = 3, NACS = 28. CPU times are 3.3 s, 10.4 s, and 2.8 s, respectively, and NMSEs are 0.07943, 0.05207, and 0.03453, respectively.
V. Parameter Selection
A. Kernel Parameter z in Eq. (11)
In the proposed method, a second-order polynomial kernel was used. Choosing a proper value of z affects the performance of the proposed method. For example, in the extreme case when z → ∞, the kernel function becomes a constant. When z → 0, the kernel function is highly nonlinear, so the reconstruction is poor due to large model errors.
Here we choose z to be the maximum value of the inner-product magnitude |aΦ(i)H aΦ(j)| over all i and j ∈ Ω. Table IV shows the NMSEs of the reconstructed images for different values of z. The results suggest that the reconstruction quality is robust to the parameter z: over a wide range of z around this value, the NMSEs remain relatively low and do not vary significantly.
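A sketch of this choice of z, assuming the calibration source vectors are stacked as the columns of A_cal (illustrative name); for a large ACS region the maximum can be taken over a random subset of pairs instead of all of them:

```python
import numpy as np

rng = np.random.default_rng(3)
A_cal = rng.standard_normal((264, 500)) + 1j * rng.standard_normal((264, 500))  # toy source vectors

# z set to the largest inner-product magnitude among calibration source vectors.
inner = A_cal.conj().T @ A_cal          # all pairwise a_Phi(i)^H a_Phi(j)
z = np.abs(inner).max()
print(z)
```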
TABLE IV.
NMSEs Over Parameter z (ORF: 3; NACS = 24)
z | 10^−4 | 10^−3 | 10^−2 | 10^−1 | 10^0 | 10^1 | 10^2 | 10^3
NMSE (×10^−3) | 1.44 | 0.2258 | 0.1109 | 0.1060 | 0.1058 | 0.1058 | 0.1055 | 0.1173
B. Influence of γ and η
The selection of the parameters γ = ∣ϒ∣/∣Ω∣ and η = n/∣Ω∣ is crucial for reconstruction quality and computational efficiency. Intuitively, the larger γ is, the better the function fits the calibration data, but the more likely over-fitting occurs, resulting in poor estimation of the missing data. The smaller η is, the fewer equations need to be solved, but Eq. (19) becomes under-determined if η is too small. As a result, both γ and η need to be selected to balance the above tradeoffs.
A set of experiments was carried out to evaluate the influence of γ and η on the reconstruction quality, with both γ and η varying from 0 to 1. The resulting NMSEs are shown as a color map in Fig. 7a. The map indicates that the reconstruction quality is high as long as η/γ > 4.
Fig. 7.
(a) Color map of the NMSE of the KerNL method as a function of γ and η (Dataset I, NACS = 24, R = 3). (b) Color map of the NMSE of the KerNL method as a function of γ and η (Dataset I, NACS = 24, R = 3) with regularization. The maps suggest that the reconstruction is good as long as η is sufficiently larger than γ to keep the equations over-determined.
When γ increases, the degree of freedom becomes higher and overfitting is more serious. Fig. 8 shows two sample images with different values of γ and η/γ = 4 fixed. It is evident that large γ causes the image to be blurry and have aliasing artifacts.
Fig. 8.
KerNL reconstructions with different γ (Dataset I; NACS = 24; R = 3).
Because the values of γ and η directly affect the computational complexity, we chose γ = 0.1 and η = 0.5 empirically for all experiments.
C. Choice of λ in Eq. (20)
Other than choosing a small γ, the overfitting problem can also be addressed using regularization. Direct inversion of K̃HK̃ might be ill-conditioned, which leads to poor reconstruction. When a regularization term λI, λ > 0, is added to K̃HK̃ in computing the inverse, where I is the identity matrix and λ is the regularization parameter, the problem becomes well-conditioned.
A larger λ in Eq. (20) makes the solution more stable. However, increasing λ so much that λ > max(Λi) results in over-smoothing, where Λi is the ith eigenvalue of K̃HK̃. A good value of the regularization parameter λ can often be found to balance this tradeoff. Here we choose λ to be a quarter of the median eigenvalue of K̃HK̃.
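A sketch of this choice of λ, with K_red standing in for the dimension-reduced kernel matrix K̃ (illustrative name and toy size):

```python
import numpy as np

rng = np.random.default_rng(4)
K_red = rng.standard_normal((1500, 300)) + 1j * rng.standard_normal((1500, 300))

G = K_red.conj().T @ K_red              # the matrix K~^H K~ appearing in Eq. (20)
eigvals = np.linalg.eigvalsh(G)         # real, non-negative eigenvalues of the Hermitian matrix
lam = 0.25 * np.median(eigvals)         # lambda: a quarter of the median eigenvalue
print(lam < eigvals.max())              # stays below the over-smoothing regime lambda > max eigenvalue
```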
Similarly, we show the NMSEs as a color map with respect to the parameters γ and η with this regularization. The results are shown in Fig. 7b. The larger blue region compared to Fig. 7a indicates that regularization makes the reconstruction less sensitive to changes in γ and η.
VI. Discussion
The proposed KerNL has shown improvements in noise reduction and computational efficiency. The performance of the proposed KerNL method depends on the generalization ability and the fitting accuracy of the kernel model.
A. Dimension Reduction
An important benefit of the proposed method is the high efficiency of the kernel calibration, thanks to the random projection used for dimension reduction. The computational cost of the calibration grows with the number of equations and unknowns [32], which suggests that smaller γ and η lead to shorter CPU times. There are two different dimension reductions in the proposed KerNL method: the feature dimension reduction γ and the training data reduction η. The main motivation for the feature dimension reduction is to overcome over-fitting, with computation being a minor concern, while the main motivation for the training data reduction is to speed up the computation.
B. Cross Validation for Dimension Reduction
Another benefit of the proposed method is that the estimation accuracy of the kernel coefficients can be validated from the acquired data itself. Since the random selection uses only a portion of the ACS data as the target data for nonlinear calibration, the rest of the ACS data can be used to validate whether the nonlinear function generalizes well to the testing data.
We use an example in Fig. 9 to show how cross validation can be used to tune the parameter γ (feature selection). We used a subset of the acquired ACS data to perform the KerNL calibration and reconstructed the complementary set of the ACS data (Dataset I; NACS = 36; R = 3). The NMSEs between the reconstructed and the acquired data were plotted for different values of γ. Point B on the curve gives the lowest NMSE, and the reconstructed images show that the choice of γ at point B corresponds to the best reconstruction quality.
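A sketch of this validation loop under the same illustrative naming as the earlier snippets: part of the ACS target data is held out, a plain kernel calibration is run for each candidate γ, and the γ giving the smallest held-out error is selected.

```python
import numpy as np

def poly_kernel(A, B, z=1.0, d=2):
    # Polynomial kernel of Eq. (11) between the columns of A and B.
    return (A.conj().T @ B + z) ** d

rng = np.random.default_rng(5)
n_src, n_acs = 264, 2000
A_acs = rng.standard_normal((n_src, n_acs)) + 1j * rng.standard_normal((n_src, n_acs))
a_acs = rng.standard_normal(n_acs) + 1j * rng.standard_normal(n_acs)   # target data over the ACS region

# Hold out part of the ACS blocks for validation.
perm = rng.permutation(n_acs)
train, test = perm[: int(0.8 * n_acs)], perm[int(0.8 * n_acs):]

best = None
for gamma in (0.02, 0.05, 0.1, 0.2, 0.4):
    ups = rng.choice(train, size=int(gamma * len(train)), replace=False)
    K_tr = poly_kernel(A_acs[:, train], A_acs[:, ups])
    beta, *_ = np.linalg.lstsq(K_tr, a_acs[train], rcond=None)
    pred = poly_kernel(A_acs[:, test], A_acs[:, ups]) @ beta
    err = np.linalg.norm(pred - a_acs[test]) ** 2 / np.linalg.norm(a_acs[test]) ** 2
    if best is None or err < best[1]:
        best = (gamma, err)
print("selected gamma:", best[0])
```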
Fig. 9.
Example showing that cross validation can be used to choose the optimal value of γ.
VII. Conclusions
We have proposed a novel framework for nonlinear calibration and synthesis in parallel imaging reconstruction. The use of kernel functions and dimension reduction allows the method to balance fitting accuracy and generalizability well. As a result, the missing k-space data can be estimated accurately using the nonlinear model. The proposed KerNL method has been shown to significantly improve the SNR over GRAPPA and to reduce the aliasing artifacts observed in NL-GRAPPA when few ACS lines are acquired. In addition, the proposed KerNL method remains computationally efficient.
Supplementary Material
Acknowledgments
This work is supported in part by the National Science Foundation under CBET-1265612 and CCF-1514403, and by the National Institutes of Health under R21EB020861. An asterisk indicates the corresponding author.
Contributor Information
Jingyuan Lyu, Department of Electrical Engineering, University at Buffalo, The State University of New York and is now with United Imaging Healthcare America, Houston, TX, USA..
Ukash Nakarmi, Department of Biomedical Engineering and the Department of Electrical Engineering, University at Buffalo, The State University of New York, Buffalo, NY, 14260, USA (leiying@buffalo.edu)..
Dong Liang, Shenzhen Key Laboratory for MRI, Paul C. Lauterbur Research Centre for Biomedical Imaging, Shenzhen Institutes of Advanced Technology, China..
Jinghua Sheng, Hangzhou Dianzi University, China..
Leslie Ying, Department of Biomedical Engineering and the Department of Electrical Engineering, University at Buffalo, The State University of New York, Buffalo, NY, 14260, USA (leiying@buffalo.edu)..
References
[1] Kelton JR, Magin RL, and Wright SM, "An algorithm for rapid image acquisition using multiple receiver coils," Proc. SMRM 8th Annual Meeting, Amsterdam, p. 1172, 1989.
[2] Sodickson DK and Manning WJ, "Simultaneous acquisition of spatial harmonics (SMASH): fast imaging with radiofrequency coil arrays," Magn. Reson. Med., vol. 38, no. 4, pp. 591–603, 1997.
[3] Pruessmann KP, Weiger M, Scheidegger MB, and Boesiger P, "SENSE: sensitivity encoding for fast MRI," Magn. Reson. Med., vol. 42, no. 5, pp. 952–962, 1999.
[4] Pruessmann KP, Weiger M, Börnert P, and Boesiger P, "Advances in sensitivity encoding with arbitrary k-space trajectories," Magn. Reson. Med., vol. 46, no. 4, pp. 638–651, 2001.
[5] Kyriakos WE, Panych LP, Kacher DF, Westin C-F, Bao SM, Mulkern RV, and Jolesz FA, "Sensitivity profiles from an array of coils for encoding and reconstruction in parallel (SPACE RIP)," Magn. Reson. Med., vol. 44, no. 2, pp. 301–308, 2000.
[6] Yeh EN, McKenzie CA, Ohliger MA, and Sodickson DK, "Parallel magnetic resonance imaging with adaptive radius in k-space (PARS): Constrained image reconstruction using k-space locality in radiofrequency coil encoded data," Magn. Reson. Med., vol. 53, no. 6, pp. 1383–1392, 2005.
[7] Liu C, Bammer R, and Moseley ME, "Parallel imaging reconstruction for arbitrary trajectories using k-space sparse matrices (kSPA)," Magn. Reson. Med., vol. 58, no. 6, pp. 1171–1181, 2007.
[8] Blaimer M, Breuer F, Mueller M, Heidemann RM, Griswold MA, and Jakob PM, "SMASH, SENSE, PILS, GRAPPA: how to choose the optimal method," Topics in Magn. Reson. Imaging, vol. 15, no. 4, pp. 223–236, 2004.
[9] Griswold MA, Jakob PM, Nittka M, Goldfarb JW, and Haase A, "Partially parallel imaging with localized sensitivities (PILS)," Magn. Reson. Med., vol. 44, no. 4, pp. 602–609, 2000.
[10] Ying L and Sheng J, "Joint image reconstruction and sensitivity estimation in SENSE (JSENSE)," Magn. Reson. Med., vol. 57, no. 6, pp. 1196–1202, 2007.
[11] Heidemann RM, Griswold MA, Haase A, and Jakob PM, "VD-AUTO-SMASH imaging," Magn. Reson. Med., vol. 45, no. 6, pp. 1066–1074, 2001.
[12] Jakob PM, Griswold MA, Edelman RR, and Sodickson DK, "AUTO-SMASH: a self-calibrating technique for SMASH imaging," Magnetic Resonance Materials in Physics, Biology and Medicine, vol. 7, no. 1, pp. 42–54, 1998.
[13] Griswold MA, Jakob PM, Heidemann RM, Nittka M, Jellus V, Wang J, Kiefer B, and Haase A, "Generalized autocalibrating partially parallel acquisitions (GRAPPA)," Magn. Reson. Med., vol. 47, no. 6, pp. 1202–1210, 2002.
[14] Lustig M and Pauly JM, "SPIRiT: Iterative self-consistent parallel imaging reconstruction from arbitrary k-space," Magn. Reson. Med., vol. 64, no. 2, pp. 457–471, 2010.
[15] Huo D and Wilson DL, "Robust GRAPPA reconstruction," 3rd IEEE Int'l Symp. on Biomedical Imaging: Nano to Macro, pp. 37–40, 2006.
[16] Huo D and Wilson DL, "Robust GRAPPA reconstruction and its evaluation with the perceptual difference model," J. Magn. Reson. Imaging, vol. 27, no. 6, pp. 1412–1420, 2008.
[17] Chang Y, Liang D, and Ying L, "Nonlinear GRAPPA: a kernel approach to parallel MRI reconstruction," Magn. Reson. Med., vol. 68, pp. 730–740, 2012.
[18] Bydder M and Jung Y, "A nonlinear regularization strategy for GRAPPA calibration," Magn. Reson. Imaging, vol. 27, no. 1, pp. 137–141, 2009.
[19] Weller DS, Polimeni JR, Grady L, Wald LL, Adalsteinsson E, and Goyal VK, "Denoising sparse images from GRAPPA using the nullspace method," Magn. Reson. Med., vol. 68, no. 4, pp. 1176–1189, 2012.
[20] Weller DS, Polimeni JR, Grady L, Wald LL, Adalsteinsson E, and Goyal VK, "Sparsity promoting calibration for GRAPPA accelerated parallel MRI reconstruction," IEEE Trans. Med. Imaging, vol. 32, no. 7, pp. 1325–1335, 2013.
[21] Liu B, Zou YM, and Ying L, "SparseSENSE: application of compressed sensing in parallel MRI," 2008 Int'l Conf. on Information Technology and Applications in Biomedicine, IEEE, 2008.
[22] Liang D, Liu B, Wang J, and Ying L, "Accelerating SENSE using compressed sensing," Magn. Reson. Med., vol. 62, no. 6, pp. 1574–1584, 2009.
[23] Otazo R, Kim D, Axel L, and Sodickson DK, "Combination of compressed sensing and parallel imaging for highly accelerated first-pass cardiac perfusion MRI," Magn. Reson. Med., vol. 64, no. 3, pp. 767–776, 2010.
[24] Shin PJ, Larson PE, Ohliger MA, Elad M, Pauly JM, Vigneron DB, and Lustig M, "Calibrationless parallel imaging reconstruction based on structured low-rank matrix completion," Magn. Reson. Med., vol. 72, no. 4, pp. 959–970, 2014.
[25] Trzasko JD and Manduca A, "Calibrationless parallel MRI using CLEAR," 2011 Conference Record of the Forty-Fifth Asilomar Conference on Signals, Systems and Computers (ASILOMAR), pp. 75–79, IEEE, 2011.
[26] Ying L and Liang ZP, "Parallel MRI using phased array coils," IEEE Signal Processing Magazine, vol. 27, no. 4, pp. 90–98, 2010.
[27] Schölkopf B and Smola AJ, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond (Adaptive Computation and Machine Learning), The MIT Press, 2001.
[28] Wahba G, "Support vector machines, reproducing kernel Hilbert spaces and the randomized GACV," Advances in Kernel Methods - Support Vector Learning, vol. 6, pp. 69–87, 1999.
[29] Arriaga RI and Vempala S, "An algorithmic theory of learning: robust concepts and random projection," Machine Learning, vol. 63, no. 2, pp. 161–182, 2006.
[30] Lyu J, Chang Y, and Ying L, "Fast GRAPPA reconstruction with random projection," Magn. Reson. Med., vol. 74, no. 1, pp. 71–80, 2015.
[31] Lyu J, Chang Y, and Ying L, "Efficient GRAPPA reconstruction using random projection," Proc. 10th IEEE International Symposium on Biomedical Imaging, San Francisco, CA, USA, pp. 696–699, 2013.
[32] Brau AC, Beatty PJ, Skare S, and Bammer R, "Comparison of reconstruction accuracy and efficiency among autocalibrating data-driven parallel imaging methods," Magn. Reson. Med., vol. 59, pp. 382–395, 2008.