Published in final edited form as: NMR Biomed. 2025 Jan;38(1):e5302. doi: 10.1002/nbm.5302

Acceleration of Simultaneous Multislice Magnetic Resonance Fingerprinting With Spatiotemporal Convolutional Neural Network

Lan Lu 1, Yilin Liu 2, Amy Zhou 1, Pew-Thian Yap 3, Yong Chen 4,5

Abstract

Magnetic Resonance Fingerprinting (MRF) can be accelerated with simultaneous multislice (SMS) imaging for joint T1 and T2 quantification. However, the high inter-slice and in-plane acceleration in SMS-MRF causes severe aliasing artifacts, limiting the multiband (MB) factors to typically 2 or 3. Deep learning has demonstrated superior performance compared to the conventional dictionary matching approach for single-slice MRF, but its effectiveness in SMS-MRF remains unexplored. In this paper, we introduced a new deep learning approach with decoupled spatiotemporal feature learning for SMS-MRF to achieve high MB factors for accurate and volumetric T1 and T2 quantification in neuroimaging. The proposed method leverages information from both spatial and temporal domains to mitigate the significant aliasing in SMS-MRF. Neural networks, trained using either acquired SMS-MRF data or simulated data generated from single-slice MRF acquisitions, were evaluated. The performance was further compared with both dictionary matching and a deep learning approach based on residual channel attention U-Net. Experimental results demonstrated that the proposed method, trained with acquired SMS-MRF data, achieves the best performance in brain T1 and T2 quantification, outperforming dictionary matching and residual channel attention U-Net. With a MB factor of 4, rapid T1 and T2 mapping was achieved with 1.5 s per slice for quantitative brain imaging.

Keywords: deep learning, magnetic resonance fingerprinting, quantitative MRI, simultaneous multislice imaging

1 |. Introduction

Magnetic Resonance Fingerprinting (MRF) [1] allows fast and simultaneous quantification of multiple tissue properties in a single acquisition. With pseudo-randomly varying acquisition parameters, such as flip angles (FAs) and repetition times (TRs), MRF generates unique signal time courses, termed fingerprints, in association with multiple tissue properties (e.g., T1 and T2 relaxation times). This differs from conventional quantitative MRI techniques, which mostly quantify only one specific tissue property at a time. Dictionary matching (DM) is then employed to match the acquired signal time course at each voxel to a pre-computed dictionary of fingerprints associated with a wide range of tissue properties. Tissue properties are retrieved based on the most correlated fingerprint in the dictionary. With this pattern-matching process, MRF improves not only time efficiency, but also robustness against aliasing artifacts and acquisition imperfections [1–3].

Although quantitative MRI via MRF is efficient, the acquisition time needed for volumetric coverage with a large number of slices is still relatively long for implementation in clinical settings. Simultaneous multislice (SMS) imaging is a widely used technique for accelerating volumetric acquisition by exciting multiple slices simultaneously rather than sequentially. Recent efforts have been directed toward combining SMS techniques with MRF for rapid volumetric imaging [4–6]. However, obtaining artifact-free images through SMS-MRF remains challenging due to the simultaneous acceleration along the slice direction, coupled with the exceptionally high acceleration factor (e.g., 48-fold) implemented for rapid in-plane MRF acquisitions.

Unaliasing prior to pattern matching has been the focus of SMS-MRF research. Ye et al. [7] imposed controlled aliasing differences between simultaneously excited slices along the time axis by using an additional blip gradient in the slice-encoding direction. This approach has demonstrated effectiveness for simultaneous acquisition of two slices. Jiang et al. [5] employed a multiband (MB) radiofrequency (RF) pulse to excite two slices with different flip angles and RF phases and matched the acquired MRF images to two dictionaries simulated specifically for each slice. Zhao et al. [8] leveraged the strong spatiotemporal signal correlations and reconstructed low-rank images based on subspace modeling, achieving a MB factor of 3. Hamilton et al. [6] applied low-rank reconstruction to undersampled MRF data to reduce aliasing artifacts, also achieving a MB factor of 3. These approaches mostly aim to eliminate aliasing artifacts in the reconstructed MRF images but still rely on conventional DM for inferring tissue properties. However, conventional DM uses the inner product as a similarity metric to compare dictionary signal evolutions with observed fingerprints, which poses significant challenges for generating high-quality tissue property maps in the presence of the severe aliasing artifacts associated with higher MB factors (≥ 3).

Deep learning based approaches have emerged as an alternative to DM in single-slice MRF [9, 10]. Compared to DM, deep learning methods have demonstrated advantages in providing more accurate and rapid quantification of tissue property maps from MRF images [11]. Earlier approaches predominantly focused on voxel-wise mapping, which harnesses only the temporal characteristics of the signal time courses. For instance, Cohen et al. [11] employed a 4-layer fully connected neural network that provides tissue property mapping 300- to 5000-fold faster than DM. The network was trained on signal evolutions contained in the MRF dictionary and validated using a simulated brain phantom. Hoppe et al. [12] utilized convolutional layers, in addition to fully connected layers, to capture temporal correlations of the signals, demonstrating enhanced robustness to undersampling artifacts. Oksuz et al. [13] used a recurrent neural network with long short-term memory [14] to model longer temporal dependencies of the signal. Another line of effort utilizes spatial correlations among neighboring voxels in individual MRF time frames to reduce the effects of aliasing [15]. For example, Fang et al. proposed a spatially constrained quantification approach based on a residual channel attention U-Net (RCA-UNet), trained and validated using highly undersampled MRF images acquired from in vivo brain imaging [16]. Although 2D convolutional neural networks are well suited for extracting spatial information, they neglect correlations in the temporal dimension. Joint utilization of both the spatial and temporal cues of the MRF signal evolution can potentially improve the elimination of aliasing artifacts and thus the accuracy of tissue quantification for SMS-MRF acquisitions with high MB factors.

In this work, we developed a new deep learning based method, called spatiotemporal U-Net (STUN), for high-quality tissue property mapping from highly undersampled SMS-MRF with a MB factor up to 4. In particular, we proposed decoupling the spatiotemporal feature learning, given that MRF signals correlate differently across the spatial and temporal dimensions. We showed that this spatiotemporal convolutional neural network yields superior performance in accelerating SMS-MRF, outperforming both the standard DM approach and the RCA-UNet approach previously developed for single-slice MRF [16].

2 |. Methods

2.1 |. MRF Data Acquisition and Pre-Processing

All MRI measurements were performed on a Siemens 3T Vida scanner with a 20-channel head coil. A total of seven healthy adult subjects (M:F, 5:2; mean age, 35 years; range, 20–61 years) were enrolled in this IRB-approved, HIPAA-compliant prospective study. Written informed consent was obtained from all enrolled subjects before any MRI scans. For each subject, both single-slice 2D MRF and SMS-MRF (MB factors, 3 and 4) were acquired. Single-slice MRF acquisitions were specifically prescribed to cover all imaging positions obtained in the SMS-MRF scans. Both single-slice and SMS-MRF acquisitions were based on the fast imaging with steady-state precession (FISP) sequence [17]. The acquisition time was 23 s for each scan, with a total of 2304 MRF time frames. The details of the single-slice MRF have been described previously [15]. The SMS-MRF method adopted the same acquisition pattern as the single-slice MRF, except for the RF pulses: instead of the standard SINC pulse for RF excitation, multiband RF pulses generated from SINC waveforms with phase modulation were used to excite multiple slices simultaneously. The same variable flip angles and golden-angle spiral readouts were applied to acquire data for all slices. The slice thickness was 5 mm and the distance between adjacent slices was 10 mm. Similar to the original MRF framework [1], each MRF time frame was highly undersampled in-plane by acquiring only one spiral arm (reduction factor, 48). Other imaging parameters included a 30 × 30 cm² FOV, a 256 × 256 imaging matrix, flip angles ranging from 5° to 12°, and a constant TR of 7.0 ms. No correction of B1 field inhomogeneities was needed given the relatively low flip angles applied in the MRF acquisitions [18].

All MRF image processing was performed offline on a desktop computer. The k-space MRF data were converted to image space using the non-uniform fast Fourier transform (NUFFT). For the SMS-MRF data, phase demodulation based on the applied MB factor was performed before the NUFFT to extract the MRF time frames for each imaging slice. The MRF dictionary was simulated using the Bloch equations for 13,123 combinations of T1 (60–5000 ms) and T2 (10–500 ms) values. Variable step sizes were used for T1: 10 ms between 60 and 2000 ms, 20 ms between 2000 and 3000 ms, 50 ms between 3000 and 3500 ms, and 500 ms above 3500 ms. Step sizes for T2 were 5 ms between 10 and 200 ms, 10 ms between 200 and 300 ms, and 50 ms above 300 ms [15]. DM was then performed on the MRF signal evolution of each voxel to extract T1 and T2 maps for each imaging slice in both the single-slice MRF and SMS-MRF datasets [1].
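For concreteness, the following is a minimal sketch of this inner-product matching step, assuming complex-valued fingerprints stored as NumPy arrays; the function and variable names are illustrative and not taken from the authors' implementation.

```python
import numpy as np

def dictionary_match(fingerprints, dictionary, t1_t2_lut):
    """fingerprints: (n_voxels, T) complex signal evolutions.
    dictionary: (n_entries, T) simulated fingerprints (e.g., 13,123 T1/T2 pairs).
    t1_t2_lut: (n_entries, 2) T1/T2 values for each dictionary entry.
    Returns the per-voxel (T1, T2) of the most correlated dictionary entry."""
    # l2-normalize along time so the inner product acts as a correlation measure
    d = dictionary / np.linalg.norm(dictionary, axis=1, keepdims=True)
    f = fingerprints / np.linalg.norm(fingerprints, axis=1, keepdims=True)
    # |<f, d>| for every voxel/entry pair; pick the best-matching entry
    corr = np.abs(f @ d.conj().T)            # (n_voxels, n_entries)
    best = np.argmax(corr, axis=1)
    return t1_t2_lut[best]                   # (n_voxels, 2) -> T1 and T2 maps
```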

2.2 |. Network Architecture

In this study, we propose to replace DM with a new convolutional neural network to 1) achieve high-quality T1 and T2 maps for SMS-MRF with MB factors of 3 and 4 and 2) further accelerate the SMS-MRF acquisition with a reduction factor of 4 along the temporal dimension. Given a slice in the SMS-MRF dataset, which consists of a series of T MRF time frames I ∈ ℂ^(X×Y×T) (or equivalently I ∈ ℝ^(X×Y×2T), with the real and imaginary parts of the signals concatenated) and a reference tissue property map 𝜃 ∈ ℝ^(X×Y), our goal is to learn a mapping from MRF time frames (I) to tissue property maps (𝜃) via patches P ∈ ℂ^(M×N×T) (or equivalently P ∈ ℝ^(M×N×2T)) that take spatiotemporal information into account. Here, M and N denote the spatial size of the patch P used in the neural network. As shown in Figure 1, our method consists of two networks that are trained end-to-end: (1) a signal compression network and (2) a spatiotemporal U-Net (STUN) for tissue property mapping.

FIGURE 1 | SMS-MRF reconstruction framework consisting of the signal compression network and the spatiotemporal U-Net (STUN). Spatiotemporal convolution was implemented using a 2D spatial convolution followed by a 1D temporal convolution.

  1. Signal Compression Network. Since MRF data are typically high-dimensional with a large number of time frames, we compress the data temporally into a lower dimensionality, retaining only information that is essential for tissue quantification [15]. This is achieved via a micro-network preceding the spatiotemporal U-Net. Unlike singular value decomposition [19–21], the signal compression network is optimized in tandem with the quantification network for data-driven nonlinear dimensionality reduction.

The micro-network for signal compression (Figure 1) is implemented with four convolutional layers with kernels of size 1 × 1, each followed by batch normalization and rectified linear unit (ReLU) activation, to compress the signal vector of length 2T at each voxel to a lower-dimensional feature vector of length D < 2T. Such 1 × 1 convolutional layers can be viewed as cross-channel parametric pooling, where each layer performs a weighted linear combination of the input feature channels, thereby increasing or reducing the dimensionality of the inputs. Processing all voxels in an image patch P ∈ ℝ^(M×N×2T) via the micro-network yields a lower-dimensional feature map F ∈ ℝ^(M×N×D), which is used as input to the subsequent tissue quantification network.
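A minimal PyTorch sketch of such a micro-network is shown below. The 1 × 1 kernels, batch normalization, and ReLU activations follow the description above; the intermediate channel widths are assumptions chosen for illustration only.

```python
import torch.nn as nn

class SignalCompression(nn.Module):
    """Four 1x1 convolutional layers that compress the 2T-channel signal at
    each voxel to D features. Defaults assume 2T = 2 x 576 and D = 32, per the
    text; the intermediate widths (512, 256, 128) are illustrative."""
    def __init__(self, in_channels=2 * 576, d=32, widths=(512, 256, 128)):
        super().__init__()
        chans = [in_channels, *widths, d]
        layers = []
        for c_in, c_out in zip(chans[:-1], chans[1:]):
            layers += [nn.Conv2d(c_in, c_out, kernel_size=1),  # cross-channel pooling
                       nn.BatchNorm2d(c_out),
                       nn.ReLU(inplace=True)]
        self.net = nn.Sequential(*layers)

    def forward(self, patch):
        # patch: (batch, 2T, M, N) with real/imaginary parts concatenated
        return self.net(patch)               # (batch, D, M, N)
```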

  2. Spatiotemporal U-Net (STUN). MRF data are not correlated in the temporal dimension the same way as in the spatial dimensions. When the temporal dimension is treated as the channel dimension in a 2D convolutional neural network, MRF frames are collapsed into a single channel via summation after spatial convolution, rendering the network ineffective at capturing temporal dynamics. Inspired by Tran et al. [22], we factorize the 3D convolution (s × s × t) into two separate and successive operations: a 2D spatial convolution (s × s × 1), followed by a 1D temporal convolution (1 × 1 × t), where t denotes the temporal extent of the kernel and s denotes the spatial size. This allows signal correlation in both the spatial and temporal dimensions to be employed for tissue property mapping. Compared to a regular 3D convolution, such a factorized (2 + 1)D convolution aligns better with the spatiotemporal nature of MRF data, increases non-linearity due to the additional nonlinear rectification between the two convolutions, and eases optimization, with lower training and testing losses [22].
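The factorization can be implemented directly with two Conv3d layers, as in the following sketch; the batch-normalized ReLU between the two convolutions follows Tran et al. [22], and the channel widths are illustrative assumptions.

```python
import torch.nn as nn

class SpatioTemporalConv(nn.Module):
    """(2+1)D convolution: a 2D spatial convolution (1 x s x s) followed by a
    1D temporal convolution (t x 1 x 1), with a nonlinearity in between.
    Input layout assumed: (batch, channels, time, height, width)."""
    def __init__(self, c_in, c_out, s=3, t=3, c_mid=None):
        super().__init__()
        # Tran et al. [22] size the intermediate width to match the parameter
        # count of a full 3D convolution; we simply default to c_out here.
        c_mid = c_mid or c_out
        self.spatial = nn.Conv3d(c_in, c_mid, kernel_size=(1, s, s),
                                 padding=(0, s // 2, s // 2))
        self.bn = nn.BatchNorm3d(c_mid)
        self.relu = nn.ReLU(inplace=True)
        self.temporal = nn.Conv3d(c_mid, c_out, kernel_size=(t, 1, 1),
                                  padding=(t // 2, 0, 0))

    def forward(self, x):
        return self.temporal(self.relu(self.bn(self.spatial(x))))
```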

Our proposed network, dubbed spatiotemporal U-Net (STUN), consists of an encoding phase, where features are extracted via a series of spatiotemporal convolutional layers and 2× downsampling layers, and a decoding phase that upsamples the low-resolution encoded features to a full-resolution image with transposed convolutions. Global context is enriched with local detail by concatenating features at different scales. As shown in Figure 1, three 2× downsampling (i.e., max-pooling) layers and three upsampling layers are employed in the current study.
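Putting the pieces together, a minimal STUN-style skeleton might look as follows, reusing the SpatioTemporalConv block sketched above. The three max-pooling and three transposed-convolution stages follow the description; treating the compressed feature axis as the "temporal" axis of the 3D volume, the channel widths, and the final averaging over that axis are all assumptions for illustration, not the authors' exact design.

```python
import torch
import torch.nn as nn

class STUN(nn.Module):
    """U-Net with (2+1)D blocks; M and N must be divisible by 8 (three 2x
    spatial poolings). Outputs per-voxel estimates (e.g., T1 and T2)."""
    def __init__(self, widths=(16, 32, 64, 128), n_out=2):
        super().__init__()
        w0, w1, w2, w3 = widths
        self.enc1 = SpatioTemporalConv(1, w0)
        self.enc2 = SpatioTemporalConv(w0, w1)
        self.enc3 = SpatioTemporalConv(w1, w2)
        self.bottleneck = SpatioTemporalConv(w2, w3)
        self.pool = nn.MaxPool3d(kernel_size=(1, 2, 2))  # 2x spatial downsampling
        self.up3 = nn.ConvTranspose3d(w3, w2, kernel_size=(1, 2, 2), stride=(1, 2, 2))
        self.dec3 = SpatioTemporalConv(2 * w2, w2)       # 2x channels after skip concat
        self.up2 = nn.ConvTranspose3d(w2, w1, kernel_size=(1, 2, 2), stride=(1, 2, 2))
        self.dec2 = SpatioTemporalConv(2 * w1, w1)
        self.up1 = nn.ConvTranspose3d(w1, w0, kernel_size=(1, 2, 2), stride=(1, 2, 2))
        self.dec1 = SpatioTemporalConv(2 * w0, w0)
        self.head = nn.Conv3d(w0, n_out, kernel_size=1)  # per-voxel mapping

    def forward(self, f):
        # f: (batch, D, M, N) compressed features; insert a singleton channel
        # so the compressed feature axis plays the role of "time" for Conv3d
        x = f.unsqueeze(1)                               # (B, 1, D, M, N)
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        e3 = self.enc3(self.pool(e2))
        b = self.bottleneck(self.pool(e3))
        d3 = self.dec3(torch.cat([self.up3(b), e3], dim=1))
        d2 = self.dec2(torch.cat([self.up2(d3), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1).mean(dim=2)                 # collapse temporal axis -> (B, n_out, M, N)
```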

2.3 |. Network Training and Evaluation

The STUN network was trained using SMS-MRF data from in vivo scans. One key factor for achieving accurate and robust tissue property maps is ensuring high-quality reference T1 and T2 maps for training. For single-slice MRF applications, this is mostly achieved using DM with a sufficient number of acquired MRF time frames [15, 16]. For SMS-MRF, however, residual artifacts were still noticeable, especially at high MB factors, even with ~2300 time frames (supplemental material, Figure S1). To address this, two training strategies were evaluated in this study, focusing on brain imaging. In the first approach, T1 and T2 maps acquired using single-slice MRF at the same imaging slices as the SMS-MRF were employed as references, and the acquired SMS-MRF images from the corresponding positions were used as the network input (Figure 2). One limitation of this approach is potential misalignment caused by head motion between the single-slice and SMS-MRF scans. To address this limitation, all datasets with noticeable head displacement between the input (from SMS-MRF acquisitions) and the reference (from single-slice MRF) were removed from the training process, corresponding to ~40% of all acquired data. In the second approach, data from the single-slice MRF acquisitions were used to simulate the training dataset based on the SMS-MRF sampling scheme (Figure 3). Specifically, the same phase modulation used to generate the multiband RF pulses was applied to the reconstructed single-slice MRF data to simulate the SMS-MRF dataset, which served as the network input (Figure 3). The reference T1 and T2 maps were obtained via dictionary matching from the corresponding single-slice dataset, using all 2304 time points to ensure high-quality maps. Compared to the first approach, the network input data were inherently aligned with the corresponding reference maps, eliminating any concern about relative motion between them. In addition, by shuffling the order of the single-slice MRF data in the simulation, a larger training dataset can be generated than with the first approach, which is limited by the number of acquired SMS-MRF datasets. The numbers of slices in the training datasets for both approaches are listed in the supplemental material, Table S1. For each MB factor, only one network was developed and applied to all imaging slices obtained in the acquisition. The trained network was validated using the acquired SMS-MRF dataset. T1 and T2 maps obtained using single-slice MRF from the corresponding imaging positions were used as references to calculate the normalized root-mean-square error (NRMSE) and structural similarity index (SSIM). Here, NRMSE is defined as the root-mean-square error normalized by the range of the observed data. For both training approaches, leave-one-out cross-validation was applied, and the mean and standard deviation across all seven subjects were reported to evaluate network performance.
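As a concrete reference, a minimal sketch of the NRMSE definition used here (a sketch only; masking of background voxels, as typically applied in practice, is omitted):

```python
import numpy as np

def nrmse(estimate, reference):
    """Root-mean-square error normalized by the range of the reference map,
    matching the definition stated in the text."""
    rmse = np.sqrt(np.mean((estimate - reference) ** 2))
    return rmse / (reference.max() - reference.min())
```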

FIGURE 2 | Network training using acquired SMS-MRF images as input and corresponding T1 and T2 maps from single-slice MRF acquired at the same slice locations as the reference.

FIGURE 3 | Network training using the simulated SMS-MRF data as input. The same phase modulation used to generate the multiband RF pulses was applied to the reconstructed single-slice MRF data to simulate the SMS-MRF dataset.
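A minimal sketch of this simulation step is given below, assuming complex-valued single-slice images stacked per slice group. The specific per-slice phase schedule shown is a generic linear example for illustration, not the authors' exact multiband modulation pattern.

```python
import numpy as np

def simulate_sms(single_slice_images, mb_factor):
    """Modulate and sum single-slice MRF images to emulate one SMS slice group.
    single_slice_images: (mb_factor, T, X, Y) complex images from separate
    single-slice acquisitions at the SMS slice positions."""
    _, n_frames, nx, ny = single_slice_images.shape
    sms = np.zeros((n_frames, nx, ny), dtype=complex)
    for m in range(mb_factor):
        # illustrative per-slice phase schedule, e.g., phi(m, t) = 2*pi*m*t/MB
        phase = np.exp(1j * 2 * np.pi * m * np.arange(n_frames) / mb_factor)
        sms += single_slice_images[m] * phase[:, None, None]
    # Demodulating this aliased series by conj(phase) of slice m yields the
    # per-slice network input, mirroring the processing in Figure 3.
    return sms
```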

Besides acceleration along the slice-encoding direction via SMS imaging, further acceleration along the temporal dimension was also evaluated by using only the first 576 time points from each imaging slice as the network input, representing a four-fold acceleration along the temporal dimension. During training, the signal time courses of the first 576 time points were compressed to D = 32 features through the signal compression network (Figure 1). The signal compression network and STUN were optimized with the Adam solver with a fixed learning rate of 0.0002. The relative difference between the reference and the network output was used as the loss function, as described in the literature [15, 16]. Network weights were initialized from a Gaussian distribution with zero mean and a standard deviation of 0.02. The networks were trained for a total of 100 epochs. The proposed method was implemented in Python with the PyTorch library and trained on an NVIDIA Tesla V100 GPU (32 GB).
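A minimal training-loop sketch under these settings is shown below, reusing the SignalCompression and STUN sketches above. The relative-difference loss is our reading of the loss described in [15, 16], with a small epsilon added for numerical stability, and train_loader is a hypothetical DataLoader yielding (input patches, reference maps) pairs.

```python
import torch
import torch.nn as nn

def relative_difference_loss(pred, ref, eps=1e-6):
    # relative difference between network output and reference maps
    return torch.mean(torch.abs(pred - ref) / (torch.abs(ref) + eps))

# end-to-end model, as in Figure 1 (compression network followed by STUN)
model = nn.Sequential(SignalCompression(), STUN())

# Gaussian weight initialization (zero mean, std 0.02), per the text
for m in model.modules():
    if isinstance(m, (nn.Conv2d, nn.Conv3d, nn.ConvTranspose3d)):
        nn.init.normal_(m.weight, mean=0.0, std=0.02)

optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)  # fixed lr of 0.0002

for epoch in range(100):                     # 100 epochs, per the text
    for patches, ref_maps in train_loader:   # hypothetical DataLoader
        optimizer.zero_grad()
        loss = relative_difference_loss(model(patches), ref_maps)
        loss.backward()
        optimizer.step()
```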

The effect of the size of the training dataset was also evaluated in this study. The performance of both the STUN and RCA-UNet networks was examined using the acquired SMS-MRF dataset. For a given testing case, we gradually reduced the number of training subjects from six to one. The NRMSE value obtained using all six training subjects was used as a reference value, and the results obtained with fewer training subjects (n < 6) were weighted by this reference value. The experiment was performed on two testing cases randomly selected from the seven subjects.

3 |. Results

Representative T1 and T2 maps obtained for MB factors of 3 and 4 from two different subjects are shown in Figures 4 and 5, respectively. Only the results from one of the multiple slices acquired in each SMS-MRF acquisition are presented for ease of viewing. Skull stripping was performed manually based on the acquired tissue maps. We compared the performance of the proposed STUN with conventional dictionary matching (DM) [1] and a spatially constrained deep learning based quantification method (RCA-UNet) [16]. For the two deep learning methods, results obtained using both the acquired and the simulated training datasets are included. All results were obtained using the first 576 time points and compared to reference T1 and T2 maps generated using all 2304 time points from the corresponding single-slice MRF acquisitions. The aliasing artifacts observed in the DM maps were largely removed by the proposed deep learning approach. Both deep learning methods (RCA-UNet and STUN) trained with the acquired dataset outperformed their counterparts trained with the simulated dataset. For T2 quantification, STUN demonstrated the best performance in terms of NRMSE and SSIM, outperforming DM and RCA-UNet for both MB factors (3 and 4). For T1 quantification, STUN and RCA-UNet achieved similar performance, both outperforming the standard DM method.

FIGURE 4 | Representative T1 and T2 maps obtained from the STUN network with a MB factor of 3. Results obtained using both the acquired data and the simulated data are presented. Normalized root-mean-square error (NRMSE) and structural similarity index (SSIM) were calculated based on the reference maps.

FIGURE 5 | Representative T1 and T2 maps obtained from the STUN network with a MB factor of 4. Results obtained using both the acquired data and the simulated data are presented.

Table 1 summarizes the NRMSE and SSIM values for all experiments performed at the two MB factors. Compared to the results obtained with the MB factor of 3, a slight increase in NRMSE (0.004–0.006) and decrease in SSIM (0.022–0.025) were noticed for the MB factor of 4 when using the STUN network trained with the acquired dataset.

TABLE 1 |.

Summary of NRMSE and SSIM values for T1 and T2 measurement obtained with both dictionary matching and deep learning methods.

T1

| Metric | MB factor | DM | RCA-UNet (simulated) | STUN (simulated) | RCA-UNet (acquired) | STUN (acquired) |
|---|---|---|---|---|---|---|
| NRMSE | MB3 | 0.089 ± 0.020 | 0.050 ± 0.008 | 0.048 ± 0.006 | 0.048 ± 0.005 | 0.047 ± 0.007 |
| NRMSE | MB4 | 0.105 ± 0.025 | 0.053 ± 0.006 | 0.049 ± 0.005 | 0.048 ± 0.003 | 0.048 ± 0.004 |
| SSIM | MB3 | 0.940 ± 0.017 | 0.967 ± 0.009 | 0.968 ± 0.008 | 0.968 ± 0.006 | 0.969 ± 0.006 |
| SSIM | MB4 | 0.931 ± 0.009 | 0.964 ± 0.006 | 0.965 ± 0.006 | 0.966 ± 0.004 | 0.967 ± 0.004 |

T2

| Metric | MB factor | DM | RCA-UNet (simulated) | STUN (simulated) | RCA-UNet (acquired) | STUN (acquired) |
|---|---|---|---|---|---|---|
| NRMSE | MB3 | 0.193 ± 0.022 | 0.069 ± 0.007 | 0.056 ± 0.007 | 0.055 ± 0.007 | 0.051 ± 0.006 |
| NRMSE | MB4 | 0.272 ± 0.035 | 0.073 ± 0.013 | 0.059 ± 0.009 | 0.058 ± 0.007 | 0.054 ± 0.007 |
| SSIM | MB3 | 0.859 ± 0.018 | 0.943 ± 0.018 | 0.946 ± 0.010 | 0.943 ± 0.008 | 0.947 ± 0.008 |
| SSIM | MB4 | 0.858 ± 0.037 | 0.935 ± 0.011 | 0.942 ± 0.008 | 0.938 ± 0.006 | 0.942 ± 0.005 |

Note: Values with the best score are highlighted with underlines in the original publication.

Representative maps for all slices acquired in one SMS-MRF acquisition (MB factor, 3) using the STUN method and acquired training dataset are shown in Figure 6. The reference T1 and T2 maps, along with the results obtained using DM, are included for the central slice. No visual aliasing artifacts were noted from the simultaneously acquired slices using the proposed method and the findings are consistent with the results shown in Figure 4. Representative maps for all slices acquired in one SMS-MRF acquisition with a MB factor of 4 are shown in Figure 7. The T1 and T2 maps obtained with the proposed STUN method matched well with the reference maps obtained using the single-slice MRF method. With the additional reduction factor of 4 along the temporal direction, the effective acquisition time was 2 s per slice for the MB factor of 3 and 1.5 s per slice for the MB factor of 4. All these results suggest the proposed method has great potential in accelerating volumetric, quantitative brain imaging with SMS-MRF.

FIGURE 6 | SMS-MRF for quantitative neuroimaging (MB factor, 3). High-quality T1 and T2 maps were obtained using the proposed STUN method trained with the acquired dataset, outperforming the pattern matching method (only the results for the central slice are presented in the second column). The reference maps for the central slice were acquired from a separate single-slice 2D MRF scan (24 s/slice).

FIGURE 7 | Representative T1 and T2 maps of four simultaneously acquired slices from one SMS-MRF scan (MB factor, 4; STUN trained with the acquired dataset). The reference maps acquired from the same slice locations using single-slice 2D MRF are also presented, each taking 24 s/slice.

We further examined the effect of training dataset size on the performance of T1 and T2 quantification. Both STUN and RCA-UNet, trained using the acquired training dataset, were evaluated (Figure 8). The results were averaged over two testing subjects randomly selected from the seven subjects. Similar NRMSE values were observed for both network structures when using training data from either five or six subjects. A slight increase in NRMSE was noticed when the number of training subjects was reduced from five to two, and a substantial increase when using data from only one subject. A similar trend was observed for SMS-MRF with MB factors of 3 (Figure 8A) and 4 (Figure 8B).

FIGURE 8 | Effect of the size of the training dataset on the performance of T1 and T2 quantification using the STUN and RCA-UNet networks for MB factors of 3 (A) and 4 (B). Only the results obtained from the acquired data are presented. Two of the seven subjects were randomly selected as testing subjects. For each testing subject, the NRMSE values were weighted by the result obtained with the full training dataset generated from the remaining six subjects. Multiple networks were trained by gradually reducing the training dataset from six subjects to one.

4 |. Discussion

In this study, we introduced a deep neural network utilizing factorized spatiotemporal convolutions to predict high-quality T1 and T2 maps from highly accelerated SMS-MRF acquisitions. Incorporating the proposed (2 + 1)D convolution in a U-Net allows low- to high-level spatiotemporal information to be captured in a computationally efficient manner, providing enriched information for effective tissue property mapping. We compared the proposed STUN network with the RCA-UNet developed for single-slice MRF and the conventional dictionary matching approach. For SMS-MRF with MB factors of 3 and 4, our method achieved the best overall performance among all compared methods.

Decoupled spatiotemporal processing was proposed to accelerate SMS-MRF acquisitions. The rationale for the proposed approach is based on the assumption that MRF data are not correlated similarly along the temporal and spatial dimensions. Unlike a 3D U-Net, where spatiotemporal information is encoded simultaneously, STUN uses 1D temporal convolutions to specifically capture the temporal dynamics of the signal and 2D spatial convolutions to enhance spatial continuity across adjacent voxels with similar tissue properties. It is worth noting that the separable (2 + 1)D convolution [22] used in our work differs from another factorized convolution, the depth-wise separable convolution [23, 24], in terms of both motivation and design. Depth-wise separable convolutions consist of channel-wise spatial kernels followed by point-wise 1 × 1 kernels that aggregate information across all channels; this design is intended to approximate a standard 2D convolution with reduced computational cost. In contrast, the (2 + 1)D convolution consists of a standard 2D convolution and a 1D convolution over the temporal dimension (instead of the channel dimension), aiming to surpass the regular 3D convolution with additional nonlinearity and eased optimization [25]. The performance of the proposed STUN method was compared with RCA-UNet, i.e., a 2D U-Net [16] augmented with residual attention blocks [26]. RCA-UNet has been demonstrated to be more effective than a standard 2D U-Net, achieving higher acceleration factors and better map quality for single-slice MRF by preserving high-frequency information in the estimated tissue property maps. In this study, STUN outperformed RCA-UNet, especially in T2 quantification for SMS-MRF, suggesting the advantage of the additional temporal convolutions in this application.
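To make the distinction concrete, the following PyTorch sketch contrasts the two factorizations with illustrative channel counts (assumptions, not values from either paper):

```python
import torch.nn as nn

# Depth-wise separable convolution: factorizes across *channels*
depthwise_separable = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, padding=1, groups=64),       # per-channel spatial kernel
    nn.Conv2d(64, 128, kernel_size=1),                            # point-wise channel mixing
)

# (2+1)D convolution: factorizes across *space and time*
spatiotemporal_2plus1d = nn.Sequential(
    nn.Conv3d(64, 64, kernel_size=(1, 3, 3), padding=(0, 1, 1)),  # 2D spatial
    nn.ReLU(inplace=True),                                        # extra nonlinearity
    nn.Conv3d(64, 128, kernel_size=(3, 1, 1), padding=(1, 0, 0)), # 1D temporal
)
```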

Non-deep-learning studies have also explored the utility of spatiotemporal information in MRF data for improving reconstruction. For instance, Zhao et al. [8] proposed to exploit the spatiotemporal redundancies of the data to reconstruct low-rank images before matching them with the conventional dictionary matching method. The proposed method differs in that it aims to replace DM, treating spatiotemporal information as cues for building a more robust representation that enhances the pattern-matching process and yields more accurate tissue quantification. In other words, Zhao's method focuses on obtaining higher-quality inputs with fewer errors and artifacts, whereas we aim to improve the robustness of the pattern-matching process itself.

In this study, two different approaches to preparing the training dataset were explored. At high MB factors of 3 and 4, it is challenging to obtain high-quality reference T1 and T2 maps from the SMS-MRF acquisition itself, even with a large number of time frames. One alternative is to use the signal time course from SMS-MRF as the input and the reference maps obtained using single-slice MRF at the same location as the output to train the network. However, this method is sensitive to potential motion and misalignment between the two scans. While the simulated dataset used in this study mitigates this problem, the approach faces its own challenge of precisely simulating the SMS-MRF acquisition. For example, potential cross-talk between simultaneously excited slices was not considered in the simulation, which could lead to discrepancies between the acquired and simulated SMS-MRF datasets and errors in tissue quantification. In this study, both network architectures (STUN and RCA-UNet), when coupled with the simulated training dataset, largely outperformed the DM approach, with slightly inferior performance compared with the results obtained with the acquired training dataset. It is worth noting that for the acquired dataset used in this study, all datasets with visible motion between the MRF images and reference maps had been eliminated before training, largely reducing potential limitations related to motion. In addition, compared to RCA-UNet, substantial improvement was achieved with the proposed STUN method in T2 quantification, especially with the simulated dataset. In the current MRF sequence, multiple acquisition schemes, including inversion recovery modules and variable flip angle patterns, have been used, yielding more robust quantification of T1 than of T2. The fact that STUN can further improve T2 accuracy compared to RCA-UNet suggests that useful information is extracted by the additional temporal convolutions in STUN. Future work will further improve the simulation of the SMS-MRF acquisition for improved tissue characterization. While not crucial for stationary brain imaging, simulating training data from single-slice MRF datasets for SMS-MRF has great potential for imaging moving organs, such as the liver and kidney, where achieving perfect alignment between single-slice MRF and SMS-MRF is technically challenging.

Besides SMS methods, rapid high-resolution volumetric MRF with 3D k-space encoding has been developed and accelerated using advanced imaging techniques. Multiple studies have achieved high spatial resolutions of 1 mm or 0.8 mm for brain imaging [27–31]. However, with highly reduced data sampling in MRF, most of these techniques rely on advanced post-processing methods, including non-Cartesian parallel imaging or subspace reconstruction, to improve map quality, which increases the burden of the already complicated post-processing pipeline of the MRF technique itself and largely restricts their adoption for clinical applications [28–30]. Unwanted head motion is another concern for most of the developed methods, which typically take ~5 min for whole-brain coverage. Advanced imaging and post-processing methods have thus been developed to further improve the motion robustness of these 3D MRF techniques [32, 33]. However, the map quality and resolution vary depending on the magnitude and duration of head motion. The deep-learning-accelerated SMS-MRF method has the potential to address these challenges. Deep learning methods have demonstrated far faster post-processing than dictionary matching and low-rank-based approaches [11, 15]. The short acquisition time (6 s per acquisition; 1.5–2 s per imaging slice) further reduces susceptibility to head motion. The developed method is thus suitable for challenging populations such as pediatric subjects and patients who cannot hold still during MRI scans.

Beyond brain imaging, SMS-MRF has the potential to provide volumetric, quantitative imaging in the abdomen. Due to the challenges associated with respiratory motion, navigator-based approaches have been developed to provide volumetric tissue property mapping in the body [34]. While these methods can be performed during completely free breathing, their efficiency needs to be improved, and the accuracy of the acquired maps sometimes depends on the performance of data binning. At 1.5 s per imaging slice, the proposed SMS-MRF can serve as an alternative to these free-breathing methods. With a moderate slice thickness of 3–5 mm, whole-organ coverage (e.g., of the liver) can be achieved in 2–3 breath-holds, largely improving not only the efficiency but also the robustness of quantitative abdominal imaging.

There are some limitations to the current method. First, while quantitative tissue maps with reasonable quality were obtained, the accuracy of the T1 and T2 values needs to be further improved. Since the ground truths are obtained via dictionary matching using fully sampled single-slice MRF data, part of the quantitative error might be attributed to minor misalignment between the single-slice and multislice MRF datasets. Second, the developed method was trained and evaluated on MRF data acquired from healthy subjects. Its accuracy, particularly when applied to patients with brain abnormalities, needs to be evaluated in future studies. Third, our method, like other deep-learning-based methods, needs large amounts of training data for network learning. While our experiment showed that a sufficient training dataset was used in this study to train both the STUN and RCA-UNet networks, unsupervised deep learning approaches will be explored to mitigate this limitation. Fourth, for each MB factor, we trained one single network and applied it to all simultaneously acquired slices, without leveraging the potential correlation among neighboring slices. Similar to the spatially constrained network, useful information contained across multiple slices could be utilized to further improve the quantification. Developing a network trained specifically for each slice might provide better performance, but at the cost of requiring a larger training dataset. Finally, due to the challenges associated with SMS-MRF demonstrated in early studies, we designed and acquired SMS-MRF data up to a MB factor of 4; even higher MB factors were not explored and will be evaluated in future studies to further enhance the method's capability in accelerating volumetric MRF imaging.

In conclusion, we introduced a promising deep learning based approach for efficient and accurate T1 and T2 quantification from highly accelerated multislice MRF acquisitions. A novel decoupled spatiotemporal anti-aliasing neural network, STUN, was developed to enhance the robustness of the reconstruction despite severe in-plane and through-plane aliasing artifacts in SMS-MRF. Experimental results demonstrate that the proposed method surpasses state-of-the-art methods developed for single-slice MRF, achieving higher MB factors (i.e., MB 4) with superior image quality.


Acknowledgments

This work made use of the High Performance Computing Resource in the Core Facility for Advanced Research Computing at Case Western Reserve University. The code and datasets generated and analyzed during the current study are available from the corresponding author upon reasonable request.

Funding:

This work was supported by Siemens Healthineers and National Institutes of Health (R01CA266702, R01CA282516, R01NS134849, and R01EB035160).

Abbreviations:

DM: dictionary matching
MB: multiband
MRF: magnetic resonance fingerprinting
NRMSE: normalized root-mean-square error
RCA-UNet: residual channel attention U-Net
SMS: simultaneous multislice
SSIM: structural similarity index
STUN: spatiotemporal U-Net

Footnotes

Supporting Information

Additional supporting information can be found online in the Supporting Information section.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

References

1. Ma D, Gulani V, Seiberlich N, et al., "Magnetic Resonance Fingerprinting," Nature 495, no. 7440 (2013): 187–192.
2. Chen Y, Jiang Y, Pahwa S, et al., "MR Fingerprinting for Rapid Quantitative Abdominal Imaging," Radiology 279, no. 1 (2016): 278–286.
3. Gao Y, Chen Y, Ma D, et al., "Preclinical MR Fingerprinting (MRF) at 7 T: Effective Quantitative Imaging for Rodent Disease Models," NMR in Biomedicine 28, no. 3 (2015): 384–394.
4. Ye H, Cauley SF, Gagoski B, et al., "Simultaneous Multislice Magnetic Resonance Fingerprinting (SMS-MRF) With Direct-Spiral Slice-GRAPPA (ds-SG) Reconstruction," Magnetic Resonance in Medicine 77, no. 5 (2017): 1966–1974.
5. Jiang Y, Ma D, Bhat H, et al., "Use of Pattern Recognition for Unaliasing Simultaneously Acquired Slices in Simultaneous MultiSlice Magnetic Resonance Fingerprinting," Magnetic Resonance in Medicine 78 (2017): 1870–1876.
6. Hamilton JI, Jiang Y, Ma D, et al., "Simultaneous Multislice Cardiac Magnetic Resonance Fingerprinting Using Low Rank Reconstruction," NMR in Biomedicine 32, no. 2 (2019): e4041.
7. Ye H, Ma D, Jiang Y, et al., "Accelerating Magnetic Resonance Fingerprinting (MRF) Using t-Blipped Simultaneous Multislice (SMS) Acquisition," Magnetic Resonance in Medicine 75, no. 5 (2016): 2078–2085.
8. Zhao B, Bilgic B, Adalsteinsson E, Griswold MA, and Wald LL, "Simultaneous Multislice Magnetic Resonance Fingerprinting With Low-Rank and Subspace Modeling," 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (2017), 3264–3268.
9. Zhang X, Duchemin Q, Liu K, et al., "Cramér–Rao Bound-Informed Training of Neural Networks for Quantitative MRI," Magnetic Resonance in Medicine 88, no. 1 (2022): 436–448.
10. Hermann I, Martínez-Heras E, Rieger B, et al., "Accelerated White Matter Lesion Analysis Based on Simultaneous T1 and T2* Quantification Using Magnetic Resonance Fingerprinting and Deep Learning," Magnetic Resonance in Medicine 86, no. 1 (2021): 471–486.
11. Cohen O, Zhu B, and Rosen MS, "MR Fingerprinting Deep RecOnstruction NEtwork (DRONE)," Magnetic Resonance in Medicine 80, no. 3 (2018): 885–894.
12. Hoppe E, Korzdorfer G, Nittka M, et al., "Deep Learning for Magnetic Resonance Fingerprinting: Accelerating the Reconstruction of Quantitative Relaxation Maps," 26th Annual Meeting of the International Society for Magnetic Resonance in Medicine (2018), 2791.
13. Oksuz I, Cruz G, Clough J, et al., "Magnetic Resonance Fingerprinting Using Recurrent Neural Networks," IEEE 16th International Symposium on Biomedical Imaging (2019), 1537–1540.
14. Hochreiter S and Schmidhuber J, "Long Short-Term Memory," Neural Computation 9, no. 8 (1997): 1735–1780.
15. Fang Z, Chen Y, Liu M, et al., "Deep Learning for Fast and Spatially-Constrained Tissue Quantification From Highly-Accelerated Data in Magnetic Resonance Fingerprinting," IEEE Transactions on Medical Imaging 38, no. 10 (2019): 2364–2374.
16. Fang Z, Chen Y, Hung SC, Zhang X, Lin W, and Shen D, "Submillimeter MR Fingerprinting Using Deep Learning-Based Tissue Quantification," Magnetic Resonance in Medicine 84 (2020): 579–591.
17. Jiang Y, Ma D, Seiberlich N, Gulani V, and Griswold MA, "MR Fingerprinting Using Fast Imaging With Steady State Precession (FISP) With Spiral Readout," Magnetic Resonance in Medicine 74, no. 6 (2015): 1621–1631.
18. Chen Y, Chen M-H, Baluyot K, Potts T, Jimenez J, and Lin W, "MR Fingerprinting Enables Quantitative Measures of Brain Tissue Relaxation Times and Myelin Water Fraction in Early Brain Development," NeuroImage 186 (2019): 782–793.
19. Mazor G, Weizman L, Tal A, and Eldar YC, "Low-Rank Magnetic Resonance Fingerprinting," Medical Physics 45, no. 9 (2018): 4066–4084.
20. McGivney D, Pierre E, Ma D, et al., "SVD Compression for Magnetic Resonance Fingerprinting in the Time Domain," IEEE Transactions on Medical Imaging 33, no. 12 (2014): 2311–2322.
21. Zhao B, Setsompop K, Adalsteinsson E, et al., "Improved Magnetic Resonance Fingerprinting Reconstruction With Low-Rank and Subspace Modeling," Magnetic Resonance in Medicine 79, no. 2 (2017): 933–942.
22. Tran D, Wang H, Torresani L, Ray J, Lecun Y, and Paluri M, "A Closer Look at Spatiotemporal Convolutions for Action Recognition," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2018), 6450–6459.
23. Chollet F, "Xception: Deep Learning With Depthwise Separable Convolutions," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017), 1251–1258.
24. Howard AG, Zhu M, Chen B, et al., "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications," arXiv (2017): 1704.04861.
25. Tran D, Bourdev L, Fergus R, Torresani L, and Paluri M, "Learning Spatiotemporal Features With 3D Convolutional Networks," Proceedings of the IEEE International Conference on Computer Vision (2015), 4489–4497.
26. Zhang Y, Li K, Li K, Wang L, Zhong B, and Fu Y, "Image Super-Resolution Using Very Deep Residual Channel Attention Networks," European Conference on Computer Vision (2018), 286–301.
27. Adeli E, Meng Y, Li G, Lin W, and Shen D, "Multi-Task Prediction of Infant Cognitive Scores From Longitudinal Incomplete Neuroimaging Data," NeuroImage 185 (2019): 783–792.
28. Liao C, Bilgic B, Manhard MK, et al., "3D MR Fingerprinting With Accelerated Stack-Of-Spirals and Hybrid Sliding-Window and GRAPPA Reconstruction," NeuroImage 162 (2017): 13–22.
29. Chen Y, Fang Z, Hung S-C, Chang W-T, Shen D, and Lin W, "High-Resolution 3D MR Fingerprinting Using Parallel Imaging and Deep Learning," NeuroImage 206 (2020): 116329.
30. Cao X, Liao C, Iyer SS, et al., "Optimized Multi-Axis Spiral Projection MR Fingerprinting With Subspace Reconstruction for Rapid Whole-Brain High-Isotropic-Resolution Quantitative Imaging," Magnetic Resonance in Medicine 88, no. 1 (2022): 133–150.
31. Cao X, Ye H, Liao C, Li Q, He H, and Zhong J, "Fast 3D Brain MR Fingerprinting Based on Multi-Axis Spiral Projection Trajectory," Magnetic Resonance in Medicine 82, no. 1 (2019): 289–301.
32. Kurzawski JW, Cencini M, Peretti L, et al., "Retrospective Rigid Motion Correction of Three-Dimensional Magnetic Resonance Fingerprinting of the Human Brain," Magnetic Resonance in Medicine 84, no. 5 (2020): 2606–2615.
33. Hu S, Chen Y, Zong X, Lin W, Griswold M, and Ma D, "Improving Motion Robustness of 3D MR Fingerprinting With a Fat Navigator," Magnetic Resonance in Medicine 90, no. 5 (2023): 1802–1817.
34. Huang S, Boyacioglu R, Bolding R, Chen Y, and Griswold MA, "Free-Breathing Abdominal Magnetic Resonance Fingerprinting Using a Pilot Tone Navigator," Journal of Magnetic Resonance Imaging 54, no. 4 (2021): 1138–1151.
