Abstract
High spatial and temporal resolution across the whole brain is essential to accurately resolve neural activities in fMRI. Therefore, accelerated imaging techniques target improved coverage with high spatio-temporal resolution. Simultaneous multi-slice (SMS) imaging combined with in-plane acceleration are used in large studies that involve ultrahigh field fMRI, such as the Human Connectome Project. However, for even higher acceleration rates, these methods cannot be reliably utilized due to aliasing and noise artifacts. Deep learning (DL) reconstruction techniques have recently gained substantial interest for improving highly-accelerated MRI. Supervised learning of DL reconstructions generally requires fully-sampled training datasets, which is not available for high-resolution fMRI studies. To tackle this challenge, self-supervised learning has been proposed for training of DL reconstruction with only undersampled datasets, showing similar performance to supervised learning. In this study, we utilize a self-supervised physics-guided DL reconstruction on a 5-fold SMS and 4-fold in-plane accelerated 7T fMRI data. Our results show that our self-supervised DL reconstruction produce high-quality images at this 20-fold acceleration, substantially improving on existing methods, while showing similar functional precision and temporal effects in the subsequent analysis compared to a standard 10-fold accelerated acquisition.
I. INTRODUCTION
Functional MRI (fMRI) has been crucial in expanding our understanding of human perception and cognition [1]. Even though significant progress has been made in fMRI acquisition and reconstruction [2], additional improvements for coverage and resolution are imperative to contribute to our broader understanding of the brain function.
As fMRI acquisitions have become a standard part of large-scale studies, such as the Human Connectome Project (HCP) [3], several accelerated MRI methods have been utilized for improved resolution and coverage. Simultaneous multi-slice (SMS) imaging is commonly used in functional neuroimaging with its ability for rapid high-resolution with whole brain coverage [4], [5]. SMS imaging is typically further combined with in-plane acceleration for improving the resolution of fMRI at ultrahigh field strengths, while maintaining a reasonable echo time [2]. These accelerated acquisitions are then reconstructed using linear SMS and parallel imaging reconstruction [4]–[6]. However such reconstructions suffer from spatially varying noise amplification [7], [8], and may suffer from inter and intra-slice residual aliasing artifacts [9] which will further be exacerbated when operating at higher acceleration rates.
Recently, deep learning (DL) methods have gained immense interest as an alternative to improve accelerated MRI [10]–[13]. In particular, physics-guided DL methods have emerged as a powerful strategy due to their robustness and improved reconstruction quality [11]–[14], showing promising results at higher acceleration rates where conventional methods struggle to maintain high quality reconstruction. In general, DL reconstruction methods are trained in a supervised manner, using fully-sampled data as reference [10]–[13]. However, such reference data is not available in highly-accelerated fMRI acquisitions. Recently, self-supervised learning has been proposed for training physics-guided DL reconstruction using only undersampled data, showing similar performance to supervised learning [15].
In this work, we adapt a recent self-supervised DL [15] strategy to reconstruct an SMS-accelerated HCP-style retinotopic mapping acquisition and show that the results of subsequent fMRI analysis are unaltered by the use of such a regularized non-linear DL reconstruction. In doing so, we establish the feasibility of reconstructing 20-fold accelerated 7T fMRI data using self-supervised DL reconstruction. Results show that the proposed DL reconstruction substantially improves upon conventional methods at the 20-fold acceleration rate, while maintaining comparable signal fidelity to conventional methods at standard 10-fold acceleration for fMRI analysis.
II. METHODS
A. Self-supervised DL Reconstruction for SMS fMRI
The forward model for SMS acquisition is given as:
(1) |
where is the acquired SMS k-space, Ω is the in-plane undersampling pattern, S is number of simultaneously excited slices, is the ith slice image, is the multi-coil encoding operator of the ith slice, and n is measurement noise. The notation can be condensed by letting xSMS be the concatenation of the individual slices along the readout direction [4], [16], and , which yields
(2) |
For high acceleration rates, this system is generally ill-conditioned. Thus, a regularized least squares formulation may be used for the inverse problem:
(3) |
where the first term enforces data consistency (DC) with the acquired measurements and is a regularizer. This least squares problem is typically solved by using iterative techniques that decouples the DC term and the regularizer into a series of sub-problems [17]. One such solution is variable splitting with quadratic penalty [13], [15]:
(4) |
(5) |
where is the reconstructed slices at iteration l, while is an auxiliary variable and μ is the penalty parameter. This iterative algorithm is unrolled for a fixed number of iterations in physics-guided DL reconstruction [10], and sub-problem (4) is solved implicitly using neural networks, while (5) is solved by conjugate gradient [13]. The network output for a given test dataset can be written as , where θ are the learnable parameters of the unrolled network.
Although physics-guided neural networks are conventionally trained in a supervised manner, the lack of reference training data for high-resolution fMRI studies prevents this strategy. However, a recent work entitled Self-Supervised learning via Data Undersampling (SSDU) has shown similar performance to supervised learning by splitting the acquired locations, Ω, into two disjoint sets, Θ and Ω, where Θ is used in the DC units of the unrolled network and Λ is used to define the k-space loss [15]. A multi-mask version of SSDU has also been proposed to improve reconstruction quality at very high acceleration rates by utilizing these disjoints sets multiple times [18], [19]. In this study, multi-mask SSDU is used with the following loss function:
(6) |
where N is the number of training data in the database, K is the number of multi-maks (Ω = Θk ∪Λk, k ∈ 1, …, K), is the k-space loss between the unseen acquired points and the network output and the network is parametrized by θ. The kth training and loss masks used on the nth training data sample are represented as Λk, n and Θk, n, respectively. A schematic of the implementation is shown in Figure 1.
B. Imaging Experiments
Imaging was performed on a 7T Siemens Magnex Scientific (Siemens Healthineers, Erlangen, Germany) system with a 32-channel receiver head coil-array. 8 subjects were imaged using the HCP-retinotopy protocol detailed in [20]. The study was approved by our institutional review board, and written informed consent was obtained before each examination.
For each subject, six 5-minute acquisitions were obtained with the protocol in [20]. In this study, the first 2 experimental runs, which are the rotating wedge (RETCCW and RETCW) retinotopic mapping paradigms, were utilized. RETCCW was collected with an anterior to posterior phase encoding direction, while RETCW was collected with a posterior to anterior phase encoding direction. The following imaging parameters were used similar to the 7T HCP fMRI protocol [2]: SMS factor = 5, in-plane acceleration = 2, resolution: 1.6mm isotropic, TR = 1s and whole-brain coverage. In both runs, the subjects viewed wedges which completed a rotation every 32 seconds (0.3125 Hz). These data were used as the retinotopic representation within the visual cortex.
The standard acquisition at 5-fold SMS and 2-fold in-plane acceleration (overall 10-fold acceleration), reconstructed with conventional split-slice GRAPPA (SPSG) [6] was used as the baseline for image quality and fMRI analyses. These acquired k-space data were further retrospectively undersampled to 5-fold SMS and 4-fold in-plane acceleration (overall 20-fold acceleration), which were then reconstructed using the proposed self-supervised DL reconstruction, as well as conventional SPSG. In the following, SMS5×R2 (as 10-fold acceleration) notation is used for the standard acquisition, while SMS5×R4 (as 20-fold acceleration) notation is used for the retrospectively sub-sampled datasets.
C. Functional Processing and Retinotopic Analyses
The reconstructions were each minimally processed identically using afni_proc.py, a tool available with AFNI [21]. To account for differences from the different phase encoding directions, distortion correction was performed. Subsequently, motion correction was performed using the distortion-corrected median image as a registration target. These distortion- and motion-corrected data were then scaled such that each voxel had a mean of 100. The processed data were then imported into MATLAB. The clockwise wedge data were time-reversed and both datasets were shifted to align the hemodynamic responses for identical wedge positions. Next, 9 nuisance regressors were projected out from each dataset. These were polynomials up to order 3 and the 6 (rigid-body) motion estimates from the motion-correction step. The mean of the two datasets were calculated and an FFT was performed to determine the amplitude and phase at 0.3125Hz. In order to threshold the data, the coherence between the two runs was also calculated using the MATLAB tool mscohere. The phase maps were compared for each reconstruction using a coherence threshold of 0.4.
A region of interest was then created from the baseline SMS5×R2 data reconstructed with SPSG using this threshold. A number of metrics were calculated for this region. This included the average absolute phase error, relative to the original SMS5×R2 SPSG reconstruction, of the deep learning and SPSG reconstructions of the SMS5×R4 accelerated data. Additionally, temporal signal to noise ratio (tSNR) was calculated for each reconstruction, estimated as the mean of the signal after processing divided by the standard deviation of the residual following the removal of nuisance regressors. Circular statistics were handled using CircStat [22].
D. Implementation Details
Multi-mask SSDU [15], [18], [19] was used with K = 6 masks to train a physics-guided DL reconstruction network. The sub-problems in (4) and (5) were unrolled for 10 iterations. Each iteration contains a DC unit and a regularizer. The former was solved by 10 conjugate gradient steps [13], while the latter was solved by a convolutional neural network that utilizes a ResNet structure [23]. ESPIRiT was used to generate sensitivity maps from low-resolution calibration scans [24]. The network was trained using Adam optimizer with an ℓ1-ℓ2 loss in k-space [15] with a learning rate of 3 · 10−4 over 80 epochs. The network had a total of 592,129 trainable parameters.
For SMS imaging, to avoid boundary artifacts during regularization, the CAIPI [5] shifts were removed to position each slice to the middle of the FOV before the regularizer ResNet unit, and re-arranged back to the acquired CAIPI shifts before the DC units. The ResNet unit operated on a 2-channel input (real and imaginary), and the SMS = 5 slices were concatenated along the readout direction, as detailed in Section II-A.
Training was performed on 17 slab groups from 4 subjects. Only one time-frame per subject was used. Therefore no temporal correlations/redundancies were exploited by the DL reconstruction. Testing was performed on the whole volume and whole time series of 4 subjects unseen by the network. All training was performed using TensorFlow in Python.
III. RESULTS
SPSG reconstruction at SMS5×R2 is shown in Figure 2 (first column) as the baseline image quality. For the SMS5×R4 accelerated data reconstructed with both SPSG and self-supervised DL, visible noise reduction is observed with the proposed self-supervised DL reconstruction. Compared to SPSG reconstruction at 20-fold acceleration, self-supervised DL removes residual aliasing artifacts leading to a closer match to the reference image quality. Substantial improvement is observed in thalamus region (third slice) with the proposed DL method.
Fig. 3 depicts tSNR maps of axial, sagittal and coronal slices of a subject reconstructed with SPSG and self-supervised DL reconstructions prior to any processing for distortion or motion. The first column depicts the baseline 10-fold accelerated acquisition using SPSG reconstruction. Self-supervised DL reconstruction substantially improves upon SPSG at SMS5×R4 acceleration showing good fit to reference tSNR maps. Mean and standard deviation of the tSNR values in a 10 × 10 × 10 voxel (gray boxes) agree with these visual observations. 10-fold acceleration has the highest tSNR value (18.99±0.32) while SPSG at SMS5×R4 acceleration has the lowest tSNR value (8.94±0.25). Self-supervised DL reconstruction at SMS5×R4 acceleration leads to a substantially improved tSNR value (17.38±0.33) close to baseline tSNR of the 10-fold accelerated acquisition.
Following distortion and motion correction, the average of the absolute difference in phase estimates across subjects was 0.36±0.02 radians for the SMS5×R4 self-supervised DL and 0.46±0.40 radians for the SMS5×R4 SPSG reconstructions, within the ROI derived from the original SMS5×R2 SPSG reconstruction. In this setting, the average tSNR within the ROI was 32.5±3.2 for SMS5×R2 SPSG, 22.5±2.0 for SMS5×R4 self-supervised DL and 15.1±1.5 for the SMS5×R4 SPSG reconstructions.
IV. DISCUSSION
We proposed a self-supervised physics-guided deep learning reconstruction for highly-accelerated HCP-style fMRI. A retrospective study with SMS5×R4 acceleration was evaluated both for image reconstruction and subsequent fMRI analysis. Proposed self-supervised DL reconstruction showed improved image quality compared to existing SPSG reconstruction. tSNR improvement is observed with self-supervised DL reconstruction compared to SPSG. Furthermore, fMRI analysis shows that self-supervised DL reconstruction has higher coherence than the SPSG reconstruction in the visual cortex similar to standard acquisition.
For the 20-fold accelerated data, the self-supervised DL reconstruction strongly outperforms the SPSG reconstruction across multiple metrics, including tSNR and absolute difference in phase estimates. For the latter, substantial improvement in standard deviation is observed in addition to a 20.40±0.96% improvement in the mean, highlighting less extreme errors in phase estimations. This is also apparent from the images themselves, which possess visible artifacts for SMS5×R4 SPSG reconstruction. With the DL reconstruction, these artifacts are no longer visible, and the image structure resembles the original SMS5×R2 data. The self-supervised DL functional data are also associated with higher tSNR compared to the SMS5×R4 SPSG reconstruction.
While the SMS5×R4 DL reconstruction do not fully reproduce the SMS5×R2 case, the phase maps in Figure 4 suggest that much of the important temporal information is retained across the two runs. The activation patterns are robust, and retinotopic mapping is preserved, leading to highly correlated phase values and low error in phase estimation.
V. Conclusion
The proposed self-supervised DL reconstruction reduced residual artifacts and improved the tSNR compared to existing reconstruction technique at 20-fold accelerated fMRI acquisition. The fMRI analysis led to similar results compared to standard acquisition, showing that self-supervised DL reconstruction did not alter the temporal effects or functional spatial precision.
ACKNOWLEDGMENTS
First two authors contributed equally to this work. Grant support: NIH, Grant numbers: P30NS076408, R01HL153146, U01EB025144, P41EB027061; NSF, Grant number: CAREER CCF-1651825.
References
- [1].Logothetis NK, “What we can do and what we cannot do with fMRI,” Nature, vol. 453, no. 7197, pp. 869–878, 2008. [DOI] [PubMed] [Google Scholar]
- [2].Uğurbil K, Xu J, et al. , “Pushing spatial and temporal resolution for functional and diffusion MRI in the Human Connectome Project,” Neuroimage, vol. 80, pp. 80–104, 2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [3].Van Essen DC, Smith SM, et al. , “The WU-Minn Human Connectome Project: an overview,” Neuroimage, vol. 80, pp. 62–79, 2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [4].Moeller S, Yacoub E, et al. , “Multiband multislice GE-EPI at 7 tesla, with 16-fold acceleration using partial parallel imaging with application to high spatial and temporal whole-brain fMRI,” Magn Reson Med, vol. 63, no. 5, pp. 1144–1153, 2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [5].Setsompop K, Gagoski BA, et al. , “Blipped-controlled aliasing in parallel imaging for simultaneous multislice echo planar imaging with reduced g-factor penalty,” Magn Reson Med, vol. 67, no. 5, pp. 1210–1224, 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [6].Cauley SF, Polimeni JR, Bhat H, Wald LL, and Setsompop K, “Interslice leakage artifact reduction technique for simultaneous multislice acquisitions,” Magn Reson Med, vol. 72, no. 1, pp. 93–102, 2014. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [7].Hamilton J, Franson D, and Seiberlich N, “Recent advances in parallel imaging for MRI,” Prog NMR Spec, vol. 101, pp. 71–95, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [8].Barth M, Breuer F, Koopmans PJ, Norris DG, and Poser BA, “Simultaneous multislice (SMS) imaging techniques,” Magn Reson Med, vol. 75, no. 1, pp. 63–81, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [9].Todd N, Moeller S, et al. , “Evaluation of 2D multiband EPI imaging for high-resolution, whole-brain, task-based fMRI studies at 3T: Sensitivity and slice leakage artifacts,” Neuroimage, vol. 124, pp. 32–42, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [10].Knoll F, Hammernik K, et al. , “Deep-learning methods for parallel magnetic resonance imaging reconstruction: A survey of the current approaches, trends, and issues,” IEEE Sig Proc Mag, vol. 37, no. 1, pp. 128–140, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [11].Hammernik K, Klatzer T, et al. , “Learning a variational network for reconstruction of accelerated MRI data,” Magn Reson Med, vol. 79, no. 6, pp. 3055–3071, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [12].Schlemper J, Caballero J, Hajnal JV, Price AN, and Rueckert D, “A deep cascade of convolutional neural networks for dynamic MR image reconstruction,” IEEE Trans Med Imaging, vol. 37, no. 2, pp. 491–503, 2017. [DOI] [PubMed] [Google Scholar]
- [13].Aggarwal HK, Mani MP, and Jacob M, “MoDL: Model-based deep learning architecture for inverse problems,” IEEE Trans Med Imaging, vol. 38, no. 2, pp. 394–405, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [14].Hosseini SAH, Yaman B, Moeller S, Hong M, and Akçakaya M, “Dense recurrent neural networks for accelerated MRI: History-cognizant unrolling of optimization algorithms,” IEEE J Sel Top Signal Process, vol. 14, no. 6, pp. 1280–1291, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [15].Yaman B, Hosseini SAH, et al. , “Self-supervised learning of physics-guided reconstruction neural networks without fully sampled reference data,” Magn Reson Med, vol. 84, no. 6, pp. 3172–3191, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [16].Demirel OB, Weingärtner S, Moeller S, and Akçakaya M, “Im-¨ proved simultaneous multislice cardiac MRI using readout concatenated k-space SPIRiT (ROCK-SPIRiT),” Magn Reson Med, vol. 85, no. 6, pp. 3036–3048, 2021. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [17].Fessler JA, “Optimization methods for magnetic resonance image reconstruction: Key models and optimization algorithms,” IEEE Sig Proc Mag, vol. 37, no. 1, pp. 33–40, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [18].Yaman B, Hosseini SAH, et al. , “Multi-mask self-supervised learning for physics-guided neural networks in highly accelerated MRI,” arXiv preprint arXiv:2008.06029, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [19].Yaman B, Hosseini SAH, et al. , “Ground-truth free multi-mask self-supervised physics-guided deep learning in highly accelerated MRI,” in Proc. ISBI IEEE, 2021. [Google Scholar]
- [20].Benson NC, Jamison KW, et al. , “The Human Connectome Project 7 Tesla retinotopy dataset: Description and population receptive field analysis,” Journal of Vision, vol. 18, no. 13, pp. 23–23, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [21].Cox RW, “AFNI: software for analysis and visualization of functional magnetic resonance neuroimages,” Computers and Biomedical Research, vol. 29, no. 3, pp. 162–173, 1996. [DOI] [PubMed] [Google Scholar]
- [22].Berens P et al. , “CircStat: a MATLAB toolbox for circular statistics,” J Stat Softw, vol. 31, no. 10, pp. 1–21, 2009. [Google Scholar]
- [23].Timofte R, Agustsson E, Van Gool L, Yang M-H, and Zhang L, “NTIRE 2017 challenge on single image super-resolution: Methods and results,” in Proc IEEE CVPR, 2017, pp. 114–125. [Google Scholar]
- [24].Uecker M, Lai P, et al. , “ESPIRiT—an eigenvalue approach to autocalibrating parallel MRI: where SENSE meets GRAPPA,” Magn Reson Med, vol. 71, no. 3, pp. 990–1001, 2014. [DOI] [PMC free article] [PubMed] [Google Scholar]