Author manuscript; available in PMC: 2024 Apr 1.
Published in final edited form as: Med Image Anal. 2023 Jan 31;85:102762. doi: 10.1016/j.media.2023.102762

Table 4.

A summary of Transformer-based models for medical image reconstruction and enhancement. "N" denotes that model parameters were not reported or not applicable.

| Task | Reference | Architecture | 2D/3D | #Param | Modality | Dataset | ViT as Enc/Inter/Dec | Highlights |
|---|---|---|---|---|---|---|---|---|
| Reconstruction | ReconFormer (Guo et al., 2022d) | Conv-Transformer Hybrid | 2D | 1.414M | MRI | fastMRI (Knoll et al., 2020), HPKS (Jiang et al., 2019a) | No/Yes/No | The Pyramid Transformer Layer (PTL) introduces a locally pyramidal but globally columnar structure. |
| | DSFormer (Zhou et al., 2022b) | Conv-Transformer Hybrid | 2D | 0.18M | MRI | Multi-coil brain data from IXI* | No/Yes/No | The proposed Swin Transformer reconstruction network enables self-supervised reconstruction with a lightweight backbone. |
| | SLATER (Korkmaz et al., 2022) | Conv-Transformer Hybrid | 2D | N | MRI | Single-coil brain data from IXI*, multi-coil brain data from fastMRI (Knoll et al., 2020) | No/Yes/Yes | An unsupervised MRI reconstruction design exploiting the long-range dependency modeling of Transformers. |
| | DuDoCAF (Lyu et al., 2022) | Conv-Transformer Hybrid | 2D | 1.428M | MRI | fastMRI (Knoll et al., 2020), clinical brain MRI dataset | No/Yes/No | The proposed recurrent Transformer blocks capture long-range dependencies from the fused multi-contrast feature maps, boosting target-contrast under-sampled imaging. |
| | SDAUT (Huang et al., 2022a) | Conv-Transformer Hybrid | 2D | N | MRI | Calgary-Campinas dataset (Souza et al., 2018) | No/Yes/No | The proposed U-Net-based Transformer combines dense and sparse deformable attention in separate stages, improving performance and speed while offering explainability. |
| | MIST-net (Pan et al., 2021) | Conv-Transformer Hybrid | 2D | 12.0M | CT | NIH-AAPM-Mayo (McCollough, 2016) | No/Yes/No | Swin Transformer and convolutional layers are combined in the high-definition reconstruction module, achieving high-quality reconstruction. |
| | DuDoTrans (Wang et al., 2021a) | Conv-Transformer Hybrid | 2D | 0.44M | CT | NIH-AAPM-Mayo (McCollough, 2016), COVID-19 | No/Yes/No | The Sinogram Restoration Transformer (SRT) module enhances the projection domain, improving sparse-view CT reconstruction. |
| | FIT (Buchholz and Jug, 2021) | Conventional Transformer | 2D | N | CT | LoDoPaB (Leuschner et al., 2021) | Yes/No/Yes | The carefully designed FDE representations mitigate the computational burden of traditional Transformer structures in the image domain. |
| | RegFormer (Xia et al., 2022a) | Conv-Transformer Hybrid | 2D | N | CT | NIH-AAPM-Mayo (McCollough, 2016) | Yes/Yes/Yes | The unrolled iterative scheme is redesigned with Transformer encoders and decoders to learn nonlocal priors, alleviating sparse-view artifacts. |
| Enhancement | TransCT (Zhang et al., 2021e) | Conv-Transformer Hybrid | 2D | N | CT | NIH-AAPM-Mayo (McCollough, 2016), clinical CBCT images | No/Yes/No | Decomposes low-dose CT (LDCT) into high- and low-frequency parts, then denoises the blurry high-frequency part with a basic Transformer structure. |
| | TED-Net (Wang et al., 2021b) | Conv-like Transformer | 2D | N | CT | NIH-AAPM-Mayo (McCollough, 2016) | Yes/Yes/Yes | The design uses tokenization and detokenization operations in a convolution-free encoder-decoder architecture. |
| | Eformer (Luthra et al., 2021) | Conv-Transformer Hybrid | 2D | N | CT | NIH-AAPM-Mayo (McCollough, 2016) | Yes/Yes/Yes | A residual Transformer that redesigns the residual block in the denoising encoder-decoder architecture with nonoverlapping window-based Multi-head Self-Attention (MSA). |
| | TVSRN (Yu et al., 2022a) | Conv-like Transformer | 3D | 1.73M | CT | RPLHR-CT dataset | Yes/Yes/Yes | An asymmetric encoder-decoder architecture composed of pure Transformers efficiently models context relevance and long-range dependencies in CT volumes. |
| | T2Net (Feng et al., 2021) | Conv-Transformer Hybrid | 2D | N | MRI | Single-coil brain data from IXI*, clinical brain MRI dataset | No/Yes/Yes | A task Transformer module supports multi-task learning of super-resolution and reconstruction; the super-resolution features are enriched with the low-resolution reconstruction features. |
| | WavTrans (Li et al., 2022a) | Conv-Transformer Hybrid | 2D | 2.102M | MRI | fastMRI (Knoll et al., 2020), clinical brain MRI dataset | No/Yes/No | The residual cross-attention Swin Transformer handles cross-modality features and boosts target-contrast MRI super-resolution. |
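Several entries above (DSFormer, MIST-net, Eformer, WavTrans) build on Swin-style blocks whose key cost-saving trick is restricting self-attention to non-overlapping local windows. As a minimal illustration of that mechanism only (not any specific model from the table), the sketch below partitions a feature map into windows and applies single-head softmax attention within each; the identity Q/K/V projections, window size, and function names are illustrative assumptions, and real models add learned projections, multiple heads, relative position bias, and shifted windows.

```python
import numpy as np

def window_partition(x, win):
    """Split an (H, W, C) feature map into non-overlapping (win*win, C) windows."""
    H, W, C = x.shape
    x = x.reshape(H // win, win, W // win, win, C)
    # -> (num_windows, win*win, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, win * win, C)

def window_reverse(windows, win, H, W):
    """Inverse of window_partition: reassemble windows into an (H, W, C) map."""
    C = windows.shape[-1]
    x = windows.reshape(H // win, W // win, win, win, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(H, W, C)

def window_self_attention(x, win=4):
    """Toy single-head self-attention restricted to non-overlapping windows."""
    H, W, C = x.shape
    wins = window_partition(x, win)                 # (nW, win*win, C)
    q = k = v = wins                                # identity projections (toy)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(C)  # (nW, win*win, win*win)
    scores -= scores.max(axis=-1, keepdims=True)    # numerically stable softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)
    out = attn @ v                                  # weighted sum of values
    return window_reverse(out, win, H, W)

x = np.random.rand(8, 8, 16).astype(np.float32)
y = window_self_attention(x, win=4)
```

The point the Highlights column makes repeatedly is visible in the shapes: each window attends over only win² = 16 tokens instead of all H·W = 64, so the attention cost scales with (win²)² per window rather than (H·W)² for the whole map.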