Table 4.
A summarized review of Transformer-based models for medical image reconstruction and enhancement. "N" denotes that the number of model parameters is not reported or not applicable.
Task | Reference | Architecture | 2D/3D | #Param | Modality | Dataset | ViT as Enc/Inter/Dec | Highlights
---|---|---|---|---|---|---|---|---
Reconstruction | ReconFormer (Guo et al., 2022d) | Conv-Transformer Hybrid | 2D | 1.414M | MRI | fastMRI (Knoll et al., 2020), HPKS (Jiang et al., 2019a) | No/Yes/No | The Pyramid Transformer Layer (PTL) introduces a locally pyramidal but globally columnar structure.
 | DSFormer (Zhou et al., 2022b) | Conv-Transformer Hybrid | 2D | 0.18M | MRI | Multi-coil Brain Data from IXI* | No/Yes/No | The proposed Swin Transformer Reconstruction Network enables self-supervised reconstruction with a lightweight backbone.
 | SLATER (Korkmaz et al., 2022) | Conv-Transformer Hybrid | 2D | N | MRI | Single-coil Brain Data from IXI*, Multi-coil Brain Data from fastMRI (Knoll et al., 2020) | No/Yes/Yes | An unsupervised MRI reconstruction design that exploits the long-range dependency modeling of Transformers.
 | DuDoCAF (Lyu et al., 2022) | Conv-Transformer Hybrid | 2D | 1.428M | MRI | fastMRI (Knoll et al., 2020), Clinical Brain MRI Dataset | No/Yes/No | Recurrent blocks with Transformers capture long-range dependencies from the fused multi-contrast feature maps, boosting target-contrast under-sampled imaging.
 | SDAUT (Huang et al., 2022a) | Conv-Transformer Hybrid | 2D | N | MRI | Calgary-Campinas dataset (Souza et al., 2018) | No/Yes/No | The proposed U-Net-based Transformer combines dense and sparse deformable attention in separate stages, improving performance and speed while offering explainability.
 | MIST-net (Pan et al., 2021) | Conv-Transformer Hybrid | 2D | 12.0M | CT | NIH-AAPM-Mayo (McCollough, 2016) | No/Yes/No | The Swin Transformer and convolutional layers are combined in the High-definition Reconstruction Module, achieving high-quality reconstruction.
 | DuDoTrans (Wang et al., 2021a) | Conv-Transformer Hybrid | 2D | 0.44M | CT | NIH-AAPM-Mayo (McCollough, 2016), COVID-19 | No/Yes/No | The Sinogram Restoration Transformer (SRT) module enhances the projection domain, improving sparse-view CT reconstruction.
 | FIT (Buchholz and Jug, 2021) | Conventional Transformer | 2D | N | CT | LoDoPaB (Leuschner et al., 2021) | Yes/No/Yes | Carefully designed FDE representations mitigate the computational burden of traditional Transformer structures in the image domain.
 | RegFormer (Xia et al., 2022a) | Conv-Transformer Hybrid | 2D | N | CT | NIH-AAPM-Mayo (McCollough, 2016) | Yes/Yes/Yes | The unrolled iterative scheme is redesigned with Transformer encoders and decoders to learn a nonlocal prior, alleviating sparse-view artifacts.
Enhancement | TransCT (Zhang et al., 2021e) | Conv-Transformer Hybrid | 2D | N | CT | NIH-AAPM-Mayo (McCollough, 2016), Clinical CBCT Images | No/Yes/No | Low-dose CT (LDCT) images are decomposed into high- and low-frequency parts, and the blurry high-frequency part is denoised with a basic Transformer structure.
 | TED-Net (Wang et al., 2021b) | Conv-like Transformer | 2D | N | CT | NIH-AAPM-Mayo (McCollough, 2016) | Yes/Yes/Yes | The design employs tokenization and detokenization operations in a convolution-free encoder-decoder architecture.
 | Eformer (Luthra et al., 2021) | Conv-Transformer Hybrid | 2D | N | CT | NIH-AAPM-Mayo (McCollough, 2016) | Yes/Yes/Yes | A residual Transformer redesigns the residual block in the denoising encoder-decoder architecture with non-overlapping window-based Multi-head Self-Attention (MSA).
 | TVSRN (Yu et al., 2022a) | Conv-like Transformer | 3D | 1.73M | CT | RPLHR-CT† dataset | Yes/Yes/Yes | An asymmetric encoder-decoder architecture composed of pure Transformers efficiently models context relevance and long-range dependencies in CT volumes.
 | T2Net (Feng et al., 2021) | Conv-Transformer Hybrid | 2D | N | MRI | Single-coil Brain Data from IXI*, Clinical Brain MRI Dataset | No/Yes/Yes | A task Transformer module couples super-resolution and reconstruction in multi-task learning, enriching super-resolution features with low-resolution reconstruction features.
 | WavTrans (Li et al., 2022a) | Conv-Transformer Hybrid | 2D | 2.102M | MRI | fastMRI (Knoll et al., 2020), Clinical Brain MRI Dataset | No/Yes/No | The Residual Cross-attention Swin Transformer handles cross-modality features and boosts target-contrast MRI super-resolution.
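Several entries above (e.g., Eformer, and the Swin-style blocks in DSFormer, MIST-net, and WavTrans) rest on the same primitive: non-overlapping window-based multi-head self-attention, which restricts attention to local windows so cost grows linearly with image size rather than quadratically. The sketch below illustrates this mechanism in PyTorch; the class name `WindowMSA`, the tensor layout, and all hyperparameters are illustrative assumptions, not the exact code of any surveyed paper.

```python
# Minimal sketch of non-overlapping window-based multi-head self-attention
# (a generic illustration under assumed shapes, not any paper's released code).
import torch
import torch.nn as nn

class WindowMSA(nn.Module):
    def __init__(self, dim: int, window_size: int, num_heads: int):
        super().__init__()
        assert dim % num_heads == 0
        self.window_size = window_size
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, H, W, C) feature map; H and W assumed divisible by window_size.
        B, H, W, C = x.shape
        ws = self.window_size
        # Partition the map into non-overlapping ws x ws windows.
        x = x.view(B, H // ws, ws, W // ws, ws, C)
        windows = x.permute(0, 1, 3, 2, 4, 5).reshape(-1, ws * ws, C)
        # Standard multi-head self-attention, computed independently per window.
        qkv = self.qkv(windows).reshape(-1, ws * ws, 3, self.num_heads, C // self.num_heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)  # each: (num_windows*B, heads, ws*ws, head_dim)
        attn = (q @ k.transpose(-2, -1)) * self.scale
        out = (attn.softmax(dim=-1) @ v).transpose(1, 2).reshape(-1, ws * ws, C)
        out = self.proj(out)
        # Reverse the window partition back to a (B, H, W, C) feature map.
        out = out.view(B, H // ws, W // ws, ws, ws, C)
        return out.permute(0, 1, 3, 2, 4, 5).reshape(B, H, W, C)

# Usage: attention over 8x8 windows of a 64-channel feature map.
feats = torch.randn(1, 32, 32, 64)
print(WindowMSA(dim=64, window_size=8, num_heads=4)(feats).shape)  # torch.Size([1, 32, 32, 64])
```

Because attention is confined to each window, successive layers in Swin-style designs typically shift the window grid so that information propagates across window boundaries; that shifting step is omitted here for brevity.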