Table 4.
A summarized review of Transformer-based models for medical image reconstruction and enhancement. "N" denotes that the number of model parameters is not reported or not applicable.
Task | Reference | Architecture | 2D/3D | #Param | Modality | Dataset | ViT as Enc/Inter/Dec | Highlights
---|---|---|---|---|---|---|---|---
Reconstruction | ReconFormer (Guo et al., 2022d) | Conv-Transformer Hybrid | 2D | 1.414M | MRI | fastMRI (Knoll et al., 2020), HPKS (Jiang et al., 2019a) | No/Yes/No | The Pyramid Transformer Layer (PTL) introduces a locally pyramidal but globally columnar structure.
 | DSFormer (Zhou et al., 2022b) | Conv-Transformer Hybrid | 2D | 0.18M | MRI | Multi-coil Brain Data from IXI* | No/Yes/No | The proposed Swin Transformer Reconstruction Network enables self-supervised reconstruction with a lightweight backbone.
 | SLATER (Korkmaz et al., 2022) | Conv-Transformer Hybrid | 2D | N | MRI | Single-coil Brain Data from IXI*, Multi-coil Brain Data from fastMRI (Knoll et al., 2020) | No/Yes/Yes | An unsupervised MRI reconstruction design that exploits the long-range dependency modeling of Transformers.
 | DuDoCAF (Lyu et al., 2022) | Conv-Transformer Hybrid | 2D | 1.428M | MRI | fastMRI (Knoll et al., 2020), Clinical Brain MRI Dataset | No/Yes/No | Recurrent blocks with Transformers capture long-range dependencies from the fused multi-contrast feature maps, boosting target-contrast under-sampled imaging.
 | SDAUT (Huang et al., 2022a) | Conv-Transformer Hybrid | 2D | N | MRI | Calgary-Campinas dataset (Souza et al., 2018) | No/Yes/No | The proposed U-Net-based Transformer combines dense and sparse deformable attention in separate stages, improving performance and speed while offering explainability.
 | MIST-net (Pan et al., 2021) | Conv-Transformer Hybrid | 2D | 12.0M | CT | NIH-AAPM-Mayo (McCollough, 2016) | No/Yes/No | The Swin Transformer and convolutional layers are combined in the High-definition Reconstruction Module, achieving high-quality reconstruction.
 | DuDoTrans (Wang et al., 2021a) | Conv-Transformer Hybrid | 2D | 0.44M | CT | NIH-AAPM-Mayo (McCollough, 2016), COVID-19 | No/Yes/No | The Sinogram Restoration Transformer (SRT) module enhances the projection domain, improving sparse-view CT reconstruction.
 | FIT (Buchholz and Jug, 2021) | Conventional Transformer | 2D | N | CT | LoDoPaB (Leuschner et al., 2021) | Yes/No/Yes | Carefully designed FDE representations mitigate the computational burden of traditional Transformer structures in the image domain.
 | RegFormer (Xia et al., 2022a) | Conv-Transformer Hybrid | 2D | N | CT | NIH-AAPM-Mayo (McCollough, 2016) | Yes/Yes/Yes | The unrolled iterative scheme is redesigned with Transformer encoders and decoders to learn a nonlocal prior, alleviating sparse-view artifacts.
Enhancement | TransCT (Zhang et al., 2021e) | Conv-Transformer Hybrid | 2D | N | CT | NIH-AAPM-Mayo (McCollough, 2016), Clinical CBCT Images | No/Yes/No | Low-dose CT (LDCT) images are decomposed into high- and low-frequency parts, and the blurry high-frequency part is denoised with a basic Transformer structure.
 | TED-Net (Wang et al., 2021b) | Conv-like Transformer | 2D | N | CT | NIH-AAPM-Mayo (McCollough, 2016) | Yes/Yes/Yes | The design employs tokenization and detokenization operations in a convolution-free encoder-decoder architecture.
 | Eformer (Luthra et al., 2021) | Conv-Transformer Hybrid | 2D | N | CT | NIH-AAPM-Mayo (McCollough, 2016) | Yes/Yes/Yes | A residual Transformer redesigns the residual block in the denoising encoder-decoder architecture with non-overlapping window-based Multi-head Self-Attention (MSA).
 | TVSRN (Yu et al., 2022a) | Conv-like Transformer | 3D | 1.73M | CT | RPLHR-CT† dataset | Yes/Yes/Yes | An asymmetric encoder-decoder architecture composed of pure Transformers efficiently models context relevance and long-range dependencies in CT volumes.
 | T2Net (Feng et al., 2021) | Conv-Transformer Hybrid | 2D | N | MRI | Single-coil Brain Data from IXI*, Clinical Brain MRI Dataset | No/Yes/Yes | A task Transformer module couples super-resolution and reconstruction in multi-task learning, enriching super-resolution features with low-resolution reconstruction features.
 | WavTrans (Li et al., 2022a) | Conv-Transformer Hybrid | 2D | 2.102M | MRI | fastMRI (Knoll et al., 2020), Clinical Brain MRI Dataset | No/Yes/No | The Residual Cross-attention Swin Transformer handles cross-modality features and boosts target-contrast MRI super-resolution.
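Several entries above (e.g., Eformer, and the Swin-style blocks in DSFormer, MIST-net, and WavTrans) rest on the same primitive: non-overlapping window-based multi-head self-attention, which restricts attention to local windows so cost grows linearly with image size rather than quadratically. The sketch below illustrates this mechanism in PyTorch; the class name `WindowMSA`, the tensor layout, and all hyperparameters are illustrative assumptions, not the exact code of any surveyed paper.

```python
# Minimal sketch of non-overlapping window-based multi-head self-attention
# (a generic illustration under assumed shapes, not any paper's released code).
import torch
import torch.nn as nn

class WindowMSA(nn.Module):
    def __init__(self, dim: int, window_size: int, num_heads: int):
        super().__init__()
        assert dim % num_heads == 0
        self.window_size = window_size
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, H, W, C) feature map; H and W assumed divisible by window_size.
        B, H, W, C = x.shape
        ws = self.window_size
        # Partition the map into non-overlapping ws x ws windows.
        x = x.view(B, H // ws, ws, W // ws, ws, C)
        windows = x.permute(0, 1, 3, 2, 4, 5).reshape(-1, ws * ws, C)
        # Standard multi-head self-attention, computed independently per window.
        qkv = self.qkv(windows).reshape(-1, ws * ws, 3, self.num_heads, C // self.num_heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)  # each: (num_windows*B, heads, ws*ws, head_dim)
        attn = (q @ k.transpose(-2, -1)) * self.scale
        out = (attn.softmax(dim=-1) @ v).transpose(1, 2).reshape(-1, ws * ws, C)
        out = self.proj(out)
        # Reverse the window partition back to a (B, H, W, C) feature map.
        out = out.view(B, H // ws, W // ws, ws, ws, C)
        return out.permute(0, 1, 3, 2, 4, 5).reshape(B, H, W, C)

# Usage: attention over 8x8 windows of a 64-channel feature map.
feats = torch.randn(1, 32, 32, 64)
print(WindowMSA(dim=64, window_size=8, num_heads=4)(feats).shape)  # torch.Size([1, 32, 32, 64])
```

Because attention is confined to each window, successive layers in Swin-style designs typically shift the window grid so that information propagates across window boundaries; that shifting step is omitted here for brevity.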