| AP | Average Precision |
| BSCA | Bit-Slice Context Attention |
| BWM | Balanced Weight Module |
| CMAM | Convolutional Multi-Scale Attention Module |
| CNN | Convolutional Neural Network |
| CRC | Colorectal Cancer |
| DCL | Domain Contrastive Learning |
| DCT | Discrete Cosine Transform |
| DDPM | Denoising Diffusion Probabilistic Model |
| DLCA | Dataset-Level Color Augmentation |
| DPC | Dual-Path Complementary |
| DSC | Dice Similarity Coefficient |
| FCNs | Fully Convolutional Networks |
| FGW | Flow-Guided Warping |
| FoBS | Focus on Boundary Segmentation |
| FPS | Frames Per Second |
| FT | Fourier Transform |
| GFlops | Giga Floating Point Operations Per Second |
| IFU | Iterative Feedback Unit |
| IoU | Intersection over Union |
| MACE | Multi-Path Attention Encoder |
| MAST | Mixture-Attention Siamese Transformer |
| MCAD | Multi-Path Attention Decoder |
| mDice | mean Dice coefficient |
| mIoU | mean Intersection over Union |
| MLP | Multi-Layer Perceptron |
| PAM | Parallel Attention Module |
| PVTv2 | Pyramid Vision Transformer v2 |
| RFS | Reference Frame Selection |
| RSA | Region Self-Attention |
| SAM | Segment Anything Model |
| SAM2 | Segment Anything Model 2 |
| SSBU | Segmentation-Squeeze-Bottleneck Union |
| SSFM | Selective Shared Fusion Module |
| SSM | State Space Model |
| TRM | Temporal Reasoning Module |
| ViT | Vision Transformer |
| VSS | Visual State Space |
| WoS | Web of Science |