| FPN | Feature Pyramid Network |
| PAN | Path Aggregation Network |
| IoT | Internet of Things |
| SOTA | State-of-the-Art |
| CNNs | Convolutional Neural Networks |
| CBAM | Convolutional Block Attention Module |
| CAM | Channel Attention Module |
| SAM | Spatial Attention Module |
| SSD | Single Shot MultiBox Detector |
| DBN | Deep Belief Network |
| YOLO | You Only Look Once |
| ViT | Vision Transformer |
| DETR | Detection Transformer |
| W-MSA | Windowed Multihead Self-Attention |
| SW-MSA | Shifted-Window Multihead Self-Attention |
| MLP | Multilayer Perceptron |
| LN | Layer Normalization |
| CARAFE | Content-Aware Reassembly of Features |
| mAP | Mean Average Precision |
| RPN | Region Proposal Network |
| CBL | Convolution, Batch Normalization, Leaky ReLU |
| ELAN | Efficient Layer Aggregation Network |
| CAT | Category-aware Transformation |