Table 2. Model and training parameters.
| Model parameter | Value | Training parameter | Value | Training parameter | Value |
|---|---|---|---|---|---|
| Image size | 256 | Batch size | 64 | Warmup lr | 0.0000005 |
| Window size | 16 | Epochs | 300 | Scheduler | Cosine |
| Embed dim | 96 | Warmup epochs | 20 | Decay epochs | 30 |
| Depth | 2,2,6,2 | Weight decay | 0.05 | Decay rate | 0.1 |
| Num heads | 3,6,12,24 | Optimizer | AdamW | Random erase mode | Pixel |
| Drop path rate | 0.2 | Base lr | 0.0005 | Mixup alpha | 0.8 |
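
For readers reproducing this setup, the values in Table 2 can be collected into a configuration object. The following is a minimal Python sketch, not the authors' code: the class, field, and function names (`ModelConfig`, `TrainConfig`, `head_dims`) are hypothetical. The `head_dims` helper also assumes that the channel width doubles at each stage, as is common in hierarchical window-based transformers; the table itself does not state this.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ModelConfig:
    """Model parameters copied from Table 2 (field names are hypothetical)."""
    image_size: int = 256
    window_size: int = 16
    embed_dim: int = 96
    depths: Tuple[int, ...] = (2, 2, 6, 2)
    num_heads: Tuple[int, ...] = (3, 6, 12, 24)
    drop_path_rate: float = 0.2


@dataclass(frozen=True)
class TrainConfig:
    """Training parameters copied from Table 2 (field names are hypothetical)."""
    batch_size: int = 64
    epochs: int = 300
    warmup_epochs: int = 20
    weight_decay: float = 0.05
    optimizer: str = "AdamW"
    base_lr: float = 5e-4
    warmup_lr: float = 5e-7
    scheduler: str = "cosine"
    decay_epochs: int = 30
    decay_rate: float = 0.1
    random_erase_mode: str = "pixel"
    mixup_alpha: float = 0.8


def head_dims(cfg: ModelConfig) -> List[int]:
    """Per-stage attention head dimension.

    ASSUMPTION (not in Table 2): the channel width at stage i is
    embed_dim * 2**i, i.e. it doubles from one stage to the next.
    """
    if len(cfg.depths) != len(cfg.num_heads):
        raise ValueError("depths and num_heads must list one entry per stage")
    dims = []
    for stage, heads in enumerate(cfg.num_heads):
        width = cfg.embed_dim * 2 ** stage
        if width % heads != 0:
            raise ValueError(f"stage {stage}: width {width} not divisible by {heads} heads")
        dims.append(width // heads)
    return dims


if __name__ == "__main__":
    print(head_dims(ModelConfig()))  # [32, 32, 32, 32] under the doubling assumption
```

Under the doubling assumption, the listed head counts give the same head dimension of 32 at every stage, which serves as a quick consistency check on the "Embed dim" and "Num heads" rows.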