Table 10.
Training and inference times of different model scales. The training iteration time is the time to process a single 96 × 96 × 96 patch. Testing latencies are reported on an MNI-space volume of size 172 × 220 × 156 using an RTX 3090Ti GPU with a regular inference pipeline, i.e., no mixed precision and no TorchScript or TensorRT conversion. The inference patch size is 96 × 96 × 96; the sliding-window overlap has a significant impact on inference time because a higher overlap percentage rapidly increases the number of patches to evaluate (see the sketch after the table).
| Model | UNesT-S | UNesT-B | UNesT-L |
|---|---|---|---|
| Training | | | |
| Iteration (s) | 0.29 | 0.46 | 0.82 |
| Testing | | | |
| overlap = 0.3 (s) | 0.84 | 0.98 | 1.23 |
| overlap = 0.5 (s) | 2.30 | 2.34 | 2.97 |
| overlap = 0.7 (s) | 6.76 | 6.99 | 7.85 |
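
The growth in latency with overlap follows directly from the number of sliding-window patches. Below is a minimal sketch, assuming the common scan-interval convention (step = patch size × (1 − overlap), with a final window flush with the volume border); the helper `count_patches` is ours for illustration and is not part of the reported pipeline.

```python
import math


def count_patches(image_size, roi_size=(96, 96, 96), overlap=0.5):
    """Estimate the number of sliding-window patches for a given overlap.

    Assumes: scan interval = roi * (1 - overlap) per axis, windows placed
    every interval, plus one final window flush with the border
    (hence ceil(...) + 1 per axis).
    """
    total = 1
    for img, roi in zip(image_size, roi_size):
        interval = max(1, int(roi * (1 - overlap)))
        total *= math.ceil(max(img - roi, 0) / interval) + 1
    return total


# MNI-space volume from Table 10: 172 x 220 x 156, with 96^3 patches.
for ov in (0.3, 0.5, 0.7):
    n = count_patches((172, 220, 156), overlap=ov)
    print(f"overlap={ov}: ~{n} patches")
# Under this convention: ~18, ~36, and ~96 patches, respectively,
# which mirrors the steep increase in testing latency in the table.
```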