2024 Oct 28;24(21):6914. doi: 10.3390/s24216914

Table 3.

Performance of different backbone networks on the NYUv2 test set. VGG16 and ResNet50 were reproduced and trained from scratch to keep the encoder structure consistent. PPM denotes the pyramid pooling module, which was added before the decoder. All models were trained for 250 epochs.

Models                              Global acc (%)   Mean IoU (%)   Mean acc (%)
Depth (VGG16)-RGB (VGG16)           61.1             25.7           35.1
Depth (ResNet50)-RGB (VGG16)        59.7             24.4           33.2
Depth (VGG16)-RGB (ResNet50)        61.7             26.1           35.7
Depth (ResNet50)-RGB (ResNet50)     60.6             25.1           34.3
Depth (VGG16)-RGB (VGG16)-PPM       61.7             29.8           41.6
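
For readers unfamiliar with the three columns, the following is a minimal sketch of how global accuracy, mean accuracy, and mean IoU are commonly computed from a pixel-level confusion matrix. It uses the standard textbook definitions; the function name `segmentation_metrics` and the choice to skip classes that never occur are assumptions made for illustration, and the evaluation protocol behind the table (for example, how ignored or absent classes are handled) may differ.

```python
def segmentation_metrics(conf):
    """Compute (global_acc, mean_acc, mean_iou) from a confusion matrix.

    conf[i][j] is the number of pixels whose ground-truth class is i
    and whose predicted class is j. Values are returned as fractions
    in [0, 1]; multiply by 100 for percentages.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    correct = sum(conf[i][i] for i in range(n))
    global_acc = correct / total

    class_accs, ious = [], []
    for i in range(n):
        tp = conf[i][i]
        gt = sum(conf[i])                          # pixels whose true class is i
        pred = sum(conf[r][i] for r in range(n))   # pixels predicted as class i
        if gt > 0:                                 # assumption: skip classes absent from ground truth
            class_accs.append(tp / gt)
        union = gt + pred - tp
        if union > 0:                              # assumption: skip classes absent from both
            ious.append(tp / union)

    mean_acc = sum(class_accs) / len(class_accs)
    mean_iou = sum(ious) / len(ious)
    return global_acc, mean_acc, mean_iou


if __name__ == "__main__":
    # Toy two-class example, not data from the table.
    toy = [[3, 1],
           [1, 5]]
    print(segmentation_metrics(toy))
```

Because mean IoU and mean accuracy average over classes rather than pixels, they are more sensitive to rare classes than global accuracy, which is consistent with the PPM row showing a larger change in those two columns than in global accuracy.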