. 2024 Oct 28;24(21):6914. doi: 10.3390/s24216914

Table 3.

Performance with different backbone networks on the NYUv2 test set. VGG16 and ResNet50 were reproduced and trained from scratch to ensure the consistency of the encoder structure. PPM denotes the pyramid pooling module, which was added before the decoder. All models were trained for 250 epochs.

Models	Global acc (%)	Mean IoU (%)	Mean acc (%)
Depth (VGG16)-RGB (VGG16)	61.1	25.7	35.1
Depth (ResNet50)-RGB (VGG16)	59.7	24.4	33.2
Depth (VGG16)-RGB (ResNet50)	61.7	26.1	35.7
Depth (ResNet50)-RGB (ResNet50)	60.6	25.1	34.3
Depth (VGG16)-RGB (VGG16)-PPM	61.7	29.8	41.6
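
The three metrics in Table 3 are commonly derived from a per-class confusion matrix. The following is a minimal sketch assuming the standard semantic segmentation definitions, which the source does not spell out; the function name `segmentation_metrics` and the choice to skip classes absent from the ground truth are illustrative assumptions, not the authors' evaluation code.

```python
def segmentation_metrics(conf):
    """Compute global accuracy, mean class accuracy, and mean IoU.

    conf[i][j] is the number of pixels whose ground-truth class is i
    and whose predicted class is j. Values are returned as fractions.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    diag = [conf[c][c] for c in range(n)]  # correctly labeled pixels per class
    gt_pixels = [sum(conf[c]) for c in range(n)]  # row sums
    pred_pixels = [sum(conf[i][c] for i in range(n)) for c in range(n)]  # column sums

    # Global accuracy: correctly labeled pixels over all pixels.
    global_acc = sum(diag) / total

    class_accs, class_ious = [], []
    for c in range(n):
        if gt_pixels[c] == 0:
            continue  # assumption: ignore classes absent from the ground truth
        class_accs.append(diag[c] / gt_pixels[c])
        union = gt_pixels[c] + pred_pixels[c] - diag[c]
        class_ious.append(diag[c] / union)

    mean_acc = sum(class_accs) / len(class_accs)
    mean_iou = sum(class_ious) / len(class_ious)
    return global_acc, mean_acc, mean_iou


if __name__ == "__main__":
    # Toy two-class confusion matrix, not data from the paper.
    toy = [[3, 1],
           [1, 5]]
    g, a, m = segmentation_metrics(toy)
    print(f"global acc {100 * g:.1f}%, mean acc {100 * a:.1f}%, mean IoU {100 * m:.1f}%")
```

Multiplying the returned fractions by 100 gives percentages in the form reported in the table. Under these definitions mean IoU penalizes false positives as well as false negatives, so it is always at most the mean accuracy, as it is in every row above.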