2020 Mar 27;20(7):1861. doi: 10.3390/s20071861

Table 5.

Network structure description of different trials.

Model Structure Description
YOLO-LITE YOLO-LITE raw network structure [19], as shown in Table 1
YOLOv3 YOLOv3 raw network structure [14], as shown in Figure 1
MobileNetV1-YOLOv3 MobileNetV1 backbone combined with the YOLOv3 detector part
MobileNetV2-YOLOv3 MobileNetV2 backbone combined with the YOLOv3 detector part
Trial 1 All convolution layers in YOLOv3 were replaced by depthwise separable convolutions, and the number of ResBlocks in Darknet53 was changed from 1-2-8-8-4 to 1-2-4-6-4.
Trial 2 One convolution layer was removed from the detector part of Trial 1.
Trial 3 The number of ResBlocks in the backbone network of Trial 2 was reduced from 1-2-4-6-4 to 1-1-1-1-1.
Trial 4 A parallel structure was added based on Trial 2, the resolution was reconstructed using a 1 × 1 convolutional kernel, and the channel was fused using a 3 × 3 convolutional kernel after the connection.
Trial 5 Based on Trial 4, the number of ResBlocks in the backbone network was changed to 1-1-2-4-2, and the resolution was reconstructed using a 3 × 3 convolutional kernel.
Trial 6 A parallel structure was added based on YOLOv3, which used a 1 × 1 ordinary convolution.
Trial 7 All convolutions in Trial 6 were replaced by depthwise separable convolutions.
Trial 8 The region was exactly the same as that of YOLOv3, and the last feature-extraction layer of the backbone was made wider.
Trial 9 The backbone was exactly the same as that in Trial 8, and each of the three region levels was reduced by two layers.
Trial 10 Each of the three region levels was reduced by two layers, the region was narrowed simultaneously, and the backbone was exactly the same as that of YOLO-LITE.
Trial 11 The backbone was exactly the same as that in Trial 8, and each of the three region levels was reduced by four layers.
Trial 12 The backbone was exactly the same as that of YOLO-LITE, each of the three region levels was reduced by four layers, and the region was narrowed simultaneously (that is, each region level was reduced by two further layers relative to Trial 10).
Trial 13 Three ResBlocks were added based on Trial 12.
Trial 14 Three HR structures were added based on Trial 12.
Trial 15 Based on Trial 14, the downsampling method was changed from strided convolution to max pooling, and a convolution layer was added after the downsampling.
Trial 16 The convolution kernel of the last layer of HR was changed from 1 × 1 to 3 × 3 based on Trial 15.
Trial 17 Three ResBlocks were added to Trial 15.
Trial 18 Nine layers of inverted-bottleneck ResBlocks were added to Trial 15.
Trial 19 Based on Trial 18, one 3 × 3 convolution layer was added to each output layer of the HR structure, for a total of three layers.
Trial 20 The number of ResBlocks per part was adjusted to three, based on Trial 17.
Trial 21 The last ResBlock was moved forward to reduce the number of channels, based on Trial 20.
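
Two of the changes listed in the table can be made concrete with simple arithmetic: swapping ordinary convolutions for depthwise separable ones (Trials 1 and 7) and reducing the per-stage ResBlock counts (Trials 1, 3, and 5). The following is a minimal sketch, not the authors' code. It assumes 3 × 3 kernels and ignores biases and batch normalization; the function names and the 256 and 512 channel widths are hypothetical and chosen only for illustration. The ResBlock counts are copied from the table.

```python
def conv_params(c_in, c_out, k=3):
    """Weights in a standard k x k convolution (bias and batch norm ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k=3):
    """Weights in a k x k depthwise convolution plus a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out


# ResBlock counts per backbone stage, copied from the table above.
RESBLOCKS = {
    "Darknet53 (YOLOv3)": (1, 2, 8, 8, 4),
    "Trial 1": (1, 2, 4, 6, 4),
    "Trial 3": (1, 1, 1, 1, 1),
    "Trial 5": (1, 1, 2, 4, 2),
}


def total_resblocks(stages):
    """Total number of ResBlocks across all backbone stages."""
    return sum(stages)


if __name__ == "__main__":
    # Hypothetical channel widths, used only to illustrate the comparison.
    c_in, c_out = 256, 512
    std = conv_params(c_in, c_out)
    sep = depthwise_separable_params(c_in, c_out)
    print(f"standard 3x3: {std} weights")
    print(f"depthwise separable: {sep} weights")
    print(f"reduction factor: {std / sep:.2f}")

    for name, stages in RESBLOCKS.items():
        print(f"{name}: {total_resblocks(stages)} ResBlocks")
```

The weight counts cover a single layer only; the effect on a whole network depends on the channel widths of every layer, which the table does not list.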