2022 Mar 15;52(2):219–224. doi: 10.5624/isd.20210287

Fig. 2. The structure of YOLOv3 used in this study. During training, the inputs to Darknet-53, the backbone of this network, are images resized to 512×512 pixels (width×height) together with their annotation files. The convolution layers then produce 3 feature maps at different resolutions, each carrying classification output. The final weights obtained from training are used in testing.
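To make the "3 feature maps in different resolutions" concrete, the following is a minimal sketch that computes the grid size of each detection scale for a 512×512 input. It assumes the standard YOLOv3 downsampling factors of 32, 16, and 8, which the caption does not state. The helper names `feature_map_sizes` and `scale_box` are hypothetical and are not taken from the study's code. `scale_box` illustrates how a pixel-coordinate annotation would follow a plain resize; annotations stored as normalized coordinates, as in the Darknet label format, would not need this step.

```python
# Standard YOLOv3 downsampling factors (an assumption; the caption does not list them).
YOLOV3_STRIDES = (32, 16, 8)


def feature_map_sizes(width, height, strides=YOLOV3_STRIDES):
    """Return the (grid_w, grid_h) of each detection scale for a given input size."""
    sizes = []
    for stride in strides:
        if width % stride or height % stride:
            raise ValueError(f"input {width}x{height} is not divisible by stride {stride}")
        sizes.append((width // stride, height // stride))
    return sizes


def scale_box(box, orig_size, new_size=(512, 512)):
    """Rescale a pixel box (x_min, y_min, x_max, y_max) after a plain resize.

    Hypothetical helper: assumes the image is stretched to new_size
    without padding or cropping.
    """
    orig_w, orig_h = orig_size
    new_w, new_h = new_size
    sx, sy = new_w / orig_w, new_h / orig_h
    x_min, y_min, x_max, y_max = box
    return (x_min * sx, y_min * sy, x_max * sx, y_max * sy)


if __name__ == "__main__":
    # Coarse, medium, and fine grids for the 512x512 input described in the caption.
    print(feature_map_sizes(512, 512))
    # A box annotated on a 1024x512 original, mapped onto the resized image.
    print(scale_box((100, 50, 300, 150), (1024, 512)))
```

Under these assumptions the three grids are 16×16, 32×32, and 64×64 cells, with the coarsest grid suited to large objects and the finest to small ones.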
