PLoS One. 2023 Jan 26;18(1):e0271051. doi: 10.1371/journal.pone.0271051

MaskID: An effective deep-learning-based algorithm for dense rebar counting

Wenrui Li 2, Jian Cheng 2, Bo Chen 2, Yu Xue 2, Yi Wang 3, Yan Fu 1,4, Junlin Zhou 1,4, Duanbing Chen 1,4,*
Editor: Humaira Nisar5
PMCID: PMC9879489  PMID: 36701317

Abstract

As a dense instance segmentation problem, rebar counting in complex environments such as rebar yards and rebar transportation has received significant attention in both academic and industrial contexts. Traditional counting approaches, such as manual counting and machine vision-based algorithms, are often inefficient or inaccurate: rebars of varied sizes and shapes are stacked and overlapping, rebar images are unclear under complex lighting conditions such as dawn, night, and strong light, and other environmental noise is present in the images; thus, these approaches no longer fulfil the requirements of modern automation. This paper proposes MaskID, an innovative counting method based on deep learning and heuristic strategies. First, an improved version of the Mask region-based convolutional neural network (Mask R-CNN) was designed to obtain segmentation results through splitting and rescaling, so as to capture more detail in a large-scale rebar image. Then, a series of intelligent denoising strategies, addressing the aspect ratio of each recognized box, overlapping recognized objects, object distribution, and environmental noise, were applied to improve the segmentation results. The performance of the proposed method was evaluated on open-competition and test-platform datasets. The F1-score was found to be over 0.99 on all datasets. The experimental results demonstrate that the proposed method is effective for dense rebar counting and significantly outperforms existing state-of-the-art methods.

Introduction

Motivation

To date, several in-depth studies of the rebar counting problem have obtained good results. However, certain factors still make accurate counting challenging. These include the following:

  1. Counting problems with densely stacked rebars: overlapping and unclear borders make accurate recognition very difficult.

  2. Environmental constraints: several challenges are faced in obtaining clear and usable images for accurate counting under different lighting conditions, such as dawn, day, and night.

  3. Various sizes of rebars to be detected: the rebars have a wide range of diameters; moreover, in practice, the shape of the rebar may be irregular.

Therefore, to enable accurate counting in complex environments, this paper proposes a novel deep-learning-based rebar recognition and heuristic counting approach. Its two main principles are as follows. First, splitting and rescaling are introduced into the original Mask R-CNN [1] to capture more details of large-scale images, yielding an improved version of Mask R-CNN. Second, to achieve highly accurate recognition results, multiple intelligent denoising strategies, addressing the aspect ratio of each recognized box, overlapping recognized objects, object distribution, and environmental noise, are designed and applied to remove noise from the results of the improved Mask R-CNN.

Background and related works

The rebar counting problem essentially calls for the segmentation of densely stacked objects. The earliest counting machine with an automatic sorting mechanism originated in Japan; it was a dual-transmission mechanical system realized by phototubes, which could handle situations such as bending, overlapping, and shielding of bar-like materials. Offoiach [2] attempted to improve automatic counting by using both impulsers and phototubes, but this did not achieve the expected outcome; the rebars were only recognized sequentially, and unexpected situations, such as bending, could result in undercounting or miscounting. Dynamic Ventures Inc. developed a commercial app called CountThings (https://www.countthingsfromphotos.com/) to count rebars, logs, pipes, and other items. For a typical image, rebars can be counted with low miscounting errors. However, for certain scenes, a template must be developed before the objects in an image can be counted.

In recent years, several deep-learning-based approaches have been developed for object detection. In 2012, AlexNet [3] was applied to large-scale object classification for the first time, and it won the ILSVRC-2012 image classification competition with a top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry. Thus far, most object detection methods have been anchor-based [4]. They can be broadly classified into two categories: region proposal-based object detection (two-stage structure) and regression-based object detection (one-stage structure). Representative two-stage networks include the region-based convolutional neural network (R-CNN) [5] and its improved versions such as Fast R-CNN [6], Faster R-CNN [4, 7], Cascade R-CNN [8], and Mask R-CNN [1]. RetinaNet [9] and YOLO [10] are representative regression-based networks. These algorithms exhibit excellent performance in general scenarios. However, they frequently fail in special cases, such as those involving overlapping or dense objects. Recently, densely packed object detection has received significant attention. Precise detection in densely packed scenes (PDDPS) [11] was the first network aimed at dense object detection; it introduced Gaussian mixture models to handle overlapping objects and was shown to be highly effective. Ding et al. [12] introduced the RoI transformer to handle the misalignment of densely stacked objects in aerial images. Moreover, Zhu et al. [13] created an object detector, known as ScratchDet, which takes advantage of a Root-ResNet over the original image to enhance the recognition of small objects by combining ResNet and VGGNet. Since 2020, an increasing number of studies have focused on densely stacked object detection. Yang et al. [14] proposed Dense RepPoints, a creative object representation realized by a dense point set at multiple levels, which strengthens the performance of contour-based counterparts. Chu et al. [15] presented the concept of allowing one proposal to be responsible for a set of correlated instances, which gains a 4.9% average precision (AP) improvement using Earth Mover's Distance (EMD) and non-maximum suppression (NMS). Furthermore, Pan et al. [16] presented a dynamic refinement network that can dynamically optimize classification tasks using two novel components: a feature selection module (FSM) and a dynamic refinement head (DRH). FSM enables neurons to adjust their receptive fields in conformity with the shapes and orientations of the targets, while DRH allows the model to dynamically refine its predictions. Qiu et al. [17] designed BorderDet, an innovative detection architecture that utilizes border information for more persuasive classification and more precise localization; it obtains 50.3% AP with a ResNeXt-101-DCN backbone and outperforms the majority of recent approaches. Xie [18] proposed a semantic segmentation and keypoint detection algorithm based on weak supervision to count rebars, showing superior performance compared to Faster R-CNN and Cascade R-CNN. Lee et al. [19] presented a robust algorithm for detecting small and dense objects in images from autonomous aerial vehicles based on Cascade R-CNN and a recursive feature pyramid; it won the VisDrone-DET 2020 challenge with 34.57% mAP, compared to 34.54% achieved by the second-best entry. Zhao et al. [20] presented a knowledge-aided CNN for small organ segmentation with limited training data using two cascade steps, i.e., localization and segmentation, with rather good results on classical datasets such as ISBI 2015 VISCERAL.

Key contributions

The performance of the proposed method was evaluated on an open-competition dataset and a test platform built for this purpose. The results demonstrate that it significantly outperforms state-of-the-art methods.

The key contributions of the paper are as follows:

  • A novel algorithm, MaskID, is proposed to count rebars in complex environments.

  • An improved version of Mask R-CNN is presented, using splitting and rescaling to capture more detail in large-scale rebar images.

  • Intelligent denoising strategies are designed to remove noise from the results obtained by the improved Mask R-CNN.

  • Experiments on open-competition and test-platform datasets show that MaskID can obtain more accurate counting results than other state-of-the-art methods.

Organization of the paper

The remainder of this paper is organized as follows. Section 2 describes the rebar counting method based on the improved version of Mask R-CNN and heuristic denoising strategies. Section 3 presents the experimental results and discussions on open-competition and test-platform datasets. Finally, Section 4 presents the conclusions and future works.

Materials and methods

Mask R-CNN

For a rebar image, if a series of region proposals can accurately frame each rebar in the image, the number of rebars can be obtained. Because a region proposal extracted by the model may deviate substantially from the real bounding box of a rebar (e.g., IoU < 0.5), the proposal must be fine-tuned so that the adjusted bounding box is closer to the ground truth than the original proposal. In deep-learning methods, bounding-box regression is used for this fine-tuning.

As previously illustrated, deep-learning models are the backbone of instance segmentation, and several excellent methods have been presented to segment objects in various scenes. Wu et al. [8] proposed a multiple attention encoded Cascade R-CNN to detect scene text in complex natural scenes, progressively refining the boundaries of text instances; their method comprises two core stages, feature generation and cascade detection. Wan and Goudos [7] proposed a Faster R-CNN to detect multi-class fruit for facilitating high-level smart farms, using three key strategies: fruit image library creation, data augmentation, and improved Faster R-CNN model generation. Among all the models, Mask R-CNN [1] was selected as the backbone model in our approach, because Mask R-CNN outperforms the majority of other models in instance segmentation tasks, which is highly significant for small, densely stacked object detection. As an extension of Faster R-CNN [4], Mask R-CNN completes the instance segmentation task by adding a mask prediction branch in parallel with the existing branch for bounding-box regression. Because RoIPool in Faster R-CNN is not a pixel-to-pixel alignment, its quantization operation significantly increases the mask error rate. To address this issue, RoIAlign was introduced in Mask R-CNN to replace the RoIPool operation, enhancing the mask accuracy through a bilinear interpolation strategy [16]. The structure of the Mask R-CNN used in this study is shown in Fig 1. In this structure, a convolutional neural network and a region proposal network are first used to extract the feature map and obtain the region proposals of each object (Fig 1(2)) from the original image; based on these, a fixed-size feature map (Fig 1(3)) is obtained through the RoIAlign layer; finally, the bounding box of each object is obtained by bounding-box regression (Fig 1(4)). To better understand the Mask R-CNN, several details are elaborated. For the model architecture, a Mask R-CNN with a four-stage ResNeXt-101 [21] backbone is adopted, and the backbone is refined by replacing the standard convolutional layers with deformable convolutional layers in the last three stages. In the first stage, multiple candidate bounding boxes are proposed to classify foreground and background and to regress bounding-box coordinate offsets. In the second stage, features are extracted from each candidate box using RoIAlign, and classification and bounding-box regression are performed in parallel. The Mask R-CNN is trained using the SGD optimizer with a learning rate of 0.02.

Fig 1. (Color online) Mask R-CNN framework.

(1) original rebar image, (2) feature map and region proposals, (3) fixed size feature map, (4) bounding box regression, and (5) final bounding box of rebar.
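The exact architecture above (a ResNeXt-101 backbone with deformable convolutions in its last three stages) is typically assembled with a detection framework such as MMDetection. The following minimal PyTorch sketch is an illustration rather than the authors' implementation: it substitutes torchvision's off-the-shelf Mask R-CNN with a ResNet-50 FPN backbone, but reproduces the two-class setup (background and rebar) and the SGD optimizer with the 0.02 learning rate used in this study.

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

num_classes = 2  # background + rebar

# Pretrained Mask R-CNN; ResNet-50 FPN stands in for the paper's ResNeXt-101.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the box-prediction head for the two-class problem.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

# Replace the mask-prediction head accordingly.
in_channels = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_channels, 256, num_classes)

# SGD with the 0.02 learning rate reported above (momentum/decay are illustrative).
optimizer = torch.optim.SGD(model.parameters(), lr=0.02,
                            momentum=0.9, weight_decay=1e-4)
```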

In the case of dense rebar counting, owing to the large differences in the sizes of rebars in an image, the original Mask R-CNN is not very effective for detection, because some small rebars are not very clear. Thus, to effectively detect rebars of various sizes, it is necessary to split a large image into several smaller images before applying Mask R-CNN. In this study, each smaller image is an input to the Mask R-CNN (Fig 1(1)), and the counting result for the large image is obtained by fusing the counting results of the smaller images. To retain the original information of the image, the large image is split into several overlapping small images, which guarantees that the model can inspect intact images of the rebar sections. Specifically, the original rebar image is compartmentalized into 384 × 384-pixel small images with 160 pixels of overlap to retain the complete information of each object. After being split, each small image is rescaled to 1024 × 1024 pixels to obtain more feature information. The schema for splitting and rescaling is shown in Fig 2.

Fig 2. (Color online) Schema for splitting and rescaling.

The left image is the original rebar image, and the right image is a slice from the original image rescaled to 1024 × 1024 pixels.
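A minimal sketch of this splitting-and-rescaling step is given below. It assumes OpenCV is available for resizing; the tile size (384), overlap (160), and output size (1024) follow the values stated above, while the function name and return format are illustrative.

```python
import cv2  # assumption: OpenCV is used for image I/O and resizing

def split_and_rescale(image, tile=384, overlap=160, out_size=1024):
    """Split a large image into overlapping tiles and rescale each tile.

    Returns a list of (x, y, tile_image) tuples, where (x, y) is the
    top-left corner of the tile in the original image; these offsets
    are needed later to map detections back to full-image coordinates.
    """
    stride = tile - overlap  # a 224-pixel step for the paper's settings
    h, w = image.shape[:2]
    tiles = []
    for y in range(0, max(h - overlap, 1), stride):
        for x in range(0, max(w - overlap, 1), stride):
            patch = image[y:y + tile, x:x + tile]  # edge tiles may be smaller
            patch = cv2.resize(patch, (out_size, out_size))
            tiles.append((x, y, patch))
    return tiles
```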

Intelligent denoising strategies

Because overlapping regions exist in the split images, some rebars are counted repeatedly across multiple split images, and some background noise is identified as rebar. Therefore, the recognition results must be processed further to eliminate noise.

Noise detection based on aspect ratio of a box

Because the cross-section of a rebar is generally approximated by a circle, the length and width of a recognized box should not differ significantly. If the aspect ratio of a recognized box is below a certain threshold, the box is abnormal and needs to be removed. After removing boxes with abnormal aspect ratios, the inscribed circle of each remaining box is taken as a candidate object.
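The following sketch illustrates this filter. The threshold value min_ratio is a hypothetical placeholder, since the paper does not report the exact threshold used.

```python
def filter_by_aspect_ratio(boxes, min_ratio=0.5):
    """Drop boxes whose short/long side ratio falls below min_ratio.

    `boxes` are (x1, y1, x2, y2) tuples; min_ratio is illustrative, not a
    value reported in the paper. Each kept box is replaced by its
    inscribed circle, returned as a (cx, cy, r) tuple.
    """
    circles = []
    for x1, y1, x2, y2 in boxes:
        w, h = x2 - x1, y2 - y1
        if min(w, h) / max(w, h) < min_ratio:
            continue  # abnormally elongated box: treat as noise
        cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
        r = min(w, h) / 2.0  # radius of the inscribed circle
        circles.append((cx, cy, r))
    return circles
```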

Noise detection based on two overlapping objects

For a dense rebar stacking scenario, the sizes of two adjacent rebars do not differ significantly. For this reason, any two overlapping recognized objects need to be further processed according to the specific situation in order to remove noise, as sketched in the code following Fig 3.

  • (1) An object is entirely contained in another object.

    For this specific situation, the smaller object is most likely to be noise and needs to be removed. As shown in Fig 3(a), the smaller light gray circle will be removed.

  • (2) One object is significantly smaller than another.

    If the size of the larger object is basically the same as that of other rebars around it, the smaller object is most likely noise and needs to be removed. As shown in Fig 3(b), the smaller light gray circle will be removed.

  • (3) The objects are similar in size.

    If the distance between their centers is below a certain threshold δ (δ < 0.5(r1 + r2), where r1 and r2 are the radii of the two circles), the object with lower reliability (each object predicted by Mask R-CNN has a score reflecting its prediction reliability) will be removed. As shown in Fig 3(c), the light gray circle with lower reliability will be removed. If the distance between their centers is larger than δ, both candidate objects will be retained. As shown in Fig 3(d), both the light orange and light gray circles will be retained.

Fig 3. (Color online) Mechanism of eliminating overlapping regions.

In (a), (b), and (c), the light gray circles are noise, and in (d), both circles are real rebars and will be retained. (a) An object is entirely contained in another object. (b) One object is significantly smaller than another. (c) Two objects of similar size with large embedding. (d) Two objects of similar size with small embedding.
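A sketch of these pairwise overlap rules is shown below, with each circle represented as a (cx, cy, r, score) tuple. The factor for "significantly smaller" in rule 2 is illustrative, and rule 2's additional check that the larger object matches its surroundings is omitted for brevity; only the δ < 0.5(r1 + r2) condition of rule 3 follows the paper directly.

```python
import math

def resolve_overlap(c1, c2, delta_factor=0.5, small_factor=0.5):
    """Decide which of two overlapping circles (cx, cy, r, score) to keep.

    delta_factor follows the paper's delta < 0.5 * (r1 + r2); small_factor
    is an illustrative threshold for "significantly smaller".
    """
    (x1, y1, r1, s1), (x2, y2, r2, s2) = c1, c2
    d = math.hypot(x1 - x2, y1 - y2)
    big, small = (c1, c2) if r1 >= r2 else (c2, c1)
    # Rule 1: one circle entirely contained in the other -> drop the smaller.
    if d + small[2] <= big[2]:
        return [big]
    # Rule 2 (simplified): one circle much smaller -> drop the smaller.
    if small[2] < small_factor * big[2]:
        return [big]
    # Rule 3: similar sizes; if centers are closer than delta, keep the
    # higher-scored circle, otherwise keep both.
    if d < delta_factor * (r1 + r2):
        return [c1 if s1 >= s2 else c2]
    return [c1, c2]
```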

Noise detection based on object distribution

For dense rebar recognition, the size of a recognized object is approximately the same as that of the surrounding objects. If the area si of the currently recognized object is significantly smaller or larger than the average area ⟨s⟩ of the surrounding objects (si/⟨s⟩ < 0.2 or ⟨s⟩/si < 0.2 in this study), the current object is considered to be noise and is removed. In addition, if the area of the current object i is significantly smaller than that of any surrounding object, object i is removed. Specifically, for any adjacent object j whose distance to object i is less than k * ri, where k is a tunable parameter between 5 and 10 (noise cannot be removed if k is too large, and real rebars might be removed if k is too small; k = 8 in this study) and ri is the radius of object i, if the area sj of object j is greater than m * si, where m is a tunable parameter between 3 and 7 (noise cannot be removed if m is too large, and real rebars might be removed if m is too small; m = 5 in this study) and si is the area of object i, then object i is considered to be noise and is removed.
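A brute-force sketch of this distribution filter is given below, using the values k = 8, m = 5, and the 0.2 area-ratio threshold reported above; the function name and tuple format are illustrative.

```python
import math

def filter_by_distribution(circles, k=8, m=5, area_ratio=0.2):
    """Remove circles that are abnormally small or large among neighbours.

    `circles` are (cx, cy, r) tuples; k, m, and area_ratio follow the
    values reported in the paper.
    """
    areas = [math.pi * r * r for _, _, r in circles]
    keep = []
    for i, (xi, yi, ri) in enumerate(circles):
        neighbours = [j for j, (xj, yj, _) in enumerate(circles)
                      if j != i and math.hypot(xi - xj, yi - yj) < k * ri]
        if not neighbours:
            keep.append(circles[i])
            continue
        mean_area = sum(areas[j] for j in neighbours) / len(neighbours)
        # Drop objects far from the local average size (s_i/<s> or <s>/s_i < 0.2)...
        if areas[i] / mean_area < area_ratio or mean_area / areas[i] < area_ratio:
            continue
        # ...or much smaller than any single nearby object (s_j > m * s_i).
        if any(areas[j] > m * areas[i] for j in neighbours):
            continue
        keep.append(circles[i])
    return keep
```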

Environmental noise detection

For an image of dense rebars, some regions distant from actual rebars may be mistakenly identified as rebars because of environmental noise. The following rule is used to check for this case: in a certain region, if the total area of the objects ssum is significantly smaller than the area of their envelope rectangle srec (ssum < 0.05 srec in this study), all objects in the region are considered to be noise and are removed.
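This rule can be sketched as follows for one spatially isolated group of detections. How regions are grouped (e.g., by connected components of nearby circles) is left out; the 0.05 threshold follows the paper.

```python
import math

def is_isolated_noise_region(circles, density_threshold=0.05):
    """Flag a cluster of detections as environmental noise when the total
    object area covers too little of the cluster's envelope rectangle.

    `circles` are (cx, cy, r) tuples for one spatially isolated region;
    the 0.05 threshold follows the paper.
    """
    s_sum = sum(math.pi * r * r for _, _, r in circles)
    x_min = min(cx - r for cx, _, r in circles)
    x_max = max(cx + r for cx, _, r in circles)
    y_min = min(cy - r for _, cy, r in circles)
    y_max = max(cy + r for _, cy, r in circles)
    s_rec = (x_max - x_min) * (y_max - y_min)
    return s_sum < density_threshold * s_rec
```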

Model training and testing

In the training process, each original training image is split into small images for labeling. The total number of small images is about 4000, and the number of marked rebars is more than 50000. The labeled samples are input to the improved Mask R-CNN, which is trained with the SGD optimizer and a learning rate of 0.02 to obtain the rebar recognition model. In the recognition process, the trained model is first used to identify rebars, and the heuristic denoising strategies are then applied to remove noise and obtain the final recognition results.
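Putting the pieces together, the overall recognition pipeline can be outlined as below. The helper run_mask_rcnn is hypothetical, and the other helpers refer to the sketches in the previous subsections; this is an outline of the workflow under those assumptions, not the authors' code.

```python
def count_rebars(image, model):
    """End-to-end sketch: split, detect on each tile, map detections back
    to full-image coordinates, denoise, and count. `run_mask_rcnn` is a
    hypothetical helper returning (box, score) pairs for one tile.
    """
    boxes = []
    for x_off, y_off, tile_img in split_and_rescale(image):
        # Tiles were rescaled from 384 px; the scale is approximate for
        # smaller edge tiles.
        scale = tile_img.shape[0] / 384.0
        for (x1, y1, x2, y2), score in run_mask_rcnn(model, tile_img):
            boxes.append((x1 / scale + x_off, y1 / scale + y_off,
                          x2 / scale + x_off, y2 / scale + y_off))
    circles = filter_by_aspect_ratio(boxes)
    # Pairwise overlap resolution and the environmental-noise filter from
    # the previous subsections would also be applied here.
    circles = filter_by_distribution(circles)
    return len(circles)
```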

Results and discussion

Datasets

The performance of the proposed method was evaluated using two datasets. The first is provided by the DataFountain competition and contains 450 images, of which 250 form the training set and 200 the testing set. All images were captured using smartphones and can be obtained from https://www.datafountain.cn/competitions/332/datasets. This dataset possesses some typical features: (1) the rebars are densely stacked, which increases the difficulty of segmentation; (2) the rebar sizes vary significantly; and (3) there are notable differences in the angles from which the photographs were captured.

The second dataset is from a physical test platform established in this study, which contains approximately 8000 rebars of various sizes. The platform was constructed on a building site; it resembles a truck loaded with densely stacked rebars and can be used to simulate an actual complex rebar counting scenario.

In this paper, the 250 training images from the DataFountain competition are used to train MaskID. The 200 testing images from the DataFountain competition and one 6405 × 3597-pixel image obtained from the physical test platform are used to evaluate the performance of MaskID.

Evaluation metrics

Because the rebar recognition problem can be regarded as an object classification task in the field of machine learning, Precision, Recall, and F1-score are used for evaluation. First, the confusion matrix is defined as shown in Table 1. Precision Pr and recall Re are then calculated from the confusion matrix, as shown in Eq 1.

Pr = TP / (TP + FP),   Re = TP / (TP + FN). (1)

Table 1. Confusion matrix.

TP denotes the number of predicted objects that are real rebars, FP denotes the number of predicted objects that are not real rebars, and FN denotes the number of real rebars that are not recognized.

                   Actual Positive   Actual Negative
Predict Positive   TP                FP
Predict Negative   FN                TN

In general, high precision leads to low recall and vice versa. The F1-score takes into account both the precision and recall of the model and is defined by Eq 2.

F1 = 2 × Pr × Re / (Pr + Re). (2)
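Eqs 1 and 2 translate directly into code; a minimal sketch:

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Compute Precision and Recall (Eq 1) and F1-score (Eq 2)."""
    pr = tp / (tp + fp)
    re = tp / (tp + fn)
    f1 = 2 * pr * re / (pr + re)
    return pr, re, f1

# Example: precision_recall_f1(99, 1, 1) returns (0.99, 0.99, 0.99).
```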

Experimental results

Experimental results on competition dataset

As previously illustrated, several approaches have been proposed to solve the rebar recognition problem. Among them, WSSKP [18] is a state-of-the-art algorithm that utilizes weakly supervised semantic segmentation and keypoint detection; it achieved a 98.8% F1-score on the DataFountain competition dataset. Fig 4 compares WSSKP [18] and the proposed MaskID approach on two images. As shown in Fig 4(a) and 4(c), one rebar in each image is not recognized by WSSKP, whereas MaskID recognizes all rebars, as shown in Fig 4(b) and 4(d). The model size, Precision, Recall, and F1-score of the benchmark methods and MaskID on the 200 testing images from the DataFountain competition dataset are shown in Table 2; the results are averaged over the 200 testing images. Because the results of the benchmark methods are taken directly from the references, their Precision and Recall are not listed in Table 2, as these metrics were not reported in the original papers. The table shows that MaskID obtained the highest F1-score (99.38%), a 0.58% improvement over the state-of-the-art WSSKP method [18].

Fig 4. (Color online) Recognition results on two images.

(a) and (c) show the results of WSSKP, and (b) and (d) show the results of MaskID.

Table 2. Average Precision, Recall and F1-score on the 200 testing images from the DataFountain competition dataset.

Method              Model size   Precision   Recall   F1-Score
RetinaNet [9]       16.3M        -           -        0.9802
Faster RCNN [4]     23.2M        -           -        0.9830
Cascade RCNN [22]   42.1M        -           -        0.9870
WSWA-Seg [23]       5.8M         -           -        0.9883
WSSKP [18]          9.2M         -           -        0.9880
MaskID              10.1M        0.9934      0.9942   0.9938

Experimental results on test platform dataset

The performance of MaskID was also evaluated on our own experimental platform. To effectively detect multiscale rebars, the image was split into 384 × 384-pixel small images with 160 pixels of overlap. The recognition results obtained by MaskID are shown in Fig 5, and the Precision, Recall, and F1-score are listed in Table 3; all three metrics exceed 0.99. From Fig 5 and Table 3, it can be seen that MaskID has low miscounting errors, and even very small rebars, with sizes below 8 × 8 pixels in the lower-right region, are accurately recognized.

Fig 5. (Color online) Recognition results on the platform dataset.

Table 3. Rebar counting results on the test platform dataset.

Number of real rebars   Precision   Recall   F1-Score
7453                    0.9978      0.9910   0.9944

Summary

Accurate counting of densely stacked rebars is essential in several real-world scenarios. Many deep-learning-based methods, such as Faster R-CNN, Cascade R-CNN, and WSWA-Seg, have been developed and have demonstrated effectiveness. However, several challenges remain in counting densely stacked rebars, particularly those with varied sizes and shapes. Based on the characteristics of rebar sections and the specific case of densely stacked rebars, a novel counting method based on deep learning and intelligent denoising strategies was proposed in this paper. First, an improved Mask R-CNN was applied to obtain initial segmentation results by means of a splitting and rescaling strategy. Then, intelligent denoising strategies, addressing the aspect ratio of each recognized box, overlapping recognized objects, object distribution, and environmental noise, were designed to eliminate background noise, including abnormal shapes, duplicate recognition, and distribution abnormalities. The results show that the proposed method accomplishes the recognition and counting task with F1-scores above 0.99 and outperforms state-of-the-art rebar recognition methods. The proposed method can be applied at building sites and by rebar manufacturers for rapid and accurate rebar counting.

In this paper, fixed splitting and overlapping sizes are used. However, a rebar of a given physical size may have different image sizes in different scenes, so the recognition results might be unstable if the image size of a rebar is particularly small or large. For example, if a rebar is larger than 400 × 400 pixels or smaller than 5 × 5 pixels in the image, it is difficult to identify it accurately using the proposed algorithm with a 384 × 384-pixel split window. To further improve the generalization ability of the algorithm, several splitting and overlapping sizes might be used, with the best result adopted. In addition, due to parallax, some rebars are completely covered by other rebars. For this case, a separate image can be retaken so that the covered rebars become visible, and the recognition results of the original and retaken images can be stitched to obtain the final result. Similarly, in some ultra-wide scenarios, multiple images must be taken, and their recognition results stitched into the final result. These problems will be studied further in future work.

Supporting information

S1 File

(RAR)

Acknowledgments

We thank Fei He and Haizhen Xie for data collection, and Wenyu Fu and Tianxiang He for their helpful discussion. We also thank reviewers for their constructive suggestions and comments, which have provided great help for us to improve the paper.

Data Availability

All relevant data are within the manuscript and its Supporting information files.

Funding Statement

This work was partially supported by the National Natural Science Foundation of China (Grant No. 61673085) and by the Science Strength Promotion Program of UESTC under Grant No. Y03111023901014006. There was no additional external funding received for this study.

References

  • 1. He K, Gkioxari G, Dollár P, Girshick R. Mask R-CNN. In: The IEEE International Conference on Computer Vision; 2017. p. 2980–2988.
  • 2. Offoiach R. Counter and splitter device for bars in layers, particularly for final packing of rolling mill products. European Patent EP19910110809; 1995.
  • 3. Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. In: Proceedings of the 25th International Conference on Neural Information Processing Systems; 2012. p. 1097–1105.
  • 4. Ren S, He K, Girshick R, Sun J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2017;39(6):1137–1149. doi: 10.1109/TPAMI.2016.2577031
  • 5. Girshick R, Donahue J, Darrell T, Malik J. Rich feature hierarchies for accurate object detection and semantic segmentation. In: The IEEE Conference on Computer Vision and Pattern Recognition; 2014. p. 580–587.
  • 6. Girshick R. Fast R-CNN. In: The IEEE International Conference on Computer Vision; 2015. p. 1440–1448.
  • 7. Wan S, Goudos S. Faster R-CNN for multi-class fruit detection using a robotic vision system. Computer Networks. 2020;168:107036. doi: 10.1016/j.comnet.2019.107036
  • 8. Wu Y, Liu W, Wan S. Multiple attention encoded cascade R-CNN for scene text detection. Journal of Visual Communication and Image Representation. 2021;80:103261. doi: 10.1016/j.jvcir.2021.103261
  • 9. Lin TY, Goyal P, Girshick R, He K, Dollár P. Focal loss for dense object detection. In: The IEEE International Conference on Computer Vision; 2017. p. 2999–3007.
  • 10. Redmon J, Farhadi A. YOLO9000: Better, faster, stronger. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition; 2017. p. 6517–6525.
  • 11. Goldman E, Herzig R, Eisenschtat A, Goldberger J, Hassner T. Precise detection in densely packed scenes. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2019. p. 5222–5231.
  • 12. Ding J, Xue N, Long Y, Xia GS, Lu Q. Learning RoI transformer for oriented object detection in aerial images. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2019. p. 2844–2853.
  • 13. Zhu R, Zhang S, Wang X, Wen L, Shi H, Bo L, et al. ScratchDet: Training single-shot object detectors from scratch. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2019. p. 2263–2272.
  • 14. Yang Z, Xu Y, Xue H, Zhang Z, Urtasun R, Wang L, et al. Dense RepPoints: Representing visual objects with dense point sets. In: 16th European Conference on Computer Vision; 2020. p. 227–244.
  • 15. Chu X, Zheng A, Zhang X, Sun J. Detection in crowded scenes: One proposal, multiple predictions. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2020. p. 12214–12223.
  • 16. Pan X, Ren Y, Sheng K, Dong W, Xu C. Dynamic refinement network for oriented and densely packed object detection. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2020. p. 11207–11216.
  • 17. Qiu H, Ma Y, Li Z, Liu S, Sun J. BorderDet: Border feature for dense object detection. In: 16th European Conference on Computer Vision; 2020. p. 549–564.
  • 18. Xie H. Research and application of dense object detection algorithm based on deep learning (in Chinese). Master's Thesis, University of Electronic Science and Technology of China; 2019.
  • 19. Lee JC, Yoo J, Kim Y, Moon S, Ko JH. Robust detection of small and dense objects in images from autonomous aerial vehicles. Electronics Letters. 2021;57(16):611–613. doi: 10.1049/ell2.12245
  • 20. Zhao Y, Li H, Wan S, Sekuboyina A, Hu X, Tetteh G, et al. Knowledge-aided convolutional neural network for small organ segmentation. IEEE Journal of Biomedical and Health Informatics. 2019;23(4):1363–1373. doi: 10.1109/JBHI.2019.2891526
  • 21. Mahajan D, Girshick R, Ramanathan V, He K, Paluri M, Li Y, et al. Exploring the limits of weakly supervised pretraining. In: European Conference on Computer Vision; 2018. p. 185–201.
  • 22. Cai Z, Vasconcelos N. Cascade R-CNN: Delving into high quality object detection. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2018. p. 6154–6162.
  • 23. Cheng Z, Wu Y, Xu Z, Lukasiewicz T, Wang W. Segmentation is all you need. arXiv preprint arXiv:1904.13300; 2019.

Decision Letter 0

Humaira Nisar

28 Mar 2022

PONE-D-21-28703
MaskID: an effective deep-learning-based algorithm for dense rebar counting
PLOS ONE

Dear Dr. Chen,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by May 12 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Humaira Nisar

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at 

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Thank you for stating in your Funding Statement: "This work was partially supported by the National Natural Science Foundation of China (Grant No. 61673085) by the Science Strength Promotion Program of UESTC under Grant No. Y03111023901014006."

Please provide an amended statement that declares *all* the funding or sources of support (whether external or internal to your organization) received during this study, as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now.  Please also include the statement “There was no additional external funding received for this study.” in your updated Funding Statement. 

Please include your amended Funding Statement within your cover letter. We will change the online submission form on your behalf.

3. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability.

Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized.

Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access.

We will update your Data Availability statement to reflect the information you provide in your cover letter.

4. We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections do not match. 

When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section.

5. We note that Figure 5 in your submission contain copyrighted images. All PLOS content is published under the Creative Commons Attribution License (CC BY 4.0), which means that the manuscript, images, and Supporting Information files will be freely available online, and any third party is permitted to access, download, copy, distribute, and use these materials in any way, even commercially, with proper attribution. For more information, see our copyright guidelines: http://journals.plos.org/plosone/s/licenses-and-copyright.

We require you to either (1) present written permission from the copyright holder to publish these figures specifically under the CC BY 4.0 license, or (2) remove the figures from your submission:

a. You may seek permission from the original copyright holder of Figure 5 to publish the content specifically under the CC BY 4.0 license. 

We recommend that you contact the original copyright holder with the Content Permission Form (http://journals.plos.org/plosone/s/file?id=7c09/content-permission-form.pdf) and the following text:

“I request permission for the open-access journal PLOS ONE to publish XXX under the Creative Commons Attribution License (CCAL) CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). Please be aware that this license allows unrestricted use and distribution, even commercially, by third parties. Please reply and provide explicit written permission to publish XXX under a CC BY license and complete the attached form.”

Please upload the completed Content Permission Form or other proof of granted permissions as an "Other" file with your submission. 

In the figure caption of the copyrighted figure, please include the following text: “Reprinted from [ref] under a CC BY license, with permission from [name of publisher], original copyright [original copyright year].”

b. If you are unable to obtain permission from the original copyright holder to publish these figures under the CC BY 4.0 license or if the copyright holder’s requirements are incompatible with the CC BY 4.0 license, please either i) remove the figure or ii) supply a replacement figure that complies with the CC BY 4.0 license. Please check copyright information on all replacement figures and update the figure caption with source information. If applicable, please specify in the figure caption text when a figure is similar but not identical to the original image and is therefore for illustrative purposes only.

6. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information. 


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The manuscript has presented a deep learning-based method for dense rebar counting. The manuscript is well eligible for publication because of the following reasons:

1. The case study and literature are well documented and drafted with a suitable English language.

2. The performance of the proposed method is accurate and outperformed the state-of-the-art methods.

Therefore, I would like to recommend the manuscript for acceptance in its present form.

Reviewer #2: This paper proposes MaskID, an innovative counting method based on deep learning and heuristic strategies. First, a sliced version of the Mask region-based convolutional neural network (R-CNN) was designed to obtain the segmentation results. Their extensive experimental results demonstrated that their approach reaches a high accuracy.

The proposed system is very interesting and quite straight-forward, however, there are still many important issues that should be addressed before final publication.

1. Authors may revise the abstract to elaborate more on problem statement and their findings and contributions.

2. Authors may elaborate more on the novelty of their work. How it contributes to the literature. In page 2, the authors write, "The two main principles of the approach are as follows." I don't see the novelty in the MaskID algorithm used. If no one has proposed before a method like the proposed algorithm, this claim should be highlighted much more. Else, it should be indicated who has done this, and it should be indicated what the innovations of the current paper are. Furthermore, briefly describe the major contributions in bullet form, just before the organization paragraph.

3. Introduction can be improved by having four clear and concise subsections on motivation of your research; background and related works; list of key contributions; and organization of the paper.

4. A comparative literature review may require identifying the problem/ research gap in Section "Materials and Methods". A few of the references are missing some information, may complete them critically. Provided references are better enough. However, authors are recommended to consider more latest and related such as,

https://www.sciencedirect.com/science/article/abs/pii/S1047320321001711

https://www.sciencedirect.com/science/article/abs/pii/S1389128619306978

https://ieeexplore.ieee.org/abstract/document/8606255.

5. What are the shortcomings of the algorithm and what is the room for improvement? I suggest adding a brief description in the conclusion section.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2023 Jan 26;18(1):e0271051. doi: 10.1371/journal.pone.0271051.r002

Author response to Decision Letter 0


13 May 2022

We have revised the manuscript according to the editor's and reviewers' suggestions and comments, and have responded to each of them one by one in the response letter.

Attachment

Submitted filename: Reply-PONE-D-21-28703.docx

Decision Letter 1

Humaira Nisar

30 May 2022

PONE-D-21-28703R1
MaskID: an effective deep-learning-based algorithm for dense rebar counting
PLOS ONE

Dear Dr. Chen,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.


Please submit your revised manuscript by Jul 14 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Humaira Nisar

Academic Editor

PLOS ONE

Additional Editor Comments:

Thank you for revising the manuscript.

I have the following comments:

1) Section- Datasets: As mentioned by the authors 2 datasets have been used, it will be advisable if sample images from the 2 datasets should be shown to get an overview of the type of images. The first dataset has 200 images, but for the 2nd dataset, it is not clear, how many images are there. The results are only tabulated in the form of Table 2, using F1 score. The division of the datasets into training/validation and testing is not given. Nor the training and testing accuracy is mentioned. These parameters should be mentioned in the manuscript.

2) it looks like the combined results from the 2 datasets are given, which means that the 2 datasets were combined for the analysis, or dataset 2 was only used for testing and dataset 1 for training? Need to explain

3) In addition to the F-1 score it is advisable to include the values for precision and recall.

4) The details of the model should also be included.

5) Table 2 compares the results of the proposed algorithm with the state of the art. Do all these methods use the same datasets?

6) The references are a bit confusing: Ref 9 and Ref 21 are the same. It is mentioned as Retina Net, in which a dense detector is designed and named RetinaNet. It is not clear, if the same dataset is used, how to justify the results in Table 2.

7) Similar to the above, Ref 4 and Ref 22 are the same.

8) For Table 2, if the same dataset is not used then the comparisons are not meaningful; it will be better if the authors give a detailed table in which they include the type of dataset, the number of images, and the results to get an overview of what is happening.

There are some spelling and grammatical mistakes that should be corrected. For example, line 33, should be scene.... line 85 novelty----novel.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The updates in the manuscript are satisfactory and the case study is suitable for the publication. I would like to recommend it for acceptance.

Reviewer #2: The revised manuscript is satisfactory and can be beneficial to readers. Therefore, I would like to recommend the publication.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2023 Jan 26;18(1):e0271051. doi: 10.1371/journal.pone.0271051.r004

Author response to Decision Letter 1


17 Jun 2022

Thank you very much for processing our manuscript entitled "MaskID: an effective deep-learning-based algorithm for dense rebar counting". We have considered all the points and revised the manuscript accordingly. Enclosed please find a detailed response. For the sake of convenience, modifications are marked in red in the revised manuscript. Moreover, a clean version of the revised manuscript has also been re-submitted.

Attachment

Submitted filename: response-2nd.docx

Decision Letter 2

Humaira Nisar

23 Jun 2022

MaskID: an effective deep-learning-based algorithm for dense rebar counting

PONE-D-21-28703R2

Dear Dr. Chen,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Humaira Nisar

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Acceptance letter

Humaira Nisar

27 Jun 2022

PONE-D-21-28703R2

MaskID: an effective deep-learning-based algorithm for dense rebar counting

Dear Dr. Chen:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Humaira Nisar

Academic Editor

PLOS ONE
