Scientific Reports. 2024 Oct 23;14:25047. doi: 10.1038/s41598-024-75156-z

Survivor detection approach for post earthquake search and rescue missions based on deep learning inspired algorithms

Rajendrasinh Jadeja 1, Tapankumar Trivedi 1, Jaymit Surve 1
PMCID: PMC11500390  PMID: 39443536

Abstract

Rapid and reliable detection of human survivors trapped under debris is crucial for effective post-earthquake search and rescue (SAR) operations. This paper presents a novel approach to survivor detection using a snake robot equipped with deep learning (DL) based object identification algorithms. We evaluated the performance of three main algorithms: Faster R-CNN, Single Shot MultiBox Detector (SSD), and You Only Look Once (YOLO). While these algorithms are initially trained on the PASCAL VOC 2012 dataset for human identification, we address the lack of a dedicated dataset for trapped individuals by compiling a new dataset of 200 images that specifically depicts this scenario, featuring cluttered environments and occlusion. Our evaluation takes into account detection accuracy, confidence interval, and running time. The results demonstrate that the YOLOv10 algorithm achieves an mAP@0.5 of 98.4% and an accuracy of 98.5% with an inference time of 15 ms. We validate the performance of these algorithms using images of human survivors trapped under debris and subjected to various occlusions.

Keywords: CNN, Deep learning, Human detection, Object detection, Search and rescue

Subject terms: Electrical and electronic engineering, Computer science

Introduction

Natural disasters, particularly earthquakes, have caused significant loss of life worldwide. Earthquakes can have devastating consequences, and aftershocks often lead to further destruction. Recent severe earthquakes have caused many deaths as people became trapped in collapsed buildings1. As the frequency of such catastrophic events and global population density increase, casualties may rise further, so there is an urgent need for advanced disaster recovery technology. Rescuing trapped individuals requires quick and efficient techniques, as their chances of survival drop significantly after the first few days due to factors such as dehydration and heat. Rescue teams therefore need fast tools to detect survivors in such disaster scenarios. Due to limited access and visibility, human-led SAR operations can be time-consuming, with many variables to consider, and prolonged decision-making can be life-threatening for survivors.

In recent decades, various technologies have been developed to support SAR operations in disaster areas2. Rescue robots equipped with thermal imaging cameras, acoustic sensors, and biosensors can detect survivors. In fact, robots have emerged as a promising resource for disaster relief, with potential roles in hazardous environments, augmenting human capabilities, and overcoming the limitations of single robots with multi-robot systems. By integrating intelligent systems and autonomous technologies, disaster relief can be significantly improved.

Object detection algorithms are a crucial component of such systems3 as they enable robots to identify and locate survivors amidst the rubble. Current SAR methods for locating earthquake survivors, such as acoustic detectors, dogs, and cameras, have limitations. There is a need for more effective technologies to locate people trapped under debris. However, challenges still need to be overcome to maximize the effectiveness of robotic technologies in SAR operations. In particular, the complex and dynamic environments in disaster zones, such as unstable structures, rubble, and unpredictable conditions, pose major challenges for autonomous systems.

Snake robots, with their unique ability to navigate complex and narrow spaces, offer a promising solution for survivor detection in post-earthquake scenarios. Their slender, agile bodies allow them to traverse through rubble and debris and reach areas inaccessible to humans and larger robots4,5. Equipped with sensors such as thermal imaging cameras, CO2 detectors and microphones, snake robots can effectively locate survivors trapped in collapsed structures and assess the condition. In addition, the integration of DL algorithms into snake robots can further enhance their capabilities. DL-based computer vision models, trained on datasets of earthquake debris and images of survivors, can enable snake robots to accurately identify the presence of humans and classify their physical condition. This can provide rescue teams with vital information that allows them to prioritize and expedite the recovery of survivors.

The subsequent literature review in Sect. Literature survey addresses the development and applications of snake robots, with a particular focus on how DL techniques can be used to improve their effectiveness in detecting survivors in earthquake SAR missions.

Literature survey

Different types of robots have been developed for SAR missions, each with its own advantages and limitations:

  • Ground Robots: Wheeled and tracked robots are commonly used for navigating relatively open areas within collapsed structures. They can be equipped with sensors such as cameras, thermal imaging cameras, and gas detectors to locate survivors and assess environmental hazards6. However, their mobility is limited in highly unstructured and narrow spaces.

  • Aerial Robots: Unmanned Aerial Vehicles (UAVs) provide aerial perspectives of disaster areas, aiding in damage assessment, survivor identification, and communication relaying7,8. Their ability to quickly cover large areas and reach otherwise inaccessible locations makes them valuable tools in SAR operations. However, their flight time, payload capacity, and vulnerability to adverse weather conditions may pose limitations.

  • Snake Robots: Snake robots excel at navigating confined and irregular spaces, making them ideal for penetrating collapsed structures and reaching trapped survivors. Their slender and articulated bodies enable them to traverse through narrow openings and rubble9.

Previous research has demonstrated the potential of snake robots in post-earthquake SAR scenarios. Snake robots typically consist of several serially connected modules, each with one or more degrees of freedom that enable bending and rotational motions. Researchers have used a variety of actuation mechanisms, such as electric motors, pneumatic actuators, and tendon-driven systems4. The choice of actuation influences the size, weight, power consumption, and force output of the robot.

The locomotion and control of snake robots have been studied extensively to improve their maneuverability and adaptability to complex environments. Snake robots achieve locomotion through the coordination of their modules, which mimics the serpentine movements of biological snakes. Researchers have developed numerous gaits, or movement patterns, inspired by biological observations and mathematical models. These gaits enable snake robots to move across different surfaces, including flat terrain, slopes, stairs, and even water10. Depth cameras and stereo vision techniques provide 3D information about the environment, which is critical for obstacle detection and path planning. Snake robots equipped with depth sensors can create 3D maps of their surroundings and navigate complex terrain9.

By classifying each pixel in an image, semantic segmentation algorithms enable a more comprehensive understanding of the environment. Snake robots can use this information to distinguish between different surfaces, identify obstacles, and plan paths accordingly. While semantic segmentation algorithms can provide a richer understanding of the environment, their practical application to resource-constrained snake robots remains a challenge. The computationally intensive nature of these algorithms can limit their real-time performance, hindering the ability of snake robots to make timely decisions and navigate effectively in dynamic, unstructured environments. Additionally, changing lighting conditions, occlusions, and other environmental factors common in disaster scenarios can affect the reliability of semantic segmentation11. Therefore, snake robots may need to rely on a combination of sensors and algorithms to achieve robust perception and navigation, rather than relying solely on semantic segmentation.

Computer vision empowers snake robots to perceive and map their surroundings, collecting valuable information. DL-based object detection algorithms12 enable snake robots to identify and locate survivors in disaster scenarios or infrastructure components during inspection tasks. While computer vision empowers snake robots to perceive and map their environment, its practical application in SAR missions is limited by several challenges. DL-based object detection and recognition algorithms, while promising, can struggle with the highly unstructured and dynamic nature of disaster scenarios12. The computationally intensive nature of these algorithms can also hinder real-time performance on resource-constrained snake robot platforms, potentially affecting their ability to make timely decisions and navigate effectively. Further research is needed to develop more efficient computer vision algorithms that can operate reliably in the complex and variable conditions of post-earthquake SAR missions.

Simultaneous Localization and Mapping (SLAM) algorithms combine sensor data, often from cameras and inertial measurement units, to create a map of the environment while simultaneously estimating the robot’s pose within that map. Researchers have successfully implemented SLAM on snake robots, enabling them to autonomously explore and map unknown environments13. However, implementation of SLAM algorithms on snake robots for SAR missions remains challenging. While researchers have demonstrated the feasibility of SLAM on snake robots in controlled environments, the highly unstructured and dynamic nature of post-earthquake scenarios poses significant barriers13. The computational requirements of SLAM algorithms can overwhelm the limited computing power of small snake robots and impact their ability to navigate and map in real-time. Additionally, the presence of debris, dust, and poor lighting conditions common at disaster sites can degrade the performance of vision-based SLAM and lead to inaccurate localization and mapping results. Further advancements in computationally efficient SLAM techniques and multi-sensor fusion are required to reliably deploy snake robots equipped with SLAM capabilities in the complex and unpredictable environments of post-earthquake SAR operations.

The authors conducted a bibliometric analysis of snake robots and their reported applications, using VOSviewer software on publications from various reputed publishers indexed in the SCOPUS database. According to the bibliometric analysis in Fig. 1, most research focuses on locomotion, environment detection, and trajectory tracking, while less research is conducted on the use of snake robots to identify survivors.

Fig. 1.

Bibliometric analysis of snake robotics (Generated using VOSviewer software).

In recent years, DL-inspired algorithms have emerged as a promising approach to address the challenges of survivor detection in post-earthquake SAR missions. DL models trained on large datasets of images and sensor data have exhibited the ability to effectively detect and locate survivors amid the debris of disaster zones. These models leverage the powerful feature extraction and pattern recognition capabilities of deep neural networks to overcome the limitations of traditional computer vision techniques, which can struggle with the highly cluttered and occluded environments encountered in post-earthquake scenarios. A key advantage of DL-inspired survivor detection is the ability to generalize to unseen situations, allowing the algorithms to adapt to the dynamic and unpredictable nature of disaster sites14–17.

However, the practical implementation of DL-based survivor detection on snake robots poses several challenges. The computationally intensive nature of DL algorithms18 can strain the limited processing capabilities of small snake robots and potentially impact their ability to operate in real time. Furthermore, training DL models requires large, high-quality datasets that can be difficult to obtain, especially for the unique and rapidly evolving conditions in post-earthquake scenarios.

Researchers have explored strategies to address these challenges, such as developing lightweight DL architectures, using transfer learning to reduce the need for large-scale training data, and deploying edge computing solutions to offload processing to more powerful remote servers. By integrating these advances, snake robots equipped with DL-inspired survivor detection algorithms can potentially play a crucial role in enhancing the efficiency and effectiveness of post-earthquake SAR operations.

In this article, we examine the state-of-the-art DL-inspired algorithms for survivor detection, focusing on their potential application in snake robot-based SAR missions. The DL-inspired algorithms for survivor detection primarily employ two-stage detectors or single-stage detectors. Two-stage detectors, such as R-CNN, Fast R-CNN, and Faster R-CNN, first generate region proposals and then classify these proposals into different object categories16. Conversely, single-stage detectors such as YOLO and SSD directly predict the bounding boxes and class probabilities in a single step. Recent studies have demonstrated that single-stage detectors like YOLO can achieve comparable accuracy to two-stage detectors while being significantly faster, making them more suitable for real-time applications on resource-constrained platforms such as snake robots. This work explores the performance of various DL-inspired algorithms, including Faster R-CNN, SSD300, SSD512, YOLOv8, and YOLOv10, for the task of survivor detection. Table 1 (refs. 15–17, 19) shows the comparison of various algorithms in the context of the proposed study.

Table 1.

Comparison of various deep learning inspired algorithms for SAR Operation.

Feature | Faster R-CNN | SSD300 | SSD512 | YOLOv8 | YOLOv10
Architecture | Two-stage | Single-shot, multi-scale feature maps | Single-shot, multi-scale feature maps | Single-shot, grid-based detection | Single-shot, grid-based detection with enhancements
Model size | Large | Medium | Large | Medium | Medium to Large
Input image size (pixels) | Variable | 300 × 300 × 3 | 512 × 512 × 3 | 640 × 640 × 3 | 640 × 640 × 3
Accuracy | Very high | High | Higher than SSD300 | Excellent | Excellent, better than YOLOv8
Handling occlusion | Good | Moderate | Moderate | Very Good | Excellent
GPU utilization | High, due to complex architecture | Efficient, real-time | Higher, needs more memory | Highly efficient | Highly efficient
Use in real-time systems | Not suitable for real-time | Suitable for real-time systems | Suitable | Suitable, requires higher resources | Suitable, requires higher resources
Power efficiency | Low | Moderate to High | Moderate | Very High | Very High

A major hurdle in the application of DL for survivor detection is the lack of realistic datasets. Real-world disaster data, particularly images and sensor readings that capture the presence of survivors, is limited and sensitive, making ethical and responsible data acquisition difficult. Furthermore, the high variability of disaster scenarios, ranging from earthquakes and floods to building collapses, necessitates datasets that cover diverse environments, lighting conditions, and survivor appearances to ensure generalization of the model.

This article directly addresses this challenge by introducing a new test dataset of 200 images that enables the development and evaluation of more robust and reliable DL models for survivor detection. The test dataset contains images of people above and below debris with cluttered backgrounds, people trapped under debris, and survivors with occlusion. The performance of the above algorithms has been tested on this newly developed dataset. In addition, the paper proposes a novel SAR mechanism for effective SAR operations.

The manuscript is organized as follows: Sect. Overview of proposed SAR Mechanism details the proposed SAR approach and focuses on the integration of a snake robot platform with deep learning-based object detection algorithms. Section Algorithms, datasets, and evaluation metrics details the methodology, providing a description of the selected algorithms, the datasets used for training and testing (including the dataset for trapped individuals), and the evaluation metrics used. Section Results and Discussion presents a comparative analysis of the results of the different algorithms and evaluates their performance in terms of mAP, accuracy, and running time. Section Developed Prototype of 3D printed Snake Robot presents the snake robot prototype, illustrating its key features and capabilities. Finally, Sect. Conclusion concludes the manuscript by summarizing the findings and outlining future work in the area.

Overview of proposed SAR mechanism

This paper proposes a novel SAR mechanism that leverages the agility of snake robots, the power of DL, and real-time data transmission to enhance survivor detection and rescue operations in earthquake-affected zones. Figure 2 illustrates the proposed system architecture.

Fig. 2.

Proposed SAR Operation mechanism with snake robot.

The key components of the proposed system are:

  1. A snake robot, equipped with a high-resolution camera, a sensitive microphone, and a wireless transmitter, is deployed into the disaster-stricken area. The robot’s slender and flexible body allows it to navigate through narrow spaces and rubble, accessing areas inaccessible to humans or larger robots.

  2. As the snake robot traverses the environment, it continuously captures visual and auditory data using its onboard camera and microphone. This data is transmitted in real-time to a remote-control centre via the wireless transmitter.

  3. At the control centre, the received images are analysed by a pre-trained DL model specifically designed for survivor detection. The model processes the images, identifying potential human survivors within the debris and rubble based on learned features and patterns.

  4. Upon successful survivor detection, the DL model provides information about the survivor’s location within the robot’s field of view. This information, along with the GPS coordinates of the robot, is relayed to a nearby rescue team. The rescue team uses this location data to navigate efficiently to the survivor and carry out rescue operations.

This approach not only conserves power but also extends the battery life of the robot. By offloading intensive processing tasks to a more powerful and energy-efficient platform, we ensure timely and accurate detection while maintaining the operational efficiency of the snake robot.
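As a rough illustration of step 4, the information relayed to the rescue team could be packaged as a small structured message. The sketch below is only an assumption about what such a message might contain: the class name `SurvivorReport`, the helper `build_report`, the field names, and the coordinate values are all hypothetical and do not come from the proposed system.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class SurvivorReport:
    """Hypothetical message sent from the control centre to the rescue team."""
    frame_id: int      # index of the camera frame in which the survivor was seen
    robot_lat: float   # GPS latitude of the snake robot
    robot_lon: float   # GPS longitude of the snake robot
    box: tuple         # (x_min, y_min, x_max, y_max) in image pixels
    score: float       # detector confidence in [0, 1]


def build_report(frame_id, gps_fix, box, score):
    """Serialize one detection and the robot's GPS fix as a JSON string."""
    lat, lon = gps_fix
    report = SurvivorReport(frame_id, lat, lon, tuple(box), score)
    return json.dumps(asdict(report))


# Hypothetical values, used only to show the message layout
message = build_report(frame_id=42, gps_fix=(22.30, 70.80),
                       box=(120, 80, 260, 310), score=0.91)
print(message)
```

Keeping the detection box and the robot position in one message means the rescue team receives both the robot's location and where in its field of view the survivor appears.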

All communication between the snake robot, the operator, and the rescue station takes place in a secure environment so that it is protected from cyber-attacks19. The proposed SAR mechanism offers several advantages, such as real-time detection, automated analysis using DL, and precise location of the trapped survivor. The performance of the mechanism depends largely on the detection of the survivor at the control centre. The next section discusses the algorithms and their architectures in the context of the present study.

Algorithms, datasets, and evaluation metrics

The block diagram in Fig. 3 describes the methodology used in this study to evaluate the performance of various object detection models for SAR operations.

Fig. 3.

Methodology for training and testing of proposed rescue support mechanism.

The PASCAL VOC 2012 dataset is used for training and validation. The dataset is divided into training and validation subsets to ensure that the models are well-tuned before testing: 80% of the images are used for training and 20% for validation. Several object detection models, including SSD300, SSD512, Faster R-CNN, YOLOv8, and YOLOv10, are trained using this dataset. The PASCAL VOC 2012 dataset overrepresents the person/human class, resulting in a dataset biased towards human detection. In the context of the present study, this class imbalance20 has proven to be advantageous. The focus of the research is on detecting human bodies, including scenarios involving occlusion. The overrepresentation of the human class enhances the models’ ability to accurately detect humans even under challenging conditions, thereby contributing positively to the objectives of this study.
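The 80/20 division can be sketched as follows. The text does not state how images were assigned to each subset, so the seeded random shuffle here is an assumption, and the image identifiers are hypothetical placeholders for the 11,530 training/validation images of PASCAL VOC 2012.

```python
import random


def split_train_val(image_ids, val_fraction=0.2, seed=0):
    """Shuffle image IDs reproducibly and split them into train and validation lists."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)            # seeded so the split is repeatable
    n_val = int(round(len(ids) * val_fraction))
    return ids[n_val:], ids[:n_val]


# Hypothetical identifiers standing in for the real file names
image_ids = [f"img_{i:05d}" for i in range(11530)]
train_ids, val_ids = split_train_val(image_ids)
print(len(train_ids), len(val_ids))  # 9224 2306
```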

After training, the models are tested on a well-developed set of images that depict trapped humans under debris. These test images are crucial for simulating real-world SAR scenarios where detection accuracy is important. The performance of each model is then evaluated based on various parameters such as precision, recall, and other relevant metrics.

A. Algorithms

i. Faster Region-based Convolutional Neural Network (Faster R-CNN) algorithm

Faster R-CNN is an object detection architecture that streamlines detection with a Region Proposal Network (RPN). It works in two stages. In the first stage, the RPN, a fully convolutional network, takes the input image and generates a set of rectangular object proposals, each with an objectness score. The RPN shares convolutional layers with the detection network, enabling efficient proposal generation. In the second stage, the proposed regions from the RPN are fed into the Fast R-CNN detection module. This module uses RoI pooling to extract fixed-size feature maps for each proposal, which are then used for object classification and bounding box refinement. The key feature of the algorithm lies in the shared convolutional layers and the integrated RPN, which enable near real-time performance without sacrificing accuracy. This end-to-end trainable architecture marked a significant step toward more efficient and accurate object detection systems. On the PASCAL VOC 2012 dataset, the algorithm runs at up to 5 fps on a high-performance GPU.

In the present work, the model uses ResNet-50 as the backbone, which is further improved by a Feature Pyramid Network. The model has 41,375,941 parameters, of which 41,322,885 are trainable and 53,056 are non-trainable; the latter are associated with batch normalization layers. Owing to this architecture, the model balances speed and accuracy, enabling near real-time object detection.

ii. SSD algorithm

SSD features an elegant architecture designed for efficient and accurate object detection. It starts with a base network (like VGG-16), responsible for extracting high-level feature maps from the input image. SSD then adds auxiliary convolutional layers on top of this base network, producing a set of multi-scale feature maps. These maps capture object representations at different resolutions, enabling the detection of objects at various sizes. For each location on these feature maps, SSD deploys a set of pre-defined default boxes, each with specific aspect ratios and scales. The network predicts offsets from these default boxes to refine object locations and assigns confidence scores for different object categories. This process occurs simultaneously across all feature maps, allowing for efficient single-stage detection.
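The default box mechanism can be illustrated with a short sketch. It places boxes at the centre of every cell of one square feature map, with width and height derived from a scale and an aspect ratio, in normalized image coordinates. The function name `default_boxes`, the 3 × 3 feature map, the scale of 0.2, and the three aspect ratios are illustrative assumptions, not the settings used in this work.

```python
import math


def default_boxes(feature_map_size, scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Return (cx, cy, w, h) default boxes for one square feature map.

    Coordinates are normalized to [0, 1] relative to the input image.
    """
    boxes = []
    for row in range(feature_map_size):
        for col in range(feature_map_size):
            # Centre of the current feature-map cell
            cx = (col + 0.5) / feature_map_size
            cy = (row + 0.5) / feature_map_size
            for ratio in aspect_ratios:
                width = scale * math.sqrt(ratio)
                height = scale / math.sqrt(ratio)
                boxes.append((cx, cy, width, height))
    return boxes


boxes = default_boxes(feature_map_size=3, scale=0.2)
print(len(boxes))  # 27 = 3 x 3 cells x 3 aspect ratios
```

The network then predicts, for each of these boxes, four offsets and one confidence score per class, so that the same procedure repeated on coarser and finer feature maps covers large and small objects.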

Finally, non-maximum suppression is applied to the predicted bounding boxes to eliminate redundant detections, resulting in a final set of detected objects with their corresponding class labels and locations. This streamlined architecture, combined with the multi-scale feature representation and default box mechanism, enables SSD to achieve a balance between speed and accuracy in object detection tasks.
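A minimal sketch of greedy non-maximum suppression is given below. It is a generic formulation, not the exact implementation used in this study; the IoU threshold of 0.5 and the example boxes and scores are assumptions chosen for illustration.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(box_iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


# Two near-duplicate detections of one person and one separate detection
boxes = [(10, 10, 110, 210), (12, 8, 112, 205), (300, 50, 380, 220)]
scores = [0.92, 0.85, 0.70]
kept = non_max_suppression(boxes, scores)
print(kept)  # [0, 2]
```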

In the context of real-time post-earthquake SAR missions, the SSD algorithm balances high detection accuracy and computational efficiency through its single-shot processing approach, allowing simultaneous prediction of bounding boxes and class scores in one pass. This design choice significantly reduces inference time compared to models such as Faster R-CNN, which require multiple stages of processing. In addition, SSD’s use of multi-scale feature maps ensures that objects of various sizes are accurately detected, improving its applicability in SAR scenarios.

We have employed SSD300 and SSD512 models, which use image input resolutions of 300 × 300 and 512 × 512. In the present work, the model utilizes modified VGG16 as the base network, where fully connected layers are replaced by convolution layers. The SSD300 model has 26,285,486 trainable parameters. This relatively lightweight architecture is capable of performing real time detection on most hardware platforms. On the other hand, SSD512 has more convolution layers and higher dimension of feature extraction layers compared to SSD300. Thus, the resultant model has a deeper and more complex network, which improves the performance. The model has 27,189,028 trainable parameters, improving its detection capabilities with marginal increase in the computational demand.

iii. YOLO algorithm

YOLOv8 is the most widely adopted version of the YOLO object detection family, known for its trade-off between speed and accuracy. YOLOv8 introduces architectural and functional changes. A key difference from predecessors such as YOLOv5 is the shift to an anchor-free detection mechanism. Instead of predicting bounding boxes based on pre-defined anchors, YOLOv8 directly predicts the center of the object. This simplifies the process, reduces the number of predictions, and speeds up post-processing. Additionally, YOLOv8 consolidates the multiple output heads found in previous versions into a single output head, which further adds to its efficiency. While YOLOv8 has better performance, its inference speed depends on the available computing power because of its number of parameters. In the present work, the YOLOv8 model has 3,152,856 trainable parameters.

In contrast, YOLOv10 uses consistent dual assignments for NMS-free training, which reduces end-to-end latency21. It also has an efficiency-driven model design that optimizes both speed and accuracy. In addition, YOLOv10 has an advanced backbone network, an enhanced Feature Pyramid Network, and improved anchor boxes, which contribute to more accurate object localization and fewer false positives. In the present work, the YOLOv10 model has 23,258,208 parameters. The depth of the model has resulted in improved accuracy of the algorithm.

B. Dataset

Training dataset

The PASCAL VOC2012 dataset is a widely used benchmark for object detection algorithms. It contains images from 20 object categories, including people, animals, vehicles, and household items. Researchers have used this dataset, with its standardized images and annotations, to evaluate and compare the performance of their object detection models. The dataset has played a significant role in advancing object detection research, even though it is considered smaller compared to more modern datasets like MS-COCO. This dataset contains the training/validation data of 11,530 images containing 27,450 ROI annotated objects and 6929 segmentations.

Test set

Images of humans trapped in earthquake-like situations are difficult to collect; we gathered 200 such images to develop our test dataset of earthquake survivors. The test dataset contains images of people above and below debris with cluttered backgrounds, and it includes people with occlusion. Figure 4 depicts some of the images used in testing the algorithms. These distinct images are selected based on the type of background, the debris around the object, the position of the object, the brightness of the picture, and occlusion. As images of people trapped under debris might be sensitive for readers, such images are excluded from the reporting of the results.

Fig. 4.

Example of test images on which the proposed deep learning architectures are tested.

C. Evaluation metrics

The mean average precision (mAP) is used to evaluate the object detection algorithms. It quantifies the similarity between the ground-truth bounding box and the detected box as a numerical score; higher scores indicate more accurate detections. Some of the important evaluation metrics are shown in Table 2.

Table 2.

Performance parameter of algorithms.

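Sr. No. | Evaluation Metric | Definition
1 | Intersection over Union | $\mathrm{IoU} = \dfrac{\mathrm{area}(A \cap B)}{\mathrm{area}(A \cup B)}$
2 | Precision | $\mathrm{Precision} = \dfrac{TP}{TP + FP}$
3 | Recall | $\mathrm{Recall} = \dfrac{TP}{TP + FN}$
4 | Average Precision | $\mathrm{AP} = \displaystyle\int_{0}^{1} p(r)\,dr$
5 | mAP | $\mathrm{mAP} = \dfrac{1}{N}\displaystyle\sum_{i=1}^{N} \mathrm{AP}_i$

Where A = predicted bounding box, B = actual bounding box, TP = True Positive, FP = False Positive, FN = False Negative, p(r) = precision as a function of recall r, N = number of object classes, and AP_i = average precision of class i.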
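The first three metrics in Table 2 can be computed directly from box coordinates and detection counts. The sketch below assumes boxes in (x_min, y_min, x_max, y_max) format; the function names, the example boxes, and the TP/FP/FN counts are hypothetical values chosen only to show the arithmetic.

```python
def intersection_over_union(box_a, box_b):
    """IoU between predicted box A and actual box B, each (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); Recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


predicted = (50, 50, 150, 150)
actual = (100, 100, 200, 200)
iou = intersection_over_union(predicted, actual)
print(round(iou, 4))  # 0.1429, i.e. 2500 / 17500

precision, recall = precision_recall(tp=8, fp=2, fn=2)  # hypothetical counts
print(precision, recall)  # 0.8 0.8
```

With mAP@0.5, a detection counts as a true positive only when its IoU with a ground-truth box is at least 0.5, so the example box above would be scored as a false positive.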

It is important to note that these methods are used only to detect humans. Because there is a single object class, the average precision and the mAP score are identical for each of the SSD300, SSD512, Faster R-CNN, YOLOv8, and YOLOv10 algorithms.

Results and discussion

All algorithms are trained and validated on the PASCAL VOC 2012 dataset and then tested on the newly created earthquake survivor dataset. The system used in the present work is the PARAM SHAVAK supercomputer with an Intel® Xeon® Gold processor with 40 cores and 96 GB of RAM. The graphics card is an NVIDIA GP100 with 16 GB of memory and 3584 CUDA cores.

The precision vs. recall curve in Fig. 5 is plotted during training using the matplotlib library in Python. The precision-recall curve illustrates the trade-off between precision and recall for all algorithms, i.e., YOLOv8, YOLOv10, SSD300, SSD512, and Faster R-CNN. YOLOv10 demonstrates the highest area under the curve (AUC) at 0.920, indicating superior performance in maintaining high precision across a wide range of recall values. This makes YOLOv10 a highly reliable model in scenarios where both false positives and false negatives must be minimized, such as in SAR operations. YOLOv8 follows with an AUC of 0.895, showing a balanced trade-off between precision and recall, which makes it a viable alternative when computational efficiency is a priority.

Fig. 5.

Recall vs. Precision curve of algorithms.

On the other hand, SSD512 and SSD300 show a moderate decrease in performance, with AUCs of 0.862 and 0.817 respectively, particularly in regions of higher recall, where precision drops more steeply. This can lead to more false positives, which could be problematic in critical situations such as locating victims under debris and occlusions. The Faster R-CNN model, with an AUC of 0.878, performs better than the SSD variants but still falls short of YOLOv8 and YOLOv10, indicating that it might not be as robust for real-time applications despite its strong precision at lower recall values. Overall, the precision-recall analysis highlights YOLOv10 as the most effective model for SAR operations, closely followed by YOLOv8, while the SSD variants and Faster R-CNN may be more suitable for applications where precision is less critical or computational resources are limited.
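The area under a precision-recall curve can be approximated numerically from sampled (recall, precision) points. The sketch below uses the plain trapezoidal rule, which is an assumption: the text does not state which integration or interpolation scheme produced the reported AUC values. The three sample points are hypothetical and are not taken from Fig. 5.

```python
def area_under_pr_curve(recalls, precisions):
    """Trapezoidal-rule area under a precision-recall curve."""
    points = sorted(zip(recalls, precisions))   # integrate in order of increasing recall
    area = 0.0
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        area += (r1 - r0) * (p0 + p1) / 2.0
    return area


# Hypothetical curve: precision falls as recall rises
recalls = [0.0, 0.5, 1.0]
precisions = [1.0, 0.9, 0.6]
auc = area_under_pr_curve(recalls, precisions)
print(round(auc, 3))  # 0.85
```

A curve that holds precision near 1.0 until high recall encloses more area, which is why a steeper drop at high recall, as described for the SSD variants, lowers the AUC.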

The IoU vs. Recall curve of Fig. 6 illustrates the performance of the algorithms in object detection. YOLOv10 demonstrates consistently high recall across most IoU thresholds, indicating that the algorithm has robust performance. YOLOv8 performs closely behind YOLOv10 at higher IoU values but shows more variation at lower IoU values. SSD512 and Faster R-CNN exhibit similar performance, maintaining high recall across a wide range of IoU thresholds. However, there is a significant drop in recall at higher IoU thresholds. SSD300 lags in performance compared to the other models, and has lower recall, especially as the IoU threshold increases. The analysis suggests that YOLOv10 is more reliable for high-accuracy survivor detection, while YOLOv8 performs moderately well. On the other hand, Faster R-CNN and SSD512 offer competitive performance, but their overall performance is inferior to that of the YOLO models. SSD300 is not an ideal candidate for survivor detection due to its lower recall at higher thresholds.
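One way to obtain a recall value at each IoU threshold is to record, for every ground-truth survivor, the best IoU achieved by any detection and then count how many exceed the threshold. This is a sketch of that idea under the stated assumption; the function name `recall_at_iou` and the list of IoU values are hypothetical, and the procedure used to produce Fig. 6 is not described in the text.

```python
def recall_at_iou(best_ious, threshold):
    """Fraction of ground-truth boxes whose best-matching detection reaches the IoU threshold."""
    if not best_ious:
        return 0.0
    matched = sum(1 for value in best_ious if value >= threshold)
    return matched / len(best_ious)


# Hypothetical best IoU per ground-truth survivor (0.0 means the survivor was missed)
best_ious = [0.91, 0.78, 0.66, 0.52, 0.0]
curve = {t: recall_at_iou(best_ious, t) for t in (0.5, 0.7, 0.9)}
print(curve)  # {0.5: 0.8, 0.7: 0.4, 0.9: 0.2}
```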

Fig. 6.

IoU vs. Recall plot of algorithms.

A confusion matrix was generated for each algorithm, and accuracy and error rate were derived from it. As evident from Fig. 7, YOLOv10 exhibits the highest accuracy (98.5%) and the lowest error rate among all the evaluated algorithms, which suggests that it is the most accurate at detecting survivors under occlusion. Faster R-CNN follows with 95% accuracy. SSD512, with an accuracy of 93.5%, outperforms its counterpart SSD300, which reaches 91%. YOLOv8 achieves a respectable accuracy of 93% but falls short of the more advanced YOLOv10, showing inferior performance in handling occlusion.
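For reference, accuracy and error rate follow from the four cells of a binary confusion matrix as sketched below. The counts in the example are hypothetical and are not the values behind Fig. 7.

```python
def accuracy_and_error_rate(tp, fp, fn, tn):
    """Return (accuracy, error_rate) from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total      # correctly classified samples
    error_rate = (fp + fn) / total    # misclassified samples
    return accuracy, error_rate


# Hypothetical counts for illustration only.
acc, err = accuracy_and_error_rate(tp=40, fp=5, fn=5, tn=50)
print(f"accuracy = {acc:.3f}, error rate = {err:.3f}")
```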

Fig. 7.

Confusion matrix of various methods for identification of survivor.

The performance of the algorithms, evaluated on GPU inference time and accuracy, is reported in Fig. 8. The inference times of the YOLOv8 and YOLOv10 algorithms are 12 ms and 15 ms respectively, with accuracies of 93% and 98.5%. SSD300 and SSD512, which are known for their applicability in resource-constrained environments, show inference times of 42 ms and 61 ms respectively. For a snake robot requiring on-board computation, SSD512 can prove useful. Faster R-CNN, which is known for its accuracy, has the highest inference time of 78 ms, making it less suitable for real-time application despite its high accuracy.
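The reported latencies can be read as frame rates by taking 1000 divided by the latency in milliseconds. The sketch below performs that conversion for the figures quoted above and includes a generic wall-clock timing helper. The helper is an assumption about how such a measurement might be made, not the authors' benchmarking procedure; on a GPU, the device would typically need to be synchronized before reading the clock.

```python
import time


def mean_latency_ms(infer, inputs, warmup=2):
    """Average wall-clock time per call of `infer`, in milliseconds."""
    for x in inputs[:warmup]:          # warm-up calls are not timed
        infer(x)
    start = time.perf_counter()
    for x in inputs:
        infer(x)
    return (time.perf_counter() - start) * 1000.0 / len(inputs)


def frames_per_second(latency_ms):
    """Frame rate implied by a per-frame latency."""
    return 1000.0 / latency_ms


# Inference times as reported in the text (ms).
REPORTED_LATENCY_MS = {
    "YOLOv8": 12,
    "YOLOv10": 15,
    "SSD300": 42,
    "SSD512": 61,
    "Faster R-CNN": 78,
}

for name, ms in REPORTED_LATENCY_MS.items():
    print(f"{name}: {ms} ms per frame, about {frames_per_second(ms):.1f} FPS")
```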

Fig. 8.

Inference Time vs. Accuracy of DL algorithms in survivor detection.

Each output box is associated with a category label and a softmax score in the interval [0, 1]. The images shown here are obtained by applying a score threshold of 0.6. The Faster R-CNN algorithm achieves 97.2% mAP@0.5, and the execution time required is indicated in Fig. 9. The results for the SSD300 and SSD512 algorithms are shown in Figs. 10 and 11 respectively. The SSD300 algorithm achieves 90.5% mAP@0.5, whereas the SSD512 algorithm achieves 94.1% mAP@0.5. While SSD512 is more accurate, its inference time is considerably higher than that of SSD300.
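The score thresholding step can be sketched as follows, assuming each detection carries a label, a score, and a box. The `Detection` type and the sample values are hypothetical; only the 0.6 threshold comes from the text.

```python
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    label: str
    score: float                              # softmax score in [0, 1]
    box: Tuple[float, float, float, float]    # (x_min, y_min, x_max, y_max)


def filter_by_score(detections: List[Detection], threshold: float = 0.6) -> List[Detection]:
    """Keep detections whose score is at least `threshold`, highest score first."""
    kept = [d for d in detections if d.score >= threshold]
    return sorted(kept, key=lambda d: d.score, reverse=True)


# Hypothetical detector output for one image.
raw = [
    Detection("person", 0.55, (12, 30, 80, 140)),
    Detection("person", 0.91, (100, 40, 180, 200)),
    Detection("person", 0.60, (5, 5, 40, 60)),
]
for det in filter_by_score(raw):
    print(det.label, det.score, det.box)
```

A lower threshold would display more boxes at the cost of more false positives, so the choice of 0.6 trades missed survivors against false alarms.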

Fig. 9.

Selected instances of object detection outcomes achieved with the Faster RCNN algorithm.

Fig. 10.

Selected instances of object detection outcomes achieved with the SSD300 algorithm.

Fig. 11.

Selected instances of object detection outcomes achieved with the SSD512 algorithm.

Figures 12 and 13 depict the performance of the YOLOv8 and YOLOv10 algorithms. With 98.4% mAP@0.5, YOLOv10 outperforms its predecessor YOLOv8, which scores 92.8% mAP@0.5. This indicates that, despite the promising performance of YOLOv10 in many applications, there is still scope for optimizing the algorithm for survivor detection.

Fig. 12.

Selected instances of object detection outcomes achieved with the YOLOv8 algorithm.

Fig. 13.

Selected instances of object detection outcomes achieved with the YOLOv10 algorithm.

Developed prototype of 3D printed snake robot

The developed 3D printed prototype of the snake robot is depicted in Fig. 14, which presents its side and top views. The robot comprises seven modules mounted in vertical and horizontal positions to attain snake locomotion gaits including rectilinear, sidewinding, and concertina. Each module consists of an MG996R servo motor whose cap is modified and printed with a 3D printer. An Arduino Nano is employed to control the snake robot. To obtain the snake locomotion, a sinusoidal signal is applied to every alternate module, and the remaining modules receive a sinusoidal signal with a phase shift of 90°. Wireless SMRS cameras are attached to the front and rear modules to detect humans trapped under debris in earthquake situations and to support motion planning for obstacle avoidance at both ends. For more precise motion planning, two more cameras can be added on the right and left sides of the snake robot. The robot currently consists of seven modules and can be extended in the future as required by the SAR operation.
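The alternating sinusoidal drive described above can be sketched as a function that returns one servo angle per module at time t. Only the seven modules and the 90° phase shift come from the text. The amplitude, frequency, and 90° center position are assumed values chosen for illustration, and the robot's actual firmware runs on an Arduino Nano rather than in Python.

```python
import math

NUM_MODULES = 7  # number of modules stated in the text


def module_angles(t, amplitude_deg=30.0, frequency_hz=0.5,
                  center_deg=90.0, phase_shift_deg=90.0):
    """Servo angle (degrees) for each module at time t (seconds).

    Even-indexed modules follow a plain sinusoid; odd-indexed modules
    follow the same sinusoid shifted by `phase_shift_deg`.
    Amplitude, frequency and center are assumed values.
    """
    omega = 2.0 * math.pi * frequency_hz
    shift = math.radians(phase_shift_deg)
    angles = []
    for m in range(NUM_MODULES):
        phase = shift if m % 2 == 1 else 0.0
        angles.append(center_deg + amplitude_deg * math.sin(omega * t + phase))
    return angles


# Print the commanded angles for the first second, sampled every 0.25 s.
for step in range(5):
    t = step * 0.25
    print(t, [round(a, 1) for a in module_angles(t)])
```

With these assumed parameters, every commanded angle stays between 60° and 120°, well inside the range of a typical hobby servo.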

Fig. 14.

(a) Side view and (b) top view of the developed snake robot.

Conclusion

In this work, we have evaluated various deep learning algorithms for SAR operations in post-earthquake missions, using the PASCAL VOC 2012 dataset and a newly developed earthquake survivor dataset. The algorithms are compared in terms of precision-recall curves, inference times, and precision, and the strengths and limitations of each model are highlighted. YOLOv10 consistently demonstrated superior performance, achieving the highest accuracy of 98.5% and an mAP of 98.4%, which indicates robust and reliable identification of survivors even under occlusion. While YOLOv8 exhibited lower accuracy, its faster inference time is advantageous when computational efficiency is important. Although the SSD300 and SSD512 algorithms offer advantages in resource-constrained environments, their lower accuracy makes them less suitable for survivor detection. Faster R-CNN, despite its high accuracy of 95%, suffers from a significantly larger inference time (78 ms), which hinders its real-time applicability. Hence, YOLOv10 emerges as the most effective model for post-earthquake survivor detection, balancing high accuracy with acceptable inference time. Future research could explore optimizing the YOLOv10 architecture for further speed enhancements without compromising its detection capabilities.

Future work

Future work will focus on three updates to the algorithms and the snake robot mechanism: (a) upgrading the algorithm to assess the health condition of the survivor; (b) testing the 3D printed snake robot under debris; and (c) implementing a hybrid mechanism combining the snake robot and an Unmanned Aerial Vehicle (UAV) to find the optimal path for the snake robot.

Acknowledgements

This work is supported by the Gujarat Council on Science and Technology, Department of Science and Technology, Govt. of Gujarat, under the Science Technology and Innovation Policy. Grant No: [GUJCOST/STI/2020-21/2271].

Author contributions

Conceptualization, R.J.; Software, R.J., T.T. and J.S.; Data curation, T.T. and J.S.; Formal analysis, R.J., T.T. and J.S.; Funding acquisition, R.J. and T.T.; Investigation, R.J., T.T. and J.S.; Methodology, R.J., T.T. and J.S.; Project administration, R.J.; Supervision, R.J. and T.T.; Validation, R.J., T.T. and J.S.; Writing – original draft, R.J., T.T. and J.S.; Writing – review and editing, R.J., T.T. and J.S.

Data availability

The data will be made available on reasonable request to the corresponding author.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Ritchie H, Rosado P, Roser M. Natural Disasters. Our World in Data (2022)
  • 2. Konyo M, Ambe Y, Nagano H, Yamauchi Y, Tadokoro S, Bando Y, et al. ImPACT-TRC Thin Serpentine Robot platform for urban search and rescue. In: Tadokoro S, Editor. Disaster Robotics: Results from the ImPACT Tough Robotics Challenge. (Springer International Publishing, Cham, 2019), p 25–76. doi:10.1007/978-3-030-05321-5_2
  • 3. Verschae R, Ruiz-del-Solar J. Object detection: current and future directions. Frontiers in Robotics and AI (2015); 2: 1–7. doi:10.3389/frobt.2015.00029
  • 4. Seeja G, Doss ASA, Hency VB. A Survey on Snake Robot Locomotion. IEEE Access (2022); 10: 112100–16. doi:10.1109/ACCESS.2022.3215162
  • 5. Pettersen KY. Snake robots. Annual Reviews in Control (2017); 44: 19–44. doi:10.1016/j.arcontrol.2017.09.006
  • 6. Li F, Hou S, Bu C, Qu B. Rescue Robots for the Urban Earthquake Environment. Disaster Medicine and Public Health Preparedness (2023); 17. doi:10.1017/dmp.2022.98
  • 7. Dong J, Ota K, Dong M. UAV-Based real-time survivor detection system in Post-disaster Search and Rescue operations. IEEE Journal on Miniaturization for Air and Space Systems (2021); 2: 209–19. doi:10.1109/jmass.2021.3083659
  • 8. Shakhatreh H, Khreishah A, Ji B. UAVs to the rescue: prolonging the lifetime of Wireless devices under Disaster situations. IEEE Transactions on Green Communications and Networking (2019); 3: 942–54. doi:10.1109/TGCN.2019.2930642
  • 9. Liu J, Tong Y, Liu J. Review of snake robots in constrained environments. Robotics and Autonomous Systems (2021); 141: 103785. doi:10.1016/j.robot.2021.103785
  • 10. Hirose S, Yamada H. Snake-like robots. IEEE Robotics & Automation Magazine (2009); 16: 88–98. doi:10.1109/MRA.2009.932130
  • 11. Teng TW, Veerajagadheswar P, Ramalingam B, Yin J, Elara Mohan R, Gómez BF. Vision Based Wall following Framework: a Case Study with HSR Robot for cleaning application. Sensors (2020); 20: 3298. doi:10.3390/s20113298
  • 12. Amin MS, Ahn H. Earthquake disaster avoidance learning system using deep learning. Cognitive Systems Research (2021); 66: 221–35. doi:10.1016/j.cogsys.2020.11.002
  • 13. Sanfilippo F, Azpiazu J, Marafioti G, Transeth AA, Ø S, Liljebäck P. 2016 14th International Conference on Control, Automation, Robotics and Vision (ICARCV). p 1–7. doi:10.1109/ICARCV.2016.7838565
  • 14. Chen G, Hou Y, Cui T, Li H, Shangguan F, Cao L. YOLOv8-CML: a lightweight target detection method for color-changing melon ripening in intelligent agriculture. Scientific Reports (2024); 14: 14400. doi:10.1038/s41598-024-65293-w
  • 15. Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C-Y, et al. In: Leibe B, Matas J, Sebe N, Welling M, Editors. Computer Vision – ECCV 2016. (Springer International Publishing), p 21–37
  • 16. Ren S, He K, Girshick R, Sun J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence (2017); 39: 1137–49. doi:10.1109/TPAMI.2016.2577031
  • 17. Dong C, Du G. An enhanced real-time human pose estimation method based on modified YOLOv8 framework. Scientific Reports (2024); 14: 8012. doi:10.1038/s41598-024-58146-z
  • 18. Huang J, Rathod V, Sun C, Zhu M, Korattikara A, Fathi A, et al. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). p 3296-7. doi:10.1109/CVPR.2017.351
  • 19. Li Y, Wei X, Li Y, Dong Z, Shahidehpour M. Detection of False Data Injection Attacks in Smart Grid: a secure Federated Deep Learning Approach. IEEE Transactions on Smart Grid (2022); 13: 4862–72. doi:10.1109/TSG.2022.3204796
  • 20. Li Y, Cao J, Xu Y, Zhu L, Dong ZY. Deep learning based on Transformer architecture for power system short-term voltage stability assessment with class imbalance. Renewable and Sustainable Energy Reviews (2024); 189: 113913. doi:10.1016/j.rser.2023.113913
  • 21. Wang A, Chen H, Liu L, Chen K, Lin Z, Han J, et al. YOLOv10: Real-Time End-to-End Object Detection. (2024)

Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
