iScience. 2024 Mar 11;27(4):109479. doi: 10.1016/j.isci.2024.109479

Underwater smart glasses: A visual-tactile fusion hazard detection system

Zhongze Ma 1,4, Chenjie Zhang 1,4, Pengcheng Jiao 1,2,3,5
PMCID: PMC10973206  PMID: 38550982

Summary

Marine activities face various risk factors such as marine animal attacks and unexpected collisions. In this paper, we develop underwater smart glasses (USGs) based on visual-tactile fusion for real-time underwater hazard detection, ensuring operational safety. The proposed USG comprises a vision module based on artificial intelligence (AI)-enabled optical sensing and a tactile module based on triboelectric metamaterial-enabled mechanical sensing. The vision module runs you only look once-underwater hazard (YOLO-UH), an underwater target detection algorithm trained on a purpose-built dataset to detect toxic marine organisms in the visual field. The tactile module is designed with kirigami tribo-materials (KTMs) to sensitively detect and warn of collisions outside the visual field. Through numerical simulations, laboratory tests, and real-world experiments, we validated the performance of both modules. The reported USG, with its visual-tactile fusion concept, enables near-far all-around hazard detection and reduces the danger for divers working underwater.

Subject areas: Artificial intelligence, Engineering, Metamaterials

Graphical abstract


Highlights

  • Underwater smart glasses with visual-tactile fusion for underwater hazard detection

  • YOLO-underwater hazard (YOLO-UH) for optical sensing analysis of toxic organisms

  • Kirigami tribo-material (KTM) sensor to monitor potential collision



Introduction

Smart wearable devices are mainly designed to sense physiological data1,2,3,4 while enhancing human perception.5,6 Smart glasses, as typical smart wearable devices in the field of vision, have great potential to revolutionize how we access and interact with digital information.7,8,9,10,11,12 As the finite nature of land resources becomes more apparent, people are looking for ways to develop and use ocean resources, which entails many risks and challenges. Underwater workers often face various dangers when performing tasks, so it is necessary to develop underwater wearable intelligent devices for warning and detection to ensure life safety and improve work efficiency. However, most smart glasses are developed for applications on land,13,14,15,16,17,18,19,20 and underwater smart glasses (USGs) have rarely been proposed.21,22

Meanwhile, traditional smart glasses offer only visual functions and lack multi-modal perception. Headbands, as an indispensable component of glasses, are a natural carrier for tactile perception, but the trade-off between large deformation capacity and detection sensitivity23,24 makes the design of tactile modules difficult. Mechanical metamaterials have unique and superior mechanical properties that come from the rational design of their microstructure rather than the material itself, and advanced functions can be further integrated into their texture,25 for example sensing,26 energy harvesting,27 and wireless signal transmission.28,29 As a new kind of energy harvesting and self-powered sensing device,30 the triboelectric nanogenerator (TENG) has been widely used in motion monitoring,31,32,33 healthcare,34 smart textiles,35,36,37,38 and environmental sensing,39 and couples well with mechanical metamaterials.40,41,42 Combining TENGs and mechanical metamaterials can enable multi-functional stretchable headbands, but no such research exists in the smart glasses field.

Large-scale image data integration43,44,45 is an extremely important resource for the development of advanced visual perception algorithms. Datasets for scenarios with large market demand, such as vehicle detection, are well established, but large underwater target detection datasets are still lacking. For human beings to explore the underwater world more safely and deeply, datasets for underwater biological target detection urgently need to be developed. The you only look once (YOLO) network has been widely deployed in underwater devices due to its excellent precision and speed.46,47 However, the underwater environment is complex, and factors such as sediment and currents can sharply reduce the effective identification distance. Up to now, few small-target improvement strategies for underwater YOLO have been developed, resulting in short effective detection distances.

In this work, since divers’ limbs are occupied by snorkeling and operations during diving, we propose a real-time underwater danger detection system based on USGs that extends divers’ vision and touch. It detects biological threats and physical collisions and gives real-time warning feedback to the wearer, thus reducing divers’ risks during ocean engineering tasks and ensuring their personal safety. The proposed USG is divided into two main functional modules: the visual module, located at the front, and the tactile module, located at the side, as shown in Figure 1A. For the tactile module, we designed a new mechanical metamaterial-based kirigami tribo-material (KTM) that can be stretched axially and compressed radially, ensuring both the deformation capacity and the detection sensitivity of the sensing system. The proposed collision sensor uses an origami-kirigami-forged structural unit to couple 2D triboelectric films into a spatial structure with controllable mechanical deformation. For the visual module, we first established a dataset of toxic marine organisms with 7 categories and 9379 pictures, then improved YOLOv5 for small target detection and obtained the new YOLO-UH algorithm (mAP(0.5) improved by 4.6%). With everything in place, we successfully tested the tactile and visual perception functions in a swimming pool and in the real ocean. The results show that the proposed USG operates safely and efficiently underwater, combining precise target detection with sensitive collision detection; it substantially improves divers’ underwater perception and helps them explore the blue world more safely.

Figure 1. Underwater smart glasses

(A) Underwater danger detection system based on underwater smart glasses with both tactile perception and visual perception.

(B) Working mechanism of the tactile module (KTM).

(C) Working mechanism of the visual module (YOLO-UH).

Overview of the principles of tactile and visual perception

First, we introduce the principles of the USG’s tactile and visual modules. Figure 1B shows the assembly of the KTM and how it works under radial loads, i.e., the tactile module. Initially, the polytetrafluoroethylene (PTFE) and paper dielectric materials are placed in parallel, with their respective back surfaces bonded to copper as metal electrodes. Under a radial load, the PTFE gradually approaches the paper, and when the two electrodes are connected through an external load, electrons flow from one electrode to the other. When the KTM is in the unloading stage, the PTFE and the paper gradually separate, and the electrons flow back. Cyclic loading repeats this process periodically and finally appears as an alternating current output. Figure 1C shows the structure of YOLO-UH, i.e., the visual module. An input image first passes through our special attention mechanism and small object detection layer, then through the classic YOLO backbone-neck-head architecture, and finally through advance-warning processing to produce the output. The details are developed below, but before that, we cover the preparation and deployment process of the USG.
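To make the contact-separation mechanism above concrete, the sketch below evaluates the textbook open-circuit voltage of a contact-separation TENG, V_oc = σ·x(t)/ε0; the charge density and loading parameters are illustrative assumptions, not the KTM’s measured values.

```python
# Minimal sketch of the textbook contact-separation TENG model (not the
# authors' KTM-specific characterization): V_oc = sigma * x(t) / eps0,
# so a cyclic gap x(t) yields the alternating output described above.
import numpy as np

EPS0 = 8.854e-12          # vacuum permittivity, F/m
SIGMA = 10e-6             # assumed tribo-charge density, C/m^2
FREQ, X_MAX = 3.0, 9e-3   # illustrative 3 Hz loading, 9 mm amplitude

t = np.linspace(0.0, 2.0 / FREQ, 500)                     # two loading cycles
x = 0.5 * X_MAX * (1.0 - np.cos(2.0 * np.pi * FREQ * t))  # PTFE-paper gap, m
v_oc = SIGMA * x / EPS0                                   # open-circuit voltage, V
```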

Preparation and deployment

As shown in Figure 2A, the supporting structure and stretchable headband of the USG were 3D printed by stereolithography apparatus (SLA) and fused deposition modeling (FDM), respectively. These components were assembled to produce a shell model of the USG, whose joints are sealed by a silicone waterproof membrane made of silicone 705 (Table S1). Figure 2B shows the array structure design of the stretchable headband and a comparison of experimental and numerical results during stretching. The structural design is inspired by the concept of “spatial kirigami”, including axial kirigami for stretchability and radial kirigami for a 3 × 3 array triboelectric nanogenerator (TENG). This design relies on structural means to realize electrical signal output without affecting the mechanical properties of the device. The numerical results show no serious stress concentration in the structure under tension. We also conducted fatigue experiments (see Figure S3); after 1000 cycles of tensile testing, the structure’s maximum force decreases by only 4.34%, showing good fatigue performance. Detailed mechanical experimental and numerical results, including shear and torsion, are shown in Figure S1. The structure recovers after large deformation without any visible damage to the device (see Video S1). As an underwater augmented reality (AR) device, the USG should let ambient light and screen light enter the user’s eyes at the same time; we therefore introduced a spectroscopic prism, with the designed optical path shown in Figure 2C. Because human interaction with the USG is difficult underwater, we designed a self-starting function: once the external switch is turned on, the YOLO model starts automatically, providing users with real-time recognition (see Figure S2). The specific deployment steps of the microcontroller unit (MCU) are shown in Figure 2D. Figure 2E shows the block diagram of the USG system, including the tactile module, visual module, and user interface. Waterproofing of the KTM is realized with desiccant and silicone membranes. The KTM generates an electrical signal upon impact, which is read differentially and passed to the MCU through a 16-bit analog-to-digital converter (ADC) chip. The user interface screen includes an AR area with an identification box and a collision monitor area.
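As a rough sketch of the signal path just described, the loop below reads the KTM channel through the ADC and raises a warning on a fast voltage swing; read_adc, the display handle, and the threshold are hypothetical placeholders rather than the deployed firmware.

```python
# Hedged sketch of the KTM collision-warning loop; read_adc() stands in for
# the real 16-bit ADC driver call (e.g., over I2C), and the threshold is an
# illustrative value, not the authors' calibration.
import time

WARN_THRESHOLD_V = 1.5   # illustrative trigger level, volts
POLL_S = 0.02            # polling interval, seconds

def read_adc() -> float:
    """Placeholder for the ADC driver call returning the KTM voltage."""
    raise NotImplementedError

def collision_monitor(display) -> None:
    last = read_adc()
    while True:
        time.sleep(POLL_S)
        now = read_adc()
        # A collision appears as a short-term voltage swing, so compare
        # successive samples instead of the absolute (depth-dependent) level.
        if abs(now - last) > WARN_THRESHOLD_V:
            display.show("Warning!!!")
        else:
            display.show("Safe")
        last = now
```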

Figure 2. The preparation process, hardware, and software composition of the USG

(A) Preparation of the shell, including the silicone waterproof membrane, SLA-printed supporting structure, and FDM-printed stretching structure.

(B) Structural design of the tactile module, including axial kirigami for stretchability and radial kirigami for 3 × 3 array TENG, and experimental and numerical results of axial stretching.

(C) Design of the optical path.

(D) Hardware deployment of the YOLO algorithm.

(E) Block diagram of the USG, including the tactile module, vision module, and user interface.

Video S1. Tension, shearing and torsion test of the kirigami module, related to Figure 2
Download video file (12.2MB, mp4)

Results and discussion

Sensor design and characterization

Figure 3A illustrates the experimental setup of cyclic loading and the displacement-electric characteristics of the KTM during deformation. To ensure measurement accuracy, a constraint mold matching the outer profile of the KTM was 3D printed, cyclic load was applied using an elliptic cylinder mold, and both were fixed on the loading machine (see Figure S4 and Video S2). In addition, to study the influence of dielectric material type on the strength of the electrical signal, the paper was replaced with Kapton and aluminum, respectively, in subsequent electrical characterization experiments. The peak voltage (Figure 3B) and current (Figure S5) of KTMs with three dielectric material combinations (PTFE-paper, PTFE-Kapton, PTFE-aluminum) were studied at different amplitudes and frequencies. First, to study the effect of dielectric material type on the electrical response, the peak voltage and current of the three material combinations were tested at the same frequency, as shown in Figures 3C and 3D.

Figure 3. Electrical performance test of the KTM

(A) Experimental setup and the deformation process.

(B) Distribution of peak voltage with respect to dielectric material combination, frequency, and amplitude.

(C) Comparison of peak voltage and (D) peak current for the three dielectric material combinations.

(E) The tendency of frequency to influence voltage and (F) current at constant amplitude.

(G) The tendency of amplitude to influence voltage and (H) current at constant frequency.

(I) Verification of the signal attenuation at 20,000 cycles (constant frequency of 3 Hz and constant amplitude of 9 mm).

Video S2. The electrical test scenario of KTM, related to Figure 3
Download video file (16.6MB, mp4)

The results showed that the electrical output of the PTFE-paper group was the highest, while that of the PTFE-aluminum group was the lowest. Therefore, the PTFE-paper group was used to analyze the contributions of frequency and amplitude to the electrical response. Figures 3E and 3F show that, at the maximum amplitude, the voltage and current of the KTM trend upward as the loading frequency increases. This is because the deformation rate of the device increases with frequency; friction between the materials occurs more often, so more charge is transferred per unit time, manifesting as a stronger electrical signal. Next, we held the frequency constant and increased the amplitude from 1 mm to 9 mm. Both voltage and current show a clear linear upward trend in this case, as shown in Figures 3G and 3H. This means that the KTM describes the relationship between force excitation and electrical response linearly, simplifying data processing; it also means the sensing system is predictable: when the voltage value received by the processor changes, the magnitude of the external force can be inferred from the linear relationship, which may assist in post-collision medical evaluation and management. Multiple periodic loading was used to verify whether the KTM signal attenuates due to fatigue after 20,000 cycles at 3 Hz frequency and 9 mm amplitude. We report the results of this fatigue experiment in Figure 3I, which shows good signal stability. Another advantage of the tactile module is that it relies on short-term changes in pressure to detect collisions rather than the absolute pressure magnitude. Consequently, the static underwater pressure caused by depth has no significant impact on the detection performance.
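As an illustration of how the linear amplitude-voltage trend could be inverted to estimate impact magnitude, the sketch below fits a line to assumed calibration points; the sample numbers are placeholders, not the measurements in Figures 3G and 3H.

```python
# Hedged sketch: invert a linear peak-voltage vs. amplitude calibration to
# estimate the equivalent impact amplitude; all numbers are illustrative.
import numpy as np

amplitude_mm = np.array([1.0, 3.0, 5.0, 7.0, 9.0])     # applied amplitudes
peak_voltage = np.array([2.1, 6.0, 10.2, 13.9, 18.1])  # assumed readings, V

slope, intercept = np.polyfit(peak_voltage, amplitude_mm, 1)

def estimate_amplitude(v_peak: float) -> float:
    """Map a measured peak voltage to an equivalent amplitude in mm."""
    return slope * v_peak + intercept
```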

Dataset and algorithm improvement

Given the irreplaceability of manual operations in some underwater environments and the life safety of ocean engineering workers, we selected 7 toxic marine organisms commonly found in the ocean, namely jellyfish, puffers, sea anemones, conidae, starfish, sea snakes, and sea urchins (see Figure S9), as detection targets, so that workers can detect and identify toxic marine organisms from a distance and reduce occupational risk through early warning. Existing open-source marine biological datasets, such as WildFish,48 Labeled Fishes in the Wild,49 and Fish4Knowledge,50 are either inconsistent in species or insufficient in volume and do not satisfy the requirements of this study. Therefore, we collected pictures to build our own dataset, finally forming a dataset of toxic marine organisms with a total of 9379 pictures, with over 800 pictures collected for each species. This study aimed to detect and identify toxic marine organisms from as far away as possible, so the improvement of the YOLOv5 model focuses mainly on small target detection. Here, a small target instance is defined by relative scale: the ratios of the width and height of the target’s bounding box to the width and height of the image are both less than 10%. By comparison, the precision of YOLOv5l is higher than that of YOLOv5s and YOLOv5m and close to that of YOLOv5x, but with a shorter detection time (see Table S2), so the subsequent small target improvements were carried out based on YOLOv5l. Figure 4A shows the framework and precision of YOLO-UH. In tests on the small target validation dataset, YOLO-UH achieved 4.6% and 7.2% improvements in mAP(0.5) and mAP(0.5:0.95), respectively, compared with the original YOLOv5 (see Figures 4A and S10). YOLO-UH thus plays a more positive role in USG detection of small targets. Moreover, the confusion matrix shows that YOLO-UH has a better ability to classify toxic marine organisms (see Figure 4B), although, because jellyfish are partly transparent, other species are occasionally misidentified as jellyfish during validation. In the demonstration example of toxic marine organism target detection in Figure 4C, YOLO-UH locates and classifies all the targets.
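The small-target criterion above maps directly onto YOLO-format labels, which store normalized box widths and heights; a minimal check might look like the sketch below (the function name is ours, for illustration).

```python
# Hedged sketch of the paper's small-target definition: an instance counts as
# "small" when its box width and height are each under 10% of the image size.
def is_small_target(w_norm: float, h_norm: float, thresh: float = 0.10) -> bool:
    """YOLO labels are normalized: w_norm = box_w / img_w, h_norm = box_h / img_h."""
    return w_norm < thresh and h_norm < thresh

# Example: a 50 x 40 px box in a 640 x 640 image qualifies as a small target.
assert is_small_target(50 / 640, 40 / 640)
```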

Figure 4. YOLO-UH and onsite simulation experiment

(A) YOLO-UH framework and precision.

(B) Confusion matrix of the toxic marine organism dataset.

(C) Demonstration of toxic marine organism target detection.

Practical test of target recognition effect

Validation on the dataset alone does not yield practical and credible results, so we tested the actual effective detection distance of the USG using physical models of underwater organisms. In this study, accuracy is defined as:

accuracy = n / N (Equation 1)

where N is the total number of frames in the video clip and n is the number of frames in which the target is detected correctly. We selected starfish, sea snake, puffer, and conidae from the pre-defined categories and added different weights of sand into the same amount of water to study the relationship between accuracy and sediment concentration, as shown in Figures S11 and 5A. The accuracy-sediment concentration curves show that accuracy drops abruptly as sediment concentration increases. In clear water there are few particles, light propagates unhindered, and underwater targets are relatively easy to detect, so accuracy is high. Once the sediment concentration increases, the number of particles in the water grows, leading to light scattering and absorption; underwater targets become blurred and hard to distinguish visually, and the color, texture, and other features of the image are degraded, so accuracy begins to decline. Contrary to our expectation, accuracy recovers to a certain extent when the sediment concentration reaches 6–8 g. This may be because, above this critical point, the contrast between the detection target and the background increases, making the target boundary relatively clearer and slightly reducing detection difficulty, though performance never returns to that in clear water.
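A direct reading of Equation 1 as code is sketched below; detects_target is a hypothetical wrapper around a per-frame YOLO-UH inference and correctness check, not the authors’ evaluation script.

```python
# Hedged sketch of Equation 1: accuracy = n / N over a video clip, where
# detects_target(frame) returns True when the target is detected correctly.
from typing import Callable, Iterable

def clip_accuracy(frames: Iterable, detects_target: Callable) -> float:
    frames = list(frames)
    n = sum(1 for frame in frames if detects_target(frame))  # correct frames, n
    return n / len(frames)                                   # total frames, N
```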

Figure 5. Field test results and application prospects

(A) Experiment of accuracy-sediment concentration.

(B) Experiment of accuracy-distance.

(C) Field test results of the tactile module. The words “Safe” and “Warning” in the human field of view represent real-time collision detection results.

(D) Field test results of the vision module. The identification boxes in the human’s field of view indicate the locations of toxic organisms, and the letters and numbers on each box indicate, respectively, the species and the probability the AI assigns to the box containing a toxic organism of that type.

(E) Ocean testing of swimming and diving with the USG around Gouqi Island in Hangzhou Bay, East China Sea.

The experiment on the impact of distance on accuracy was completed in a natatorium. As shown in Figure 5B, the accuracy-distance curve first increases and then decreases. When the distance is very close, the target occupies most of the image and some parts and outlines fall outside the frame, so only partial features of the target are captured. As the target moves away from the camera, it shrinks to an appropriate size in the image, its features are fully extracted, and detection accuracy improves. When the distance increases further, the features of the target gradually blur. It is worth noting that when someone swam past during the experiment, the resulting turbulence caused image distortion and instability, which also adversely affected target detection. As for the failure to detect conidae, we think the solid model of conidae is much smaller than those of the other three toxic marine organisms; as a result, for the same apparent size in the image, the actual detection distance for conidae is closer, and the appropriate detection range is within 0.5 m. Examining the training set, we found that the photos of conidae were indeed almost all taken at close range (around 0.15 m).

On-site test

We field-tested the tactile and vision modules in real waters. For the tactile module, as shown in Figure 5C, we used a floating board to impact a USG-wearing volunteer twice, the first impact smaller and the second larger. During the first impact, the word “Safe” remained on the screen, but during the second, “Warning!!!” appeared (see Video S4). This shows that the KTM realizes human-like tactile collision detection in underwater wet environments and can serve as a warning function. When exploring the underwater world, humans are poorly adapted to the environment, and nervousness, lack of oxygen, and sensory decline occur from time to time; a tactile module with collision detection can therefore provide a safety warning that helps during diving. For the vision module, as shown in Figure 5D, the USG accurately recognizes and warns of starfish targets appearing in the field of view (see Video S5 for a field test video with more identification results). Meanwhile, the USG displays the detection frame rate in real time in the upper left corner of the screen; typically, the system sustains around 13 frames per second (FPS) during target detection. This on-screen feedback allows users to monitor detection efficiency in real time. Recognition of underwater toxic organisms can help humans reduce the risk of injury and enjoy a better diving experience. In summary, the USG system consists of tactile and vision modules and is primarily used for near-far safety warnings during diving: the tactile module detects collisions as a near warning, while the vision module identifies toxic underwater creatures as a far warning.
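For illustration, the on-screen frame-rate readout described above could be rendered per frame as in the OpenCV sketch below; the position, color, and naming are our assumptions, not the authors’ UI code.

```python
# Hedged sketch of a per-frame FPS overlay in the corner of the display.
import time
import cv2

def annotate_fps(frame, t_prev: float):
    """Draw the instantaneous FPS in the upper-left corner of the frame."""
    t_now = time.time()
    fps = 1.0 / max(t_now - t_prev, 1e-6)
    cv2.putText(frame, f"{fps:.1f} FPS", (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
    return frame, t_now
```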

Video S4. Impact test in the swimming pool for tactile warning, related to Figure 5
Download video file (7.5MB, mp4)
Video S5. Target detection test in the swimming pool for vision warning, related to Figure 5
Download video file (8.3MB, mp4)

We conducted ocean testing of swimming and diving with the USG around Gouqi Island in Hangzhou Bay, East China Sea, as shown in Figure 5E (also see Figure S12 and Video S6), with the help of volunteers. The USG remained in normal working condition after 20 min of intense exercise, demonstrating the feasibility of the USG in a real marine environment.

Video S6. Field tests at Gouqi Island, China, related to Figure 5

Breaststroke in the ocean with USG.

Download video file (10.1MB, mp4)

Future work

Energy is a crucial issue in any edge system. In this design we used batteries to power the USG, and the trouble of changing batteries is difficult to avoid. In future work we will try to improve the output performance of the KTM from both material and structural perspectives; if the KTM can serve as a collision monitoring device while also harvesting some energy, this will be a great breakthrough. In addition, the KTM’s axial stretchability and excellent radial monitoring performance provide new ideas for the design of flexible wearables, and we will focus on increasing the coverage of the KTM in research toward a new generation of smart wearables.

Conclusions

In this paper, we propose the USG, a visual-tactile fusion hazard detection system for underwater exploration. We designed the collision detection module, the KTM, based on functional mechanical metamaterials; the module realizes axial stretchability and radial pressure-to-electrical-signal output through the design of a spatial kirigami structure. We built a large toxic marine organism target detection dataset with 7 categories and a total of 9379 images and proposed YOLO-UH, which improves small target detection. YOLO-UH achieved 4.6% and 7.2% improvements in mAP(0.5) and mAP(0.5:0.95), respectively, compared with YOLOv5. Both tactile and visual modules were successfully tested in real water, and the system still worked after volunteers wore the USG for 20 min in the actual sea. The USG, with its visual-tactile fusion concept, enables near-far all-around hazard detection, reducing the danger for divers working underwater. Furthermore, the practice of perception fusion in this paper provides new ideas for research on wearable systems.

Limitations of the study

Further optimization is still required for the proposed USG system. The limitations of this study can be summarized as follows.

  • Due to the limitations of the design of glasses, the collision monitoring of the USG is currently limited to one area. In future works, the integrated design of the glasses and the diving suit can realize the collision monitoring of the head and even the whole body.

  • By optimizing the target detection algorithm and improving the camera property, we believe a longer detection distance can be achieved.

  • The current study concentrates on the fusion of tactile and visual perception for hazard detection; adding smell, hearing, and even taste would allow wearable systems to enhance human perception to a new level.

STAR★Methods

Key resources table

REAGENT or RESOURCE | SOURCE | IDENTIFIER

Chemicals, peptides, and recombinant proteins

Silicone Sealant | Kafuter | K-705
PLA | Polymaker | PolySonic PLA Pro
Photosensitive resin | eSUN | S200 Standard Resin

Software and algorithms

Python 3.8 | Python Software Foundation | https://www.python.org/
PyTorch 1.12.0 | PyTorch Foundation | https://pytorch.org/get-started/previous-versions/
YOLO v5 | Ultralytics | https://github.com/ultralytics/yolov5

Other

RTX 3090 Ti (training) | Nvidia | 24G
Jetson Nano (deploying) | Nvidia | 4G

Resource availability

Lead contact

Further information and requests should be directed to the lead contact, Pengcheng Jiao (pjiao@zju.edu.cn).

Materials availability

This study did not generate new unique reagents.

Data and code availability

  • All data reported in this paper will be shared by the lead contact (Pengcheng Jiao, pjiao@zju.edu.cn) upon request.

  • Any additional information required to reanalyze the data reported in this paper is also available from the lead contact upon request.

Method details

Sealant

We used silicone 705 to seal the joints. We chose this material because silicone 705 is transparent, so it does not affect the optical path, and it also offers moisture resistance, electrical insulation, and stable performance at high and low temperatures, which matches the working conditions of underwater glasses.

Switching mechanism design

Due to the special characteristics of the underwater environment, we needed to design a switch that sits on the outside. During the design process, we intentionally used a mechanical transmission structure to improve the stability of the system under extreme conditions while reducing power consumption. Inspired by the lever, the switch mechanism (see Figure S2) hinges upper and lower connecting rods to a center ring, with the front end of the upper rod set against the internal switch to drive its displacement. When the rear end of the lower lever is pressed, the two arms of the center ring drive the upper lever in the opposite direction, and the front end of the upper lever pushes the internal switch open, starting the system; when the rear end of the upper lever is pressed, the front end of the upper lever directly drives the internal switch closed, shutting down the system.

The electrical test scenario of KTM

To test the electrical performance of the KTM, we built an experimental scenario. We used a Hanshen fatigue testing machine: the fixture was clamped to the lower end of the machine with screws, and the KTM was seated in the vacancy of the fixture so that the two were tightly connected. The striking source, which simulates a real strike, was fixed to the upper end of the machine with screws. The KTM was connected to an electrometer (Keithley 6514) to measure the voltage/current output of the device in real time, and a laptop connected to the electrometer recorded the data. At the beginning of the experiment, the upper end of the machine was loaded at a specific frequency and amplitude controlled by the host computer software, and a smooth, continuous signal was obtained on the laptop.

Methods for improving the dataset

In the process of improving this dataset, the background of each image is first randomly selected, and the instances in the image are minified so that their absolute scale is less than 64 × 64 pixels. Then, the reduced instances are placed in the same image on an added solid-color background; the resulting picture is 640 × 640 pixels, making the relative width and height of each instance less than 10%. Given the differences in the actual sizes of these toxic marine organisms, the actual detection distance corresponding to different kinds of targets can be up to 3 m, which is in line with the research purpose of this study. A minimal sketch of this compositing step follows.
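The sketch below composites shrunken instances onto a 640 × 640 solid background along the lines described above; it uses Pillow, and the background color, placement, and function name are our illustrative choices, not the authors’ pipeline.

```python
# Hedged sketch of the small-target augmentation: shrink each cropped
# instance below 64x64 px, paste it onto a 640x640 solid-color canvas, and
# record its box (relative width/height stay under 10%).
import random
from PIL import Image

def make_small_target_image(crops, canvas=640, max_side=64, bg=(30, 60, 90)):
    img = Image.new("RGB", (canvas, canvas), bg)  # solid-color background
    boxes = []
    for crop in crops:
        scale = max_side / max(crop.size)         # force absolute scale <= 64 px
        w = max(1, int(crop.width * scale))
        h = max(1, int(crop.height * scale))
        small = crop.resize((w, h))
        x = random.randint(0, canvas - w)
        y = random.randint(0, canvas - h)
        img.paste(small, (x, y))
        boxes.append((x, y, w, h))                # pixel box on the canvas
    return img, boxes
```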

Steps to make small target improvements to the YOLO model

First, we added a Transformer self-attention mechanism to YOLOv5’s backbone network. The Transformer is a neural network architecture based on self-attention that was originally used for machine translation in natural language processing51 and has since been successfully applied to computer vision tasks such as image classification, object detection, and semantic segmentation. The main task of YOLOv5’s backbone is to extract features from the original image, and integrating the Transformer into the backbone helps the model capture more global context and improves detection precision. In addition, this approach lets the model share parameters during training, which reduces training time and resource requirements and avoids the heavy computation incurred when self-attention is added to the detection head instead. Second, detection feature maps of different scales can be obtained from different feature pyramid layers, but the detection layers of the original YOLOv5 do not handle small targets well. Therefore, we add an xsmall layer in front of the small layer, using a smaller receptive field and a higher-resolution feature map to detect small targets. Finally, an attention module is added at the end of each detection layer. The human brain selectively regulates attention by enhancing or suppressing certain neural signals, a mechanism that helps us focus on information of interest while ignoring irrelevant information and saving cognitive resources. Given the similarity between computer and human vision, the attention mechanism transfers to computer vision: under limited computing resources, it focuses on the useful information in the image to improve efficiency. Here, we propose a new attention module, ECSAM (Efficient Channel and Spatial Attention Module), based on the ECA module. The original ECA module is a lightweight attention module that performs a one-dimensional convolution over the channel descriptors to learn per-channel weights,52 saving computing resources in channel processing compared with global average pooling plus fully connected layers, which is critical in mobile computer vision applications with limited resources. That does not mean improvements in precision are unimportant; rather, we want to balance the trade-off between reducing computing cost and improving precision. Therefore, spatial attention is added to the ECSAM module to help the model learn the distribution of attention in the spatial dimensions, that is, to calculate a weight for each spatial position. As a result, the ECSAM module, sensitive to both channel and spatial characteristics, improves the performance of toxic marine organism detection and is well suited for deployment in the USG. A sketch of this channel-plus-spatial structure is given below.
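The module below is a hedged PyTorch sketch of the ECSAM idea as described: an ECA-style 1D convolution over channel descriptors followed by a spatial gate; the layer sizes and the CBAM-like spatial branch are our assumptions, not the authors’ released code.

```python
# Hedged sketch of ECSAM: ECA channel attention + a spatial attention gate.
import torch
import torch.nn as nn

class ECSAM(nn.Module):
    def __init__(self, k_size: int = 3):
        super().__init__()
        # ECA branch: 1D conv across channel descriptors, no dim reduction
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k_size,
                              padding=k_size // 2, bias=False)
        # Spatial branch: 7x7 conv over channel-pooled maps
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.size()
        # Channel attention: global pool -> 1D conv over channels -> gate
        y = self.avg_pool(x).view(b, 1, c)
        y = self.sigmoid(self.conv(y)).view(b, c, 1, 1)
        x = x * y
        # Spatial attention: avg/max over channels -> conv -> per-pixel gate
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * self.sigmoid(self.spatial(s))

# Example: gate a (B, C, H, W) feature map from a detection layer.
# out = ECSAM()(torch.randn(1, 64, 80, 80))
```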

Acknowledgments

This work was supported by the National Key R&D Program of China (2023YFC3008100). P.J. acknowledges the Startup Fund of the One-Hundred Talent Program at the Zhejiang University, China.

Author contributions

Methodology: Z.M. and C.Z.; software: Z.M.; resources: P.J.; writing—original draft: Z.M. and C.Z.; writing—review and editing: P.J.; conceptualization: P.J.; supervision: P.J.; funding acquisition: P.J.

Declaration of interests

The authors declare no competing interests.

Published: March 11, 2024

Footnotes

Supplemental information can be found online at https://doi.org/10.1016/j.isci.2024.109479.

Supplemental information

Document S1. Figures S1–S12 and Tables S1–S4
mmc1.pdf (1.6MB, pdf)
Video S3. Experimental study on the influence of humidity on KTM output, related to Figure 3
Download video file (7.5MB, mp4)

References

  • 1.Gao W., Emaminejad S., Nyein H.Y.Y., Challa S., Chen K., Peck A., Fahad H.M., Ota H., Shiraki H., Kiriya D., et al. Fully integrated wearable sensor arrays for multiplexed in situ perspiration analysis. Nature. 2016;529:509–514. doi: 10.1038/nature16521. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Kireev D., Sel K., Ibrahim B., Kumar N., Akbari A., Jafari R., Akinwande D. Continuous cuffless monitoring of arterial blood pressure via graphene bioimpedance tattoos. Nat. Nanotechnol. 2022;17:864–870. doi: 10.1038/s41565-022-01145-w. [DOI] [PubMed] [Google Scholar]
  • 3.Zhang J., Kim K., Kim H.J., Meyer D., Park W., Lee S.A., Dai Y., Kim B., Moon H., Shah J.V., et al. Smart soft contact lenses for continuous 24-hour monitoring of intraocular pressure in glaucoma care. Nat. Commun. 2022;13:5518–5615. doi: 10.1038/s41467-022-33254-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Zhu H., Yang H., Zhan L., Chen Y., Wang J., Xu F. Hydrogel-based smart contact lens for highly sensitive wireless intraocular pressure monitoring. ACS Sens. 2022;7:3014–3022. doi: 10.1021/acssensors.2c01299. [DOI] [PubMed] [Google Scholar]
  • 5.Lee J., Kim D., Ryoo H.Y., Shin B.S. Sustainable wearables: Wearable technology for enhancing the quality of human life. Sustainability. 2016;8:466. doi: 10.3390/su8050466. [DOI] [Google Scholar]
  • 6.Dai N., Lei I.M., Li Z., Li Y., Fang P., Zhong J. Recent advances in wearable electromechanical sensors—Moving towards machine learning-assisted wearable sensing systems. Nano Energy. 2023;105 doi: 10.1016/j.nanoen.2022.108041. [DOI] [Google Scholar]
  • 7.Chang W.J., Chen L.B., Hsu C.H., Chen J.H., Yang T.C., Lin C.P. MedGlasses: a wearable smart-glasses-based drug pill recognition system using deep learning for visually impaired chronic patients. IEEE Access. 2020;8:17013–17024. https://ieeexplore.ieee.org/abstract/document/8962044 [Google Scholar]
  • 8.Chang W.J., Chen L.B., Chiou Y.Z. Design and implementation of a drowsiness-fatigue-detection system based on wearable smart glasses to increase road safety. IEEE Trans. Consum. Electron. 2018;64:461–469. https://ieeexplore.ieee.org/abstract/document/8493318 [Google Scholar]
  • 9.Sempionatto J.R., Brazaca L.C., García-Carmona L., Bolat G., Campbell A.S., Martin A., Tang G., Shah R., Mishra R.K., Kim J., et al. Eyeglasses-based tear biosensing system: non-invasive detection of alcohol, vitamins, and glucose. Biosens. Bioelectron. 2019;137:161–170. doi: 10.1016/j.bios.2019.04.058. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Moshayedi A.J., Uddin N.M.I., Khan A.S., Zhu J., Emadi Andani M. Designing and Developing a Vision-Based System to Investigate the Emotional Effects of News on Short Sleep at Noon: An Experimental Case Study. Sensors. 2023;23:8422. doi: 10.3390/s23208422. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Yu J., Sun H., Xia Z., Zhu J., Zhang Z. Sample balancing of curves for lens distortion modeling and decoupled camera calibration. Opt Commun. 2023;537 doi: 10.1016/j.optcom.2022.129221. [DOI] [Google Scholar]
  • 12.Yu J., Yang Y., Zhang H., Sun H., Zhang Z., Xia Z., Zhu J., Dai M., Wen H. Spectrum Analysis Enabled Periodic Feature Reconstruction Based Automatic Defect Detection System for Electroluminescence Images of Photovoltaic Modules. Micromachines. 2022;13:332. doi: 10.3390/mi13020332. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Niknejad N., Ismail W.B., Mardani A., Liao H., Ghani I. A comprehensive overview of smart wearables: the state of the art literature, recent advances, and future challenges. Eng. Appl. Artif. Intell. 2020;90 doi: 10.1016/j.engappai.2020.103529. [DOI] [Google Scholar]
  • 14.Kim D., Choi Y. Application of Smart Glasses for Field Workers Performing Soil Contamination Surveys with Portable Equipment. Sustainability. 2022;14 doi: 10.3390/su141912370. [DOI] [Google Scholar]
  • 15.Matsuhashi K., Kanamoto T., Kurokawa A. Thermal model and countermeasures for future smart glasses. Sensors. 2020;20:1446. doi: 10.3390/s20051446. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Rallapalli S., Ganesan A., Chintalapudi K., Padmanabhan V.N., Qiu L. Proceedings of the 20th annual international conference on Mobile computing and networking. 2014. Enabling physical analytics in retail stores using smart glasses; pp. 115–126. [DOI] [Google Scholar]
  • 17.Rao N., Zhang L., Chu S.L., Jurczyk K., Candelora C., Su S., Kozlin C. 2020 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW) IEEE; 2020. Investigating the necessity of meaningful context anchoring in AR smart glasses interaction for everyday learning; pp. 427–432.https://ieeexplore.ieee.org/abstract/document/9090481 [Google Scholar]
  • 18.Sara G., Todde G., Caria M. Assessment of video see-through smart glasses for augmented reality to support technicians during milking machine maintenance. Sci. Rep. 2022;12 doi: 10.1038/s41598-022-20154-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Zhao C., Chavan S., He X., Zhou M., Cazzonelli C.I., Chen Z.H., Tissue D.T., Ghannoum O. Smart glass impacts stomatal sensitivity of greenhouse Capsicum through altered light. J. Exp. Bot. 2021;72:3235–3248. doi: 10.1093/jxb/erab028. [DOI] [PubMed] [Google Scholar]
  • 20.Ashok A., Xu C., Vu T., Gruteser M., Howard R., Zhang Y., Mandayam N., Yuan W., Dana K. What am i looking at? low-power radio-optical beacons for in-view recognition on smart-glass. IEEE Trans. Mobile Comput. 2016;15:3185–3199. https://ieeexplore.ieee.org/abstract/document/7394151 [Google Scholar]
  • 21.Bruno F., Barbieri L., Mangeruga M., Cozza M., Lagudi A., Čejka J., Liarokapis F., Skarlatos D. Underwater augmented reality for improving the diving experience in submerged archaeological sites. Ocean Eng. 2019;190 doi: 10.1016/j.oceaneng.2019.106487. [DOI] [Google Scholar]
  • 22.Čejka J., Zsíros A., Liarokapis F. A hybrid augmented reality guide for underwater cultural heritage sites. Personal Ubiquitous Comput. 2020;24:815–828. doi: 10.1007/s00779-019-01354-6. [DOI] [Google Scholar]
  • 23.Bai S., Xu Q., Gu L., Ma F., Qin Y., Wang Z.L. Single crystalline lead zirconate titanate (PZT) nano/micro-wire based self-powered UV sensor. Nano Energy. 2012;1:789–795. doi: 10.1016/j.nanoen.2012.09.001. [DOI] [Google Scholar]
  • 24.Liu Y., Zhao L., Wang L., Zheng H., Li D., Avila R., Lai K.W.C., Wang Z., Xie Z., Zi Y., Yu X. Skin-integrated graphene-embedded lead zirconate titanate rubber for energy harvesting and mechanical sensing. Adv. Mater. Technol. 2019;4 doi: 10.1002/admt.201900744. [DOI] [Google Scholar]
  • 25.Jiao P., Mueller J., Raney J.R., Zheng X.R., Alavi A.H. Mechanical metamaterials and beyond. Nat. Commun. 2023;14:6004. doi: 10.1038/s41467-023-41679-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Mohammadi A., Tan Y., Choong P., Oetomo D. Flexible mechanical metamaterials enabling soft tactile sensors with multiple sensitivities at multiple force sensing ranges. Sci. Rep. 2021;11 doi: 10.1038/s41598-021-03588-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Ma K., Tan T., Yan Z., Liu F., Liao W.H., Zhang W. Metamaterial and Helmholtz coupled resonator for high-density acoustic energy harvesting. Nano Energy. 2021;82 doi: 10.1016/j.nanoen.2020.105693. [DOI] [Google Scholar]
  • 28.Tian X., Zeng Q., Kurt S.A., Li R.R., Nguyen D.T., Xiong Z., Li Z., Yang X., Xiao X., Wu C., et al. Implant-to-implant wireless networking with metamaterial textiles. Nat. Commun. 2023;14:4335. doi: 10.1038/s41467-023-39850-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Xu J., Cao J., Guo M., Yang S., Yao H., Lei M., Hao Y., Bi K. Metamaterial mechanical antenna for very low frequency wireless communication. Adv. Compos. Hybrid Mater. 2021;4:761–767. doi: 10.1007/s42114-021-00278-1. [DOI] [Google Scholar]
  • 30.Zhu J., Zhu M., Shi Q., Wen F., Liu L., Dong B., Haroun A., Yang Y., Vachon P., Guo X., et al. Progress in TENG technology—A journey from energy harvesting to nanoenergy and nanosystem. EcoMat. 2020;2 doi: 10.1002/eom2.12058. [DOI] [Google Scholar]
  • 31.Mao Y., Wen Y., Liu B., Sun F., Zhu Y., Wang J., Zhang R., Yu Z., Chu L., Zhou A. Flexible wearable intelligent sensing system for wheelchair sports monitoring. iScience. 2023;26 doi: 10.1016/j.isci.2023.108126. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Zhu Y., Zhao T., Sun F., Jia C., Ye H., Jiang Y., Wang K., Huang C., Xie Y., Mao Y. Multi-functional triboelectric nanogenerators on printed circuit board for metaverse sport interactive system. Nano Energy. 2023;113 doi: 10.1016/j.nanoen.2023.108520. [DOI] [Google Scholar]
  • 33.Sun F., Zhu Y., Jia C., Wen Y., Zhang Y., Chu L., Zhao T., Liu B., Mao Y. Deep-Learning-Assisted Neck Motion Monitoring System Self-Powered Through Biodegradable Triboelectric Sensors. Adv. Funct. Mater. 2023;2310742 doi: 10.1002/adfm.202310742. [DOI] [Google Scholar]
  • 34.Chen X., Xie X., Liu Y., Zhao C., Wen M., Wen Z. Advances in healthcare electronics enabled by triboelectric nanogenerators. Adv. Funct. Mater. 2020;30 doi: 10.1002/adfm.202004673. [DOI] [Google Scholar]
  • 35.Dong K., Peng X., Cheng R., Ning C., Jiang Y., Zhang Y., Wang Z.L. Advances in High-Performance Autonomous Energy and Self-Powered Sensing Textiles with Novel 3D Fabric Structures. Adv. Mater. 2022;34 doi: 10.1002/adma.202109355. [DOI] [PubMed] [Google Scholar]
  • 36.Dong K., Wang Z.L. Self-charging power textiles integrating energy harvesting triboelectric nanogenerators with energy storage batteries/supercapacitors. J. Semiconduct. 2021;42 doi: 10.1088/1674-4926/42/10/101601. [DOI] [Google Scholar]
  • 37.Dong K., Hu Y., Yang J., Kim S.W., Hu W., Wang Z.L. Smart textile triboelectric nanogenerators: Current status and perspectives. MRS Bull. 2021;46:512–521. doi: 10.1557/s43577-021-00123-2. [DOI] [Google Scholar]
  • 38.Dong K., Peng X., Cheng R., Wang Z.L. Smart textile triboelectric nanogenerators: prospective strategies for improving electricity output performance. Nanoenergy Adv. 2022;2:133–164. doi: 10.3390/nanoenergyadv2010006. [DOI] [Google Scholar]
  • 39.Chang A., Uy C., Xiao X., Xiao X., Chen J. Self-powered environmental monitoring via a triboelectric nanogenerator. Nano Energy. 2022;98 doi: 10.1016/j.nanoen.2022.107282. [DOI] [Google Scholar]
  • 40.Xia K., Liu J., Li W., Jiao P., He Z., Wei Y., Qu F., Xu Z., Wang L., Ren X., et al. A self-powered bridge health monitoring system driven by elastic origami triboelectric nanogenerator. Nano Energy. 2023;105 doi: 10.1016/j.nanoen.2022.107974. [DOI] [Google Scholar]
  • 41.Xu X., Wu Q., Pang Y., Cao Y., Fang Y., Huang G., Cao C. Multifunctional Metamaterials for Energy Harvesting and Vibration Control. Adv. Funct. Mater. 2022;32 doi: 10.1002/adfm.202107896. [DOI] [Google Scholar]
  • 42.Barri K., Jiao P., Zhang Q., Chen J., Lin Wang Z., Alavi A.H. Multifunctional meta-tribomaterial nanogenerators for energy harvesting and active sensing. Nano Energy. 2021;86 doi: 10.1016/j.nanoen.2021.106074. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Deng J., Dong W., Socher R., Li L.J., Li K., Fei-Fei L. 2009 IEEE conference on computer vision and pattern recognition. IEEE; 2009. Imagenet: A large-scale hierarchical image database; pp. 248–255.https://ieeexplore.ieee.org/abstract/document/5206848 [Google Scholar]
  • 44.Lin T.Y., Maire M., Belongie S., Hays J., Perona P., Ramanan D., Dollár P., Zitnick C.L. Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V 13. Springer International Publishing; 2014. Microsoft coco: Common objects in context; pp. 740–755.https://link.springer.com/chapter/10.1007/978-3-319-10602-1_48 [Google Scholar]
  • 45.Everingham M., Van Gool L., Williams C.K.I., Winn J., Zisserman A. The pascal visual object classes (voc) challenge. Int. J. Comput. Vis. 2010;88:303–338. doi: 10.1007/s11263-009-0275-4. [DOI] [Google Scholar]
  • 46.Redmon J., Divvala S., Girshick R., Farhadi A. Proceedings of the IEEE conference on computer vision and pattern recognition. 2016. You only look once: Unified, real-time object detection; pp. 779–788.https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Redmon_You_Only_Look_CVPR_2016_paper.pdf [Google Scholar]
  • 47.Ren S., He K., Girshick R., Sun J. Faster r-cnn: Towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 2015;28 doi: 10.1109/TPAMI.2016.2577031. https://proceedings.neurips.cc/paper_files/paper/2015/hash/14bfa6bb14875e45bba028a21ed38046-Abstract.html [DOI] [PubMed] [Google Scholar]
  • 48.Zhuang P., Wang Y., Qiao Y. Proceedings of the 26th ACM international conference on Multimedia. 2018. Wildfish: A large benchmark for fish recognition in the wild; pp. 1301–1309. [DOI] [Google Scholar]
  • 49.Cutter G., Stierhoff K., Zeng J. 2015 IEEE Winter Applications and Computer Vision Workshops. IEEE; 2015. Automated detection of rockfish in unconstrained underwater videos using haar cascades and a new image dataset: Labeled fishes in the wild; pp. 57–62.https://ieeexplore.ieee.org/abstract/document/7046815 [Google Scholar]
  • 50.Hsiao Y.H., Chen C.C., Lin S.I., Lin F.P. Real-world underwater fish recognition and identification, using sparse representation. Ecol. Inf. 2014;23:13–21. doi: 10.1016/j.ecoinf.2013.10.002. [DOI] [Google Scholar]
  • 51.Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A.N., Kaiser Ł., Polosukhin I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017;30 https://proceedings.neurips.cc/paper_files/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html [Google Scholar]
  • 52.Wang Q., Wu B., Zhu P., Li P., Zuo W., Hu Q. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020. ECA-Net: Efficient channel attention for deep convolutional neural networks; pp. 11534–11542.https://openaccess.thecvf.com/content_CVPR_2020/papers/Wang_ECA-Net_Efficient_Channel_Attention_for_Deep_Convolutional_Neural_Networks_CVPR_2020_paper.pdf [Google Scholar]
