Heliyon. 2023 Aug 1;9(8):e18826. doi: 10.1016/j.heliyon.2023.e18826

Solar panel defect detection design based on YOLO v5 algorithm

Jing Huang 1, Keyao Zeng 1, Zijun Zhang 1, Wanhan Zhong 1
PMCID: PMC10415890  PMID: 37576324

Abstract

Defects in solar panels can easily cause electrical accidents. To overcome the low detection efficiency of traditional defect detection methods, this work improves the YOLO v5 algorithm in three ways. First, coordinate attention is extended into an LCA attention mechanism with a larger target range, which widens the sensing range for target features while still fully capturing feature information. Second, a weighted bidirectional feature pyramid assigns different weights to features whose pixel sizes differ greatly, which balances their contributions and favors fast multi-scale feature fusion. Finally, the coupled head commonly used in the YOLO series is replaced with a decoupled head, so that each task branch is performed more accurately and detection accuracy improves. Comparative experiments on the solar panel defect detection data set show that the improved algorithm raises overall precision by 1.5%, recall by 2.4%, and mAP by 2.5% to 95.5%. The improved model can determine more accurately whether defects are present, which helps standardize the quality of solar panels and ensure electrical safety.

Keywords: Solar panels, Defect detection, YOLO v5, Electrical safety

1. Introduction

Clean energy refers to energy that can be regenerated in nature, such as tidal, wind, geothermal and solar energy. Compared with traditional non-renewable energy, which is being depleted, it is inexhaustible, regenerates automatically without human intervention, and does not cause excessive damage to the environment [1]. Solar energy is a research focus and a future trend in the field of renewable energy, and its most widespread practical application is the manufacture of solar panels. The quality and efficiency of photovoltaic power generation are closely related to the quality of the panel [[2], [3], [4]]. Because of the limitations of solar panel materials and deviations in mechanical and thermal stress during processing, many defects can arise and lead to losses [5]. If defective panels are put into use, they may reduce power generation efficiency and product life, or even cause serious safety accidents. To avoid such accidents, quality inspection before the solar panels leave the factory is a top priority.

For the defect detection of solar panels, the main traditional methods are divided into manual physical methods and machine vision methods. Byung-Kwan Kang et al. [6] used a suitable temperature control procedure to adjust the relationship between the measured voltage and current, and estimated the output power of the photovoltaic array using a Kalman filter algorithm with a noise-suppressing polynomial model. A reduced output power indicates that the solar panels are not intact, but the specific defects cannot be identified. Tsuzuki K et al. [7] proposed to detect defects from the relationship between the voltage and current obtained on a specific semiconductor after a bypass diode or solar cell element is supplied with forward current or voltage. Esquivel [8] used contrast-enhanced illumination to detect crack defects in solar panels; this method decides whether a defect is present from the difference in light reflection between a good panel and a defective panel. Tsai D M et al. [9] proposed an anisotropic diffusion technique that uses gray-level features and gradients to adjust the magnitude of the diffusion coefficients, and identified the enhanced crack faults on solar panels from the difference images, together with a Fourier image reconstruction method. Although the above methods can identify whether a solar panel is defective, or identify individual categories of defects, they cannot identify multiple categories of defects at the same time, and their detection accuracy and efficiency are low.

With the advance of intelligent technology, deep learning detection algorithms can identify more accurately and easily whether a solar panel is defective and which category the defect belongs to. These algorithms are broadly divided into two-stage and one-stage detection algorithms. A two-stage algorithm first extracts rough region proposals for the target and then classifies them. Examples include Mask R–CNN [10], Fast R–CNN [11], and Faster R–CNN [12]; although their miss rate is low, they suffer from complex computation and slow speed. A single-stage algorithm does not need a candidate region generation stage and obtains the result directly through a single detection pass, which greatly improves detection speed. In exchange, the detection accuracy of algorithms such as SSD [13], the YOLO (You Only Look Once) series [[14], [15], [16], [17]], and Retina Net [18] decreases slightly, but it remains within an acceptable range. In comparison, single-stage algorithms are more in line with the detection needs of industrial production, so this paper gives priority to a single-stage detection algorithm.

Tsai D M et al. [19] proposed to cluster defect-free image samples during training using a binomial tree clustering algorithm, which evaluates clusters based on the consistency metric of principal component analysis (PCA). The cluster with the lowest metric score is processed with the fuzzy C-means (FCM) algorithm, i.e., it is re-clustered into 2 new clusters. The distance between each cluster center and the input sample is then calculated to determine whether the input sample is defective. However, because it requires continuous manual re-classification, this method has a certain degree of complexity. Demant et al. [20] conducted crack detection based on the support vector machine (SVM) method. First, samples with and without cracks were used for training, and the feature vectors of the different sample types were recorded using a set of local descriptors and sent to the support vector machine for training. During testing, the feature vectors of the input images extracted with the local descriptors were sent into the SVM for comparison, and the presence or absence of crack defects could be determined. Because this method depends entirely on a set of local descriptors to judge defects, a wrong local descriptor causes a large error, which limits the method.

In view of the problems existing in the above defect detection methods, a solar panel defect detection algorithm YOLO v5-BDL model based on YOLO v5 algorithm is proposed. It enables the network to identify and classify a variety of defects, improve the accuracy of defect detection, reduce the rate of false detection and missed detection, and enable the network to combine target features more efficiently. Because this method mainly uses software for detection, there is no other large cost, so it solves the shortcomings of low efficiency and high cost of the traditional method. The details of the improvements are as follows:

  • 1. An attention mechanism with an adjusted convolution layer and activation function is used. It lets the whole network capture cross-channel information together with direction-aware and location-aware information over a larger attention range, and enhances its ability to absorb target features.
  • 2. Weights are inserted into the pyramid network to control the importance of bi-directional features and achieve higher-level feature fusion.
  • 3. A decoupled head is used in the output part to help the model reach a steady state faster and improve the accuracy of the network.

2. YOLO v5 algorithm introduction

YOLOv5 follows the overall layout of the YOLO series, which consists of four parts [21], as shown in Fig. 1.

Fig. 1. YOLO v5 network structure.

Input: It mainly consists of three parts. Mosaic data enhancement cuts four random photos of solar panel defects from the data set and splices them into one image, which achieves online enhancement of the data images. Adaptive anchor box calculation updates the anchor box size iteratively from the absolute difference between the prediction boxes and the labeled boxes, so as to adaptively compute the optimal anchor box values. Adaptive image scaling makes the most of the information within the receptive field of the network.
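As an illustration of adaptive image scaling, the sketch below computes the resized shape and the padding for one image. It assumes the common letterbox scheme, in which the image is scaled without distortion and padded only up to the next multiple of the network stride; the function name `letterbox_shape`, the target size of 640 and the stride of 32 are assumptions made for this example and are not taken from the paper.

```python
def letterbox_shape(height, width, target=640, stride=32):
    """Return (resized_hw, pad_hw, final_hw) for aspect-preserving scaling."""
    # Scale so that the longer side fits the target without distortion.
    ratio = min(target / height, target / width)
    new_h, new_w = round(height * ratio), round(width * ratio)
    # Pad only up to the next multiple of the stride, not to a full square.
    pad_h = (target - new_h) % stride
    pad_w = (target - new_w) % stride
    return (new_h, new_w), (pad_h, pad_w), (new_h + pad_h, new_w + pad_w)


resized, pad, final = letterbox_shape(1080, 1920)
print(resized, pad, final)
```

Padding to a stride multiple rather than to a full square keeps the amount of uninformative border small, which is the motivation usually given for this kind of scaling.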

Backbone: It consists of CBS, C3_X, and SPPF structures, which are used to achieve the extraction of target features, and the composition diagram of each module can be seen in Fig. 1. CBS can reduce the dimension of the image to a lower level in order to better retain the basic effective information. C3_X and SPPF can do as much as possible to streamline the network structure and reduce the computational effort while ensuring the accuracy approximation.

Neck: It is mainly composed of the C3_X_F structure and the FPN + PAN structure. Compared with C3 in the Backbone, C3_F in the Neck drops the shortcut and reduces the number of channels. The feature pyramid network [22] (FPN) uses a top-down path with lateral connections, that is, up-sampling. This design transmits high-level information well, but after passing through multiple layers the low-level pixel information becomes blurred. The path aggregation network [23] (PAN) uses bottom-up paths to deliver and enhance the low-level target information. Adding PAN after FPN compensates for the lost target localization information; the specific directions of the high-level and low-level feature flows can be seen in Fig. 2.

Fig. 2. FPN + PAN network structure.

Head: After a series of improvements, the YOLO v5 version uses CIoU_Loss as the bounding box loss function to optimize the relative proportions of the detection box. Non-maximum suppression (NMS) then removes redundant prediction boxes that overlap a higher-scoring box too strongly and keeps the best box.
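The NMS step can be sketched in a few lines. This is a minimal greedy version that uses plain IoU (not CIoU) as the overlap measure; the box format `(x1, y1, x2, y2)`, the threshold of 0.45 and the sample boxes are assumptions chosen for illustration.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.45):
    """Greedy NMS: keep the best box, drop others that overlap it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


# Two heavily overlapping boxes and one separate box (hypothetical values).
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.90, 0.80, 0.75]
print(nms(boxes, scores))
```

In this toy input the second box overlaps the first with an IoU of about 0.82, so only the first and third boxes survive.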

3. Model optimization

The YOLO series algorithms are divided into several classes of different depths and widths according to the size of the model, and the suffixes are recorded as s, m, l, and x, whose depth and width increase gradually. Because most of the target pixels of the solar panel defect detection data set are small, it is necessary to improve the feature recognition ability of the model. The weighted bi-directional feature pyramid network is used to replace the feature fusion network in the neck network to improve the reading ability of target features; the three-layer LCA is added to improve the target attention range; the output part is changed into a decoupled head, although the number of parameters will increase to a certain extent, it can make the network reach a stable state more quickly. The parameters of each layer of the improved YOLO v5-BDL algorithm are shown in Table 1.

Table 1. Overall parameters of YOLO v5-BDL.

Serial No. From Params Module Arguments
0 −1 3520 Conv [3, 32, 6, 2, 2]
1 −1 18560 Conv [32, 64, 3, 2]
2 −1 18816 C3 [64, 64, 1]
3 −1 73984 Conv [64, 128, 3, 2]
4 −1 115712 C3 [128, 128, 2]
5 −1 295424 Conv [128, 256, 3, 2]
6 −1 625152 C3 [256, 256, 3]
7 −1 1180672 Conv [256, 512, 3, 2]
8 −1 1182720 C3 [512, 512, 1]
9 [-1,6] 656896 SPPF [512, 512, 5]
10 −1 131584 Conv [512, 256, 1, 1]
11 −1 0 Upsample [None, 2, 'nearest']
12 −1 65794 BiFPN_Add2 [256, 256]
13 [-1,4] 296448 C3 [256, 256, 1, False]
14 −1 33024 Conv [256, 128, 1, 1]
15 −1 0 Upsample [None, 2, 'nearest']
16 −1 16514 BiFPN_Add2 [128, 128]
17 [-1,13,6] 74496 C3 [128, 128, 1, False]
18 −1 3088 LCA [128, 128]
19 −1 295424 Conv [128, 256, 3, 2]
20 −1 65795 BiFPN_Add2 [256, 256]
21 [-1,10] 296448 C3 [256, 256, 1, False]
22 −1 6160 LCA [256, 256]
23 −1 590336 Conv [256, 256, 3, 2]
24 −1 65794 BiFPN_Add2 [256, 256]
25 −1 1051648 C3 [256, 512, 1, False]
26 −1 24608 LCA [512, 512]
27 [18,22,26] 7342700 Decoupled_Detect [7, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
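The Params values of the Conv rows in Table 1 can be reproduced by hand. The sketch below assumes that each Conv module is a bias-free k × k convolution followed by batch normalization with one scale and one shift per output channel, as in the public YOLOv5 code; that composition is an assumption here, since the table itself only lists the totals.

```python
def conv_block_params(c_in, c_out, kernel):
    """Parameters of a bias-free k x k convolution followed by BatchNorm."""
    conv_weights = c_in * c_out * kernel * kernel
    batchnorm = 2 * c_out  # one scale and one shift per output channel
    return conv_weights + batchnorm


# (c_in, c_out, kernel) and the Params value listed in Table 1
rows = [
    ((3, 32, 6), 3520),       # layer 0
    ((32, 64, 3), 18560),     # layer 1
    ((64, 128, 3), 73984),    # layer 3
    ((512, 256, 1), 131584),  # layer 10
]
for args, listed in rows:
    print(args, conv_block_params(*args), listed)
```

For layer 0, for example, 3 · 32 · 6 · 6 = 3456 convolution weights plus 2 · 32 = 64 normalization parameters gives the listed 3520.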

3.1. Attentional mechanisms

The solar panel defect dataset contains some categories of small targets whose feature information is very easily lost during the convolution process of the YOLO v5 model. A channel attention method can therefore improve model accuracy by enhancing important features and suppressing unimportant ones. However, the 2D global pooling used by SE Attention [24] ignores the importance of location information, and the large-scale convolution used by CBAM [25] cannot capture long-range dependencies of location information, yet location information is of great importance for generating spatially selective attention maps. The Coordinate Attention mechanism (CA) [26] effectively addresses these shortcomings. In this work it is further improved and named LCA. A 3 × 3 convolution with a larger receptive field lets LCA combine position information and channel features over a larger range and obtain long-range dependencies with precise position information, which makes small pixel targets easier to identify. Because of the shortcomings of the sigmoid function, the activation function in LCA is changed to the Leaky ReLU function to reduce the calculation cost.

LCA attention is consistent with CA attention in the step of coordinate information embedding. It mainly improves the step of coordinate attention generation, and its contrast structure is shown in Fig. 3 (a) and (b).

Fig. 3. Attention mechanism. (a) CA; (b) LCA.

The coordinate information embedding part corresponds to the X Avg Pool and Y Avg Pool parts of Fig. 3. For the horizontal direction X and the vertical direction Y of the input, a pooling kernel is used to encode each channel, where H and W denote the height and width of the channel, respectively. The output of the c-th channel at height h and at width w is expressed by Eqs. (1), (2) [26].

$$z_c^h(h) = \frac{1}{W} \sum_{0 \le i < W} x_c(h, i) \tag{1}$$
$$z_c^w(w) = \frac{1}{H} \sum_{0 \le j < H} x_c(j, w) \tag{2}$$

These two transformations aggregate features along the two spatial directions and together yield a pair of direction-aware feature maps.
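A small worked example may make the two pooling directions of Eqs. (1), (2) concrete. The sketch below handles a single channel stored as a nested list; the function name `coordinate_pool` and the 2 × 3 input are illustrative assumptions.

```python
def coordinate_pool(channel):
    """Average one H x W channel along the width and along the height."""
    height, width = len(channel), len(channel[0])
    # z^h: one value per row h, averaged over the W columns (Eq. (1)).
    z_h = [sum(row) / width for row in channel]
    # z^w: one value per column w, averaged over the H rows (Eq. (2)).
    z_w = [sum(channel[j][w] for j in range(height)) / height for w in range(width)]
    return z_h, z_w


x_c = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]
z_h, z_w = coordinate_pool(x_c)
print(z_h)
print(z_w)
```

Each output keeps the position along one axis while averaging out the other, which is how the embedding preserves location information that a single global pooling would discard.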

The coordinate attention generation part concatenates the two feature maps produced by the coordinate information embedding part and applies a shared 1 × 1 convolutional transformation F1. After the concatenation, batch normalization and related processing, the spatial information is encoded into an intermediate feature map f. Then f is split along the spatial dimension into two independent tensors, fh for the horizontal direction and fw for the vertical direction. After the split, 3 × 3 convolutions transform the two tensors so that their channel numbers match, and finally the channels are fused to generate the attention map.

3.2. Bi-directional feature pyramid network

The FPN + PAN combination constitutes a PANet, as shown in Fig. 4(a). It overcomes the drawback that the FPN structure is limited by unidirectional information flow and blurs the low-level information. In essence, however, it still simply adds features of different pixel sizes, so large-size features are used more while small-size features are neglected. For this reason, the network structure is improved with a weighted bi-directional feature pyramid [27] (BiFPN), whose structure is shown in Fig. 4(b). BiFPN has four advantages: it adaptively adjusts the contribution of features of different importance through dynamic learnable weight factors; it deletes nodes with a single input edge, which perform no feature fusion, to make the network lighter; it adds cross-scale connections between input and output nodes to fuse more shallow information; and it places the bidirectional path feature module into the network for higher-level feature fusion.

Fig. 4. Structure diagram of PANet and BiFPN. (a) PANet; (b) BiFPN.

The weighted fusion in BiFPN uses fast normalized feature fusion, which is calculated by Eq. (3) [27]:

$$O = \sum_i \frac{w_i}{\varepsilon + \sum_j w_j} \cdot I_i \tag{3}$$

Here I_i is the i-th input feature and w_i is its learnable weight. A ReLU is applied to guarantee w_i ≥ 0, and training speed improves greatly because no softmax operation is used. A small value ε = 0.0001 ensures numerical stability. Taking node P6 as an example, where P6td is the intermediate node and P6out is the output node, the expressions are given by Eqs. (4), (5) [27]:

$$P_6^{td} = \mathrm{Conv}\left(\frac{w_1 \cdot P_6^{in} + w_2 \cdot \mathrm{Resize}(P_7^{in})}{w_1 + w_2 + \varepsilon}\right) \tag{4}$$
$$P_6^{out} = \mathrm{Conv}\left(\frac{w_1 \cdot P_6^{in} + w_2 \cdot P_6^{td} + w_3 \cdot \mathrm{Resize}(P_5^{out})}{w_1 + w_2 + w_3 + \varepsilon}\right) \tag{5}$$

Resize means up-sampling or down-sampling; Conv is a depthwise separable convolution that contains batch normalization and activation functions.
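The fast normalized fusion of Eq. (3) can be illustrated on plain lists. The sketch below leaves out the Resize and Conv operations of Eqs. (4), (5) and only shows the weighting; the function name and the sample vectors are assumptions for this example.

```python
def fast_normalized_fusion(inputs, weights, eps=1e-4):
    """Fuse equally shaped feature vectors with ReLU-clamped, normalized weights."""
    positive = [max(0.0, w) for w in weights]  # ReLU keeps every weight >= 0
    denom = eps + sum(positive)
    length = len(inputs[0])
    return [
        sum(w * feat[k] for w, feat in zip(positive, inputs)) / denom
        for k in range(length)
    ]


p_in = [1.0, 2.0, 3.0]
p_td = [3.0, 2.0, 1.0]
fused = fast_normalized_fusion([p_in, p_td], weights=[1.0, 1.0])
print(fused)
```

With equal weights the result is close to the element-wise mean of the two inputs; a negative weight is clamped to zero, so that input simply drops out of the fusion.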

In YOLO v5-BDL, the PAN module in the feature pyramid network of YOLO v5 is replaced by BiFPN combined with the add fusion method, which speeds up detection while retaining more semantic features.

3.3. Detection head

In the target detection task, the classification branch and the regression branch overlap during processing, so contradictions and conflicts between them are unavoidable [28,29]. The outputs of YOLO v3 to v5 use a coupled head, in which the classification and regression tasks share one convolution, and this affects both the detection performance of the overall model and the time it needs to reach an equilibrium state. In YOLO X the original detection head is decoupled and called the decoupled head; the operational difference between the two output heads can be seen in Fig. 5. The decoupled head first reduces the channel dimension of the P3, P4, and P5 feature fusion maps to 256 with a 1 × 1 convolution. It then adds two parallel branches, each with 3 × 3 convolution layers, one for classification and the other for regression; the 3 × 3 convolutions are depthwise separable convolutions, which reduces the number of parameters. The regression branch is further divided by two parallel 1 × 1 convolutional layers into a localization branch and a confidence branch. In this way the classification, localization and confidence tasks are each completed by an independent detection layer [29]. Because the Decoupled Head adds a new convolutional layer and two branches compared with the original Coupled Head, it increases the size of each layer. However, the expressive power of the algorithm is better, detection is more accurate, and the model converges faster and more stably. Weighing speed against performance, the Coupled Head at the output of YOLO v5 is replaced by the Decoupled Head.
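The difference between the two heads can be summarized by their output channels per detection scale. The sketch below assumes an anchor-based layout with 3 anchors per scale (the three width-height pairs per scale in Table 1) and the 7 defect classes of this paper; whether the authors' Decoupled_Detect module uses exactly this channel layout is an assumption.

```python
def head_output_channels(num_classes, num_anchors):
    """Output channels per scale for a coupled head and for its decoupled split."""
    # Coupled head: one convolution predicts 4 box values, 1 confidence
    # value and all class scores for every anchor.
    coupled = num_anchors * (num_classes + 5)
    # Decoupled head: each task gets its own output layer.
    decoupled = {
        "classification": num_anchors * num_classes,
        "localization": num_anchors * 4,
        "confidence": num_anchors * 1,
    }
    return coupled, decoupled


coupled, decoupled = head_output_channels(num_classes=7, num_anchors=3)
print(coupled, decoupled, sum(decoupled.values()))
```

Both designs emit the same number of predictions; the decoupled head only changes which layers produce them, which is why it adds parameters without changing the output format.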

Fig. 5. Structure diagram of coupling head and decoupled head.

4. Results

4.1. Data set introduction

There are 4964 images in the solar panel defect detection data set, which brings together 4464 images from the PVELAD data set jointly released by Hebei University of Technology and Beijing University of Aeronautics and Astronautics and 4464 private solar panel defect images. The categories with too few images in PVELAD were deleted, and the dataset was divided into 7 categories, shown in the order of Fig. 6 (a), (b), (c), (d), (e), (f), (g): crack, finger, black_core, material, short_circuit, thick_line, horizontal_dislocation. In defect detection, a sufficiently large dataset is commonly divided into a training set and a test set at a ratio of 9:1; here the training set contains 4467 images and the test set contains 497. The defects in the images were labeled with the LabelImg software.
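A 9:1 split of 4964 images that yields the reported 4467 and 497 images can be sketched as follows. The shuffling, the fixed seed and the file stems are assumptions for illustration; the paper does not describe how the images were assigned to the two sets.

```python
import random


def split_dataset(items, train_fraction=0.9, seed=0):
    """Shuffle a copy of the items and split it into training and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)  # rounds down
    return shuffled[:n_train], shuffled[n_train:]


image_ids = [f"img_{i:04d}" for i in range(4964)]  # hypothetical file stems
train, test = split_dataset(image_ids)
print(len(train), len(test))
```

Rounding the training share down (4964 × 0.9 = 4467.6) reproduces the sizes stated above.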

Fig. 6. Types of solar panel defects. (a) Crack; (b) finger; (c) black_core; (d) material; (e) short_circuit; (f) thick_line; (g) horizontal_dislocation.

4.2. YOLO v5 model training

4.2.1. Related configuration

The hardware environment for solar panel defect detection is:

  • (1) Processor CPU model: Intel(R) Xeon(R) Silver 4210 CPU @2.20 GHz;
  • (2) Computer memory: 8 GB;
  • (3) GPU model: NVIDIA Quadro P4000.

The software environment is:

  • (4) Operating system: Windows 10 64-bit;
  • (5) Development language: Python 3.7, PyTorch 1.7.1 framework;
  • (6) Software library files: numpy 1.21.5, opencv 4.5.5, CUDA 10.2.89, tensorboard 2.9.0.

4.2.2. YOLO v5 model hyperparameter setting

The solar panel defect detection images are trained with the YOLOv5 model and the mini-batch stochastic gradient descent (SGD) algorithm. The main parameter settings are shown in Table 2.

Table 2. Model hyperparameter settings.

Parameter Name Parameter Value
Momentum 0.937
Weight_decay 0.0005
Batch_size 16
Learning_rate 0.01
Epochs 100
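To give a feel for the training length implied by Table 2, the sketch below derives the number of weight updates from the batch size, the number of epochs and the training-set size of Section 4.1. It assumes that the last, incomplete batch of each epoch is still used; the dictionary keys simply mirror the names in Table 2.

```python
import math

hyperparameters = {
    "momentum": 0.937,
    "weight_decay": 0.0005,
    "batch_size": 16,
    "learning_rate": 0.01,
    "epochs": 100,
}

train_images = 4467  # training-set size reported in Section 4.1
steps_per_epoch = math.ceil(train_images / hyperparameters["batch_size"])
total_steps = steps_per_epoch * hyperparameters["epochs"]
print(steps_per_epoch, total_steps)
```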

4.3. Testing model evaluation index

In the YOLO series models, the main evaluation indexes are precision (P), recall (R), average precision (AP) and mean average precision (mAP). Within a reasonable range, the higher the better. P is the ratio of correctly predicted positive samples to all samples predicted as positive; R is the ratio of correctly predicted positive samples to all samples that are actually positive; AP is defined as the area under the PR curve with R as the horizontal axis and P as the vertical axis; mAP is the average of AP over all categories and is used to evaluate the overall performance of the detection model. The confusion matrix in Table 3 classifies all test results into True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN).

Table 3. Confusion matrix.

Confusion Matrix Actual positive Actual negative
Predicted positive TP FP
Predicted negative FN TN

P, R, AP and mAP are calculated as shown in Eqs. (6), (7), (8), (9), where n denotes the number of defect categories.

$$P = \frac{TP}{TP + FP} \tag{6}$$
$$R = \frac{TP}{TP + FN} \tag{7}$$
$$AP = \int_0^1 P(R)\,dR \tag{8}$$
$$mAP = \frac{\sum_{j=1}^{n} AP(j)}{n} \tag{9}$$
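The four metrics can be computed as in the sketch below. It approximates the integral of Eq. (8) with the trapezoidal rule over a few sampled points, which is a simplification: detection toolkits usually interpolate the PR curve before integrating. All counts and curve points are made-up values for illustration.

```python
def precision(tp, fp):
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    return tp / (tp + fn) if tp + fn else 0.0


def average_precision(recalls, precisions):
    """Area under a P-R curve by the trapezoidal rule; recalls must be ascending."""
    area = 0.0
    for k in range(1, len(recalls)):
        width = recalls[k] - recalls[k - 1]
        area += width * (precisions[k] + precisions[k - 1]) / 2.0
    return area


def mean_average_precision(ap_per_class):
    return sum(ap_per_class) / len(ap_per_class)


print(precision(tp=90, fp=10), recall(tp=90, fn=30))
ap = average_precision([0.0, 0.5, 1.0], [1.0, 0.8, 0.6])
print(ap, mean_average_precision([ap, 1.0]))
```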

4.4. Error analysis

The confusion matrix of YOLO v5-BDL is shown in Fig. 7 and visualizes how often the model misidentifies each class. In addition to the seven defect categories, the figure includes a background category. The vertical axis of the matrix represents the category predicted by the model, and the horizontal axis represents the category actually labeled. The rightmost column gives the total probability of false detection for each class, and the bottom row gives the total probability of missed detection. Taking the crack class as an example, the prediction is correct 85% of the time, the defect is missed 14% of the time, and it is mis-detected as a finger 1% of the time. The main diagonal of the confusion matrix represents the probability that the model prediction is consistent with the label, so the closer the value is to 1, the better. All diagonal values are 0.85 or above and three categories reach 1, which reflects the effectiveness and accuracy of the improvement. The error rate of the YOLOv5-BDL model is relatively high only in the crack and finger categories, while the false detection and missed detection rates of the other categories are very small.
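The percentages read from such a matrix come from normalizing each column by its total, so that every actual class sums to 1. The sketch below assumes that layout (rows: predicted, columns: actual); the counts are hypothetical, and only the crack column is chosen to mirror the 85% / 1% / 14% split quoted above.

```python
def normalize_columns(counts):
    """Divide each entry by its column total (rows: predicted, columns: actual)."""
    n_rows, n_cols = len(counts), len(counts[0])
    totals = [sum(counts[r][c] for r in range(n_rows)) for c in range(n_cols)]
    return [
        [counts[r][c] / totals[c] if totals[c] else 0.0 for c in range(n_cols)]
        for r in range(n_rows)
    ]


labels = ["crack", "finger", "background"]
# Hypothetical counts; columns are the actual classes in the order of `labels`.
counts = [
    [85, 2, 0],   # predicted crack
    [1, 95, 0],   # predicted finger
    [14, 3, 0],   # predicted background (missed detections)
]
rates = normalize_columns(counts)
print([row[0] for row in rates])
```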

Fig. 7. Confusion matrix of YOLO v5-BDL.

4.5. Graphic analysis

During training, the same parameter settings and data set images are used for each model, and the input image resolution is 640 px × 640 px. From the indexes in the training log saved during training, the loss curve comparison in Fig. 8 and the mAP curve comparison in Fig. 9 are drawn after appropriate smoothing. A decreasing Loss value indicates that the model is converging, and a higher mAP indicates a more accurate model. As Fig. 8 shows, both curves flatten as the number of iterations increases, which indicates that both models gradually stabilize. YOLO v5-BDL reaches the same Loss value in fewer epochs, which means that it enters the stable stage faster, its prediction boxes deviate less from the ground-truth boxes, and it converges better. As the mAP plot in Fig. 9 shows, training accuracy rises rapidly as the number of iterations increases. The mAP value of the YOLO v5-BDL network is always higher than that of the YOLO v5 model, indicating that the improved network performs better and has stronger detection ability on the solar panel data set.

Fig. 8. Model Loss curve comparison diagram.

Fig. 9. Model mAP curve comparison diagram.

4.6. Comparison experiments

In addition to the comparison with the original YOLO v5 network, other target detection networks (SSD, Faster-RCNN, YOLO v3, YOLO v4, and YOLO v7) were trained and tested with the same solar panel defect detection dataset and the same model parameters. Table 4 lists the Precision (P), Recall (R), mean average precision (mAP), FPS and weight size of each model on the solar panel defect detection dataset (see Table 5).

Table 4. Comparison of experimental results.

Model P/% R/% mAP@0.5/% FPS/(f·s−1) Weights/MB
SSD 92.68 85.67 92.2 17.01 95.92
Faster-RCNN 66.83 71.46 73.18 7.27 111.01
YOLO v3 88.76 89.34 90.2 24.45 242.2
YOLO v4 94.55 85.52 92.4 23.57 250.39
YOLO v7 89.91 91.2 93.2 44.25 73.1
YOLO v5s 92.4 90.3 93 80.65 14
YOLO v5-BDL 93.9 92.7 95.5 53.92 26.93

Table 5. Ablation experimental results.

Model LCA Bi-Feature Pyramid Output Head mAP@0.5/% FPS/(f·s−1)
YOLO v5s × × × 93 80.65
Improved 1 √ × × 94.4 72.64
Improved 2 × √ × 93.4 81.3
Improved 3 × × √ 93.3 66.23
Improved 4 √ √ × 95 60.91
Improved 5 √ × √ 95 53.57
Improved 6 × √ √ 94.6 56.94
YOLO v5-BDL √ √ √ 95.5 53.92

Among the above networks, all models except Faster-RCNN are one-stage network models. Because the YOLO v5-BDL model extends the network structure of the YOLO v5 model, its weight file grows by 12.93 MB and its detection speed drops by 26.73 f·s−1, but Precision (P), Recall (R) and mean average precision (mAP) increase by 1.5%, 2.4% and 2.5%, respectively. Compared with the SSD, Faster-RCNN, YOLOv3, YOLOv4 and YOLOv7 models, mAP increases by 3.3%, 22.32%, 5.3%, 3.1% and 2.3%, respectively, the detection speed for a single image is higher, and the weight file is smaller. This comparison shows that the YOLOv5-BDL model improves detection accuracy while keeping memory use and other software costs low.
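The differences quoted above follow directly from Table 4 and can be checked with a few lines. The sketch copies the table values into a dictionary and also converts the FPS of YOLO v5-BDL into a per-image time; the names `table4` and `map_gain` are illustrative.

```python
table4 = {  # model: (P %, R %, mAP@0.5 %, FPS, weights MB), copied from Table 4
    "SSD": (92.68, 85.67, 92.2, 17.01, 95.92),
    "Faster-RCNN": (66.83, 71.46, 73.18, 7.27, 111.01),
    "YOLO v3": (88.76, 89.34, 90.2, 24.45, 242.2),
    "YOLO v4": (94.55, 85.52, 92.4, 23.57, 250.39),
    "YOLO v7": (89.91, 91.2, 93.2, 44.25, 73.1),
    "YOLO v5s": (92.4, 90.3, 93.0, 80.65, 14.0),
    "YOLO v5-BDL": (93.9, 92.7, 95.5, 53.92, 26.93),
}


def map_gain(model, reference="YOLO v5-BDL"):
    """mAP difference in percentage points between the reference and a model."""
    return round(table4[reference][2] - table4[model][2], 2)


for name in table4:
    if name != "YOLO v5-BDL":
        print(name, map_gain(name))

# Time per image in milliseconds at the reported frame rate.
latency_ms = round(1000 / table4["YOLO v5-BDL"][3], 2)
print(latency_ms)
```

At 53.92 f·s−1 a single image takes roughly 18.5 ms, which is the sense in which the detection time stays at the millisecond level.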

4.7. Ablation experiments

The ablation experiments in Table 5 verify the optimization effect of the improved modules by comparing model performance after each improved module is added to the network. "Improved X" denotes a network to which a particular combination of modules has been added; added modules are marked with "√" and modules that are not added are marked with "×". According to the experimental metrics, adding the attention mechanism alone increases mAP by 1.4% and decreases FPS by 8.01 f·s−1; modifying the feature pyramid structure alone increases mAP by 0.4% and increases FPS by 0.65 f·s−1; replacing the output head with the decoupled head alone increases mAP by 0.3% and decreases FPS by 14.42 f·s−1; adding the attention mechanism and modifying the feature pyramid structure increases mAP by 2% and decreases FPS by 19.74 f·s−1; adding the attention mechanism and the decoupled head increases mAP by 2% and decreases FPS by 27.08 f·s−1; modifying both the feature pyramid network and the output head increases mAP by 1.6% and decreases FPS by 23.71 f·s−1; and adding all three improvements increases the overall mAP of the model by 2.5% and decreases FPS by 26.73 f·s−1. This analysis shows that the optimized model as a whole identifies defects slightly more slowly and has larger weights, but it distinguishes solar panel defects better and has stronger recognition ability.

Fig. 10 shows the detection results of the models described above; the more valid detection boxes a model produces and the higher their confidence, the better the model. In terms of the number of detection boxes, Fig. 10 (b), (c) and (d) contain more valid detection boxes than Fig. 10 (a), and Fig. 10 (e) contains the most. In terms of confidence, the detection boxes in Fig. 10 (b), (c) and (d) mostly have higher confidence than those in Fig. 10 (a) and differ slightly from one another in line with their mAP values, while the detection boxes in Fig. 10 (e) have the highest confidence overall. This comparison indicates that YOLO v5-BDL detects solar panel defects more accurately than the other models.

Fig. 10. Detection results obtained under different models. (a) Original YOLO v5 model; (b) using CA attention mechanism and Bi-FPN model; (c) using CA attention mechanism and decoupling head model; (d) using Bi-FPN and decoupling head model; (e) YOLO v5-BDL model.

5. Conclusion

Solar panel defects include both large pixel defects and small pixel defects. The large difference in pixel size can easily cause the model to ignore small pixel defects, resulting in missed detection, and defect categories of similar pixel size are easily confused by the model, resulting in false detection. To address such small pixel defect categories, a three-layer improved LCA structure is added to YOLO v5, the feature fusion network is changed to a bidirectional weighted structure, and the output part is changed to a decoupled head; the improved model is named YOLO v5-BDL. Compared with the original YOLO v5, the mAP of the YOLO v5-BDL algorithm improves by 2.5% to 95.5%, and its probability of missed and false detection is smaller than that of the other methods. Although the detection speed drops to 53.92 f·s−1, the detection time for a single image is still at the millisecond level, within the time range required by an industrial production line. Thus, if YOLO v5-BDL is applied to a solar panel assembly line, it can improve the defect extraction ability of the software compared with other methods without slowing down the overall process. The application on the assembly line only needs a high-definition camera to take pictures and transmit them to a computer for identification, so it requires no expensive hardware and is practical to deploy. Starting from the results of the confusion matrix and the detection speed, future work can further strengthen the real-time performance of checking whether solar panels are intact, focusing on classification between the two categories with high error rates and on reducing the model weight through a lightweight design.

Funding information

Industrial leading (key) project of Fujian Provincial Science and Technology Department (No.2021H0024).

Author contribution statement

Jing Huang: Conceived and designed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data.

Keyao Zeng: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper.

Zijun Zhang, Wanhan Zhong: Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data.

Data availability statement

The data that has been used is confidential.

Declaration of competing interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Jing Huang reports that financial support was provided by the Industrial leading (key) project of Fujian Provincial Science and Technology Department (No.2021H0024).

References

  • 1. Dincer I. Renewable energy and sustainable development: a crucial review. Renew. Sustain. Energy Rev. 2000;4(2):157–175.
  • 2. Pathipooranam P. An enhancement of the solar panel efficiency: a comprehensive review. Front. Energy Res. 2022:1090.
  • 3. Istratov A.A., Hieslmair H., Vyvenko O.F., Weber E.R., Schindler R. Defect recognition and impurity detection techniques in crystalline silicon for solar cells. Sol. Energy Mater. Sol. Cells. 2002;72(1–4):441–451.
  • 4. Duenas S., Perez E., Castan H., Garcia H., Bailon L. The role of defects in solar cells: control and detection defects in solar cells. 2013 Spanish Conference on Electron Devices. IEEE; 2013:301–304.
  • 5. Breitenstein O., Bauer J., Altermatt P.P., Ramspeck K. Influence of defects on solar cell characteristics. Solid State Phenom. 2010;156:1–10.
  • 6. Kang B., Kim S., Bae S., Park J. Diagnosis of output power lowering in a PV array by using the Kalman-filter algorithm. IEEE Trans. Energy Convers. 2012;27(4):885–894.
  • 7. Tsuzuki K., Murakami T., Yoshino T., et al. Inspection method and production method of solar cell module. U.S. Patent 6,271,462[P]. 2001-8-7.
  • 8. Esquivel O. Contrast imaging method for inspecting specular surface devices. U.S. Patent 6,433,867[P]. 2002-8-13.
  • 9. Tsai D., Chang C., Chao S. Micro-crack inspection in heterogeneously textured solar wafers using anisotropic diffusion. Image Vis. Comput. 2010;28(3):491–501.
  • 10. He K., Gkioxari G., Dollár P., Girshick R. Mask R-CNN. Proceedings of the IEEE International Conference on Computer Vision. 2017:2961–2969.
  • 11. Girshick R. Fast R-CNN. Proceedings of the IEEE International Conference on Computer Vision. 2015:1440–1448.
  • 12. Ren S., He K., Girshick R., et al. Faster R-CNN: towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 2015;28. doi: 10.1109/TPAMI.2016.2577031.
  • 13. Liu W., Anguelov D., Erhan D., Szegedy C., Reed S., Fu C., Berg A.C. SSD: single shot multibox detector. Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part I. Springer; 2016:21–37.
  • 14. Redmon J., Farhadi A. YOLOv3: an incremental improvement. arXiv preprint arXiv:1804.02767. 2018.
  • 15. Bochkovskiy A., Wang C.Y., Liao H.Y.M. YOLOv4: optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934. 2020.
  • 16. Ge Z., Liu S., Wang F., Li Z., Sun J. YOLOX: exceeding YOLO series in 2021. arXiv preprint arXiv:2107.08430. 2021.
  • 17. Redmon J., Divvala S., Girshick R., Farhadi A. You only look once: unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016:779–788.
  • 18. Lin T., Goyal P., Girshick R., He K., Dollár P. Focal loss for dense object detection. Proceedings of the IEEE International Conference on Computer Vision. 2017:2980–2988.
  • 19. Tsai D., Li G., Li W., Chiu W. Defect detection in multi-crystal solar cells using clustering with uniformity measures. Adv. Eng. Inf. 2015;29(3):419–430.
  • 20. Demant M., Welschehold T., Oswald M., Bartsch S., Brox T., Schoenfelder S., Rein S. Microcracks in silicon wafers I: inline detection and implications of crack morphology on wafer strength. IEEE J. Photovolt. 2015;6(1):126–135.
  • 21. Jiang P., Ergu D., Liu F., Cai Y., Ma B. A review of YOLO algorithm developments. Proc. Comput. Sci. 2022;199:1066–1073.
  • 22. Lin T., Dollár P., Girshick R., He K., Hariharan B., Belongie S. Feature pyramid networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017:2117–2125.
  • 23. Liu S., Qi L., Qin H., Shi J., Jia J. Path aggregation network for instance segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018:8759–8768.
  • 24. Hu J., Shen L., Sun G. Squeeze-and-excitation networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018:7132–7141.
  • 25. Woo S., Park J., Lee J., Kweon I.S. CBAM: convolutional block attention module. Proceedings of the European Conference on Computer Vision (ECCV). 2018:3–19.
  • 26. Hou Q., Zhou D., Feng J. Coordinate attention for efficient mobile network design. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021:13713–13722.
  • 27. Tan M., Pang R., Le Q.V. EfficientDet: scalable and efficient object detection. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020:10781–10790.
  • 28. Wu Y., Chen Y., Yuan L., Liu Z., Wang L., Li H., Fu Y. Rethinking classification and localization for object detection. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020:10186–10195.
  • 29. Song G., Liu Y., Wang X. Revisiting the sibling head in object detector. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020:11563–11572.


Articles from Heliyon are provided here courtesy of Elsevier
