Scientific Reports. 2023 Jul 21;13:11815. doi: 10.1038/s41598-023-39058-w

Research on stress curve clustering algorithm of Fiber Bragg grating sensor

Yisen Lin 1, Ye Wang 1, Huichen Qu 1, Yiwen Xiong 1
PMCID: PMC10362002  PMID: 37479882

Abstract

The analysis of the global stress distribution and state parameters of a building's main structure is an urgent problem in online health assessment of building structures. In this paper, a stress curve clustering algorithm for fiber Bragg grating (FBG) stress sensors based on density clustering is proposed. To address the high dimensionality and sparse sample space of sensor stress curves, the distance between samples is measured with an improved cosine similarity. To address the low efficiency and poor results of traditional clustering algorithms, a density clustering algorithm based on mutual nearest neighbors is used. Finally, the daily stress load characteristics of the sensors are classified, which provides a basis for constructing a mathematical analysis model of building health. The experimental results show that the proposed stress curve clustering method outperforms recent clustering algorithms such as HDBSCAN, CBKM, K-Means++, FINCH and NPIR, and is suitable for feature classification of FBG sensor stress curves.

Subject terms: Electrical and electronic engineering, Computer science

Introduction

Fiber Bragg grating (FBG) sensors are widely used in national defense, railway, chemical industry, environment, nuclear power, bridge and tunnel monitoring and other fields because of their low transmission loss, corrosion resistance, good insulation and resistance to electromagnetic interference [1, 2]. A building structure safety monitoring system can obtain large volumes of accurate stress and temperature data through FBG stress sensors. Because geological and environmental changes are a slow evolutionary process, it is not feasible to judge the safety of a building structure from monitoring data at a single point in time. Only by accumulating reliable data and analyzing it with a correct mathematical model can safety be predicted with high reliability. Using existing data mining and analysis technology, the daily stress load data of building stress sensors can be analyzed and its features extracted by a clustering algorithm, and a mathematical analysis model can then be constructed to evaluate and predict the overall health status of buildings. This is of great significance for building safety monitoring.

The FBG sensor is formed by using ultraviolet exposure to induce periodic changes in the refractive index of the fiber core, exploiting the photosensitivity of the fiber material [3–5]. The periodic refractive index distribution in the FBG reflects light of a certain wavelength, which is equivalent to forming a narrow-band filter in the optical fiber and gives the FBG its reflection spectrum. An FBG only reflects light at the Bragg wavelength, and the shift of the reflected Bragg wavelength is proportional to the temperature change and the strain.

The working principle of the sensor is shown in Fig. 1. When light from the source is coupled into the FBG through the optical fiber, the FBG selectively reflects light of a specific wavelength, and the reflected Bragg wavelength λB can be expressed as [3]

$$\lambda_B = 2 n_{\mathrm{eff}} \Lambda \tag{1}$$

Figure 1. Schematic diagram of working principle of FBG demodulation.

In Eq. (1), λB is the center wavelength of the reflected light, neff is the effective refractive index of the fiber, and Λ is the period of the fiber grating. It can be found from this that λB changes with the changes of neff and Λ, and the effective refractive index of the fiber and the period of the fiber grating are very easily affected by the ambient temperature and the stress of the fiber grating sensor. The offset ΔλB of the center wavelength of λB can be expressed as

$$\Delta\lambda_B = \lambda_B\left[(1 - P_e)\,\varepsilon + (\alpha + \xi)\,\Delta T\right] \tag{2}$$

In Eq. (2), ε is the axial strain of the fiber grating, α is the thermal expansion coefficient of the fiber, ξ is the thermal optical coefficient of the fiber, ΔT is the temperature change. Pe is called the effective photoelastic coefficient, and can be expressed as

$$P_e = \frac{n_{\mathrm{eff}}^2}{2}\left[P_{12} - \upsilon\,(P_{11} + P_{12})\right] \tag{3}$$

In Eq. (3), υ is the Poisson's ratio. For a fiber with a pure silica core and a boron dioxide doped cladding, P11=0.121, P12=0.27, υ=0.17. When the effective refractive index of the fiber is neff=1.46, the effective photoelastic coefficient is Pe=0.22. When ΔλB is measured by a fiber grating demodulator and the temperature is constant (ΔT=0), the axial strain ε follows from Eq. (2) and can be expressed as [3]

$$\varepsilon = \frac{1}{1 - P_e}\cdot\frac{\Delta\lambda_B}{\lambda_B} \tag{4}$$
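As a worked check of Eqs. (3) and (4), the following minimal sketch evaluates the effective photoelastic coefficient from the constants quoted above and converts a wavelength shift into axial strain. It assumes the usual form of Eq. (3) with a minus sign before the Poisson term, which is the form that reproduces Pe ≈ 0.22 from these constants, and it assumes ΔT = 0. The 1550 nm grating and 0.012 nm shift are hypothetical values chosen only for illustration.

```python
def photoelastic_coefficient(n_eff, p11, p12, nu):
    """Effective photoelastic coefficient P_e, Eq. (3)."""
    return (n_eff ** 2 / 2.0) * (p12 - nu * (p11 + p12))


def axial_strain(delta_lambda_b, lambda_b, p_e):
    """Axial strain from the Bragg wavelength shift, Eq. (4), assuming no temperature change."""
    return delta_lambda_b / (lambda_b * (1.0 - p_e))


# Constants quoted in the text.
p_e = photoelastic_coefficient(n_eff=1.46, p11=0.121, p12=0.27, nu=0.17)

# Hypothetical reading: a 1550 nm grating whose center wavelength moved by 0.012 nm.
strain = axial_strain(delta_lambda_b=0.012, lambda_b=1550.0, p_e=p_e)
print(f"P_e = {p_e:.3f}, strain = {strain * 1e6:.1f} microstrain")
```

Because the wavelength shift enters as the ratio ΔλB/λB, both wavelengths only need to be in the same unit.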

Related work

Stress data processing method of FBG sensor

There are many methods for structural health monitoring in different scenarios. Kahandawa et al. [3] proposed a fixed filter decoding system and an integrated artificial neural network algorithm for extracting strain from embedded FBG sensors. Building on the Kalman filter, Song et al. [4] took the strain value measured by the FBG sensor as the observed signal. Using the gain matrix, new information sequence and covariance matrix generated by the Kalman filter, a least squares algorithm estimates the load size in real time. However, this algorithm can only calculate the real-time local load of the building and cannot give the current overall health status of the building based on past sensor data. Zhang et al. [5] proposed a model reconstruction prediction algorithm based on PSO-SVR to achieve self-repair of the FBG sensor network in an SHM system. This algorithm enhances the reliability and survivability of the FBG-based SHM system when some FBG sensors fail. Hayle et al. [6] used a deep learning technique to accurately identify the Bragg wavelengths of FBGs with partially or fully overlapped spectra, improving the reliability and detection accuracy of the sensor system even as the number of overlapping FBG spectra increases. Jiang et al. [7] developed an FBG sensing-based structural health monitoring system for ancient Chinese Chuan-dou-type timber buildings that monitors structural deformation. The methods proposed by Zhang et al. [8] can accurately identify structural macro-strain mode shapes and are much more robust than traditional modal-parameter-based indexes for structural damage detection. Sierra-Pérez et al. [9] presented a damage detection methodology based on optimal baseline selection by means of clustering techniques, which uses hierarchical nonlinear PCA as a nonlinear modeling technique. Luckey et al. [10] proposed a conceptual explainable artificial intelligence framework that provides a basis for improving ML acceptance and transparency and therefore increases trust in ML algorithms implemented in SHM applications.

Most of the above methods used FBG sensors to monitor loads in different scenarios in real time, without assessing or predicting the overall health status of buildings from a global perspective. So this paper puts forward a scheme to solve this problem.

Clustering algorithms

Clustering is a commonly used unsupervised learning technique that can find hidden patterns in datasets. Its purpose is to divide the samples in a dataset into several disjoint subsets. The data of a fiber grating building stress sensor is usually collected at a fixed time interval, so the stress data collected from one sensing point takes the form of a curve. In this paper, the daily load stress curve of a stress sensor is taken as a sample, a clustering algorithm is used to analyze it, and a mathematical model of building health is built. Traditional clustering algorithms have low efficiency and poor clustering results [11–13]. For example, the result of the K-means algorithm [14] depends heavily on the selection of cluster centers. Its results often fall into a local optimum because of inappropriate initial cluster centers or the influence of noise and boundary points, it cannot adapt to non-convex data, and it also requires the number of clusters to be specified [15, 16]. CBKM [17] is an improved version of the K-means algorithm that improves clustering through better initialization and repeated runs. However, CBKM cannot correctly classify non-spherical data. The traditional density clustering algorithm DBSCAN [18] requires the scanning radius (eps) and the minimum number of included points (minPts) to be specified for each dataset. Researchers have to test values one by one, and the parameters are difficult to tune [19–22], which results in low clustering efficiency. HDBSCAN [23] is a hierarchical algorithm. It builds a clustering hierarchy, constructs a cluster tree from it, and then extracts clusters from the optimal local cuts through that tree. A state-of-the-art hierarchical algorithm is FINCH [24], in which an adjacency matrix is defined according to the clustering equation to link the points of the dataset.
Among proximity-based algorithms, the FastDP algorithm [25] uses a fast and generic construction of an approximate k-nearest neighbor graph to improve on the quadratic time complexity of the DPC algorithm [26]. Another recent algorithm based on DPC is NPIR [27], which uses nearest neighbor search to solve clustering problems. It requires three parameters: the number of clusters, the index ratio and the number of iterations.

These algorithms cannot adequately capture the underlying patterns and variation characteristics of fiber stress data, and cannot provide good support for a mathematical analysis model of building health. The density clustering algorithm based on mutual nearest neighbors proposed in this paper addresses these shortcomings of traditional clustering algorithms for FBG building stress data.

Proposed method

Density clustering algorithm based on mutual nearest neighbor

This subsection describes the process of density estimation clustering based on mutual nearest neighbors. Let k be a positive integer giving the number of neighbors, and let d(x, y) denote the distance between x and y. Given a sample set S, the set of k nearest neighbors of a sample point x, that is, the set of the k sample points closest to x, is expressed as:

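$$NN_k(x) = \left\{ s_1, \dots, s_k \in S \;\middle|\; d(s_i, x) \le d(z, x)\ \ \forall z \in S \setminus \{s_1, \dots, s_k\} \right\} \tag{5}$$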

The k-inverse neighbor set of sample point x is expressed as:

$$RNN_k(x) = \left\{ z \in S \;\middle|\; x \in NN_k(z) \right\} \tag{6}$$

The k-mutual neighbor set of sample point x is expressed as:

$$MNN_k(x) = NN_k(x) \cap RNN_k(x) \tag{7}$$
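The three neighbor sets can be illustrated with a short brute-force sketch. It is not the authors' implementation: it uses Euclidean distance via `math.dist` for simplicity (the paper uses a cosine-based measure for stress curves), and the function names and toy points are hypothetical.

```python
import math


def knn_sets(points, k):
    """NN_k(x): indices of the k points closest to each point, as in Eq. (5)."""
    nn = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        nn.append(set(others[:k]))
    return nn


def reverse_knn_sets(nn):
    """RNN_k(x): points z that list x among their own k nearest neighbors, Eq. (6)."""
    rnn = [set() for _ in nn]
    for z, neighbors in enumerate(nn):
        for x in neighbors:
            rnn[x].add(z)
    return rnn


def mutual_knn_sets(nn, rnn):
    """MNN_k(x): intersection of NN_k(x) and RNN_k(x), Eq. (7)."""
    return [a & b for a, b in zip(nn, rnn)]


# Two tight groups of three points and one isolated point (index 6).
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
    (2.5, 9.0),
]
nn = knn_sets(points, k=2)
rnn = reverse_knn_sets(nn)
mnn = mutual_knn_sets(nn, rnn)
print(mnn)
```

In this toy example no other point lists the isolated point among its two nearest neighbors, so its mutual neighbor set is empty, which is the condition the algorithm uses to label noise.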

Algorithm 1 describes the overall clustering algorithm, which is improved from [28], followed by a detailed discussion of its time complexity.

Algorithm 1 (pseudocode figure, not reproduced): overall clustering algorithm.

Algorithm 2 (pseudocode figure, not reproduced): MakeClusterBackbones.

Algorithm 3 (pseudocode figure, not reproduced): MergeNonBackbonesPoints.

Algorithm 1 consists of the following two stages:

(1) The first stage:

The points P are sorted in descending order of |RNN_k(x)|.

Points in the dataset with |MNN_k(p_i)| = 0 are labeled as noise points.

MakeClusterBackbones (Algorithm 2): After removing the noise points with |MNN_k(p_i)| = 0, the remaining points are divided into backbone points and non-backbone points. Backbone points are used to construct the skeleton of a cluster, and non-backbone points are processed in the second stage. When two backbone points are mutual neighbors, they are connected to form the skeleton of the same cluster.

(2) The second stage:

MergeNonBackbonesPoints (Algorithm 3): For each non-backbone point w_i, find the cluster that contributes the most backbone points to MNN_k(w_i). If the number of backbone points of that cluster in MNN_k(w_i) is greater than or equal to k/5, w_i is assigned to that cluster. Otherwise, w_i is marked as noise.
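The two stages can be sketched as follows. This is an illustrative reading of the description above, not the published pseudocode: the rule that separates backbone from non-backbone points is not given in the text, so the sketch takes the backbone set as an input, and the function names, the union-find structure and the noise label -1 are assumptions.

```python
from collections import Counter


def make_cluster_backbones(mnn, backbone):
    """Stage 1: connect backbone points that are mutual neighbors (union-find)."""
    parent = {p: p for p in backbone}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving keeps the trees shallow
            p = parent[p]
        return p

    for p in backbone:
        for q in mnn[p]:
            if q in parent:  # only backbone-to-backbone links build the skeleton
                parent[find(p)] = find(q)
    return {p: find(p) for p in backbone}


def merge_non_backbone_points(mnn, labels, k):
    """Stage 2: attach each remaining point to the cluster with the most backbone
    points among its mutual neighbors if that count is at least k/5, else mark it -1."""
    result = dict(labels)
    for w in range(len(mnn)):
        if w in labels:
            continue
        votes = Counter(labels[q] for q in mnn[w] if q in labels)
        if votes:
            cluster, count = votes.most_common(1)[0]
            if count >= k / 5:
                result[w] = cluster
                continue
        result[w] = -1  # noise
    return result


# Hypothetical mutual-neighbor sets for seven points, with k = 5.
mnn = [{1, 2}, {0, 2}, {0, 1, 3}, {2}, {5}, {4}, set()]
backbone = {0, 1, 2, 4, 5}
labels = make_cluster_backbones(mnn, backbone)
result = merge_non_backbone_points(mnn, labels, k=5)
print(result)
```

Here point 3 has one backbone neighbor, which meets the k/5 = 1 threshold, so it joins the cluster of points 0 to 2, while point 6 has no mutual neighbors and stays noise.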

Time complexity

Suppose that the number of points in the dataset is N, the number of nearest neighbors is k, the dimension of the data is D, and the number of clusters constructed is C. The time complexity of constructing the distance matrix is O(N log N) [28, 29]. In Algorithm 1, it takes k iterations to calculate both NN_k(p_i) and RNN_k(p_i) of each point, so the time complexity of this part is O(Nk) [30]. The time complexity of the remaining part of Algorithm 1 is as follows:

  1. Line 8: sorting the list P in descending order of |MNN_k(p_i)| takes O(N log N).

  2. Line 11: the complexity of MakeClusterBackbones (Algorithm 2) is O(|S||U| log k), where |U| is an upper limit on the number of generated clusters [31]. In the worst case, O(|U|) = O(|S|) and O(|S|) = O(N), so the overall complexity is O(N² log k).

  3. Line 12: the complexity of MergeNonBackbonesPoints (Algorithm 3) is O(|W||U|k) [31]. By the same analysis, the complexity is O(N²k).

In summary, the overall time complexity of the density clustering algorithm is O(N²k).

Pattern recognition algorithm for stress curves of fiber bragg grating sensors

Figure 2 shows the flow of the pattern recognition algorithm for the stress curves of fiber grating building stress sensors based on the mutual neighbor clustering algorithm. The clustering step uses the algorithm described in the subsection "Density clustering algorithm based on mutual nearest neighbor".

Figure 2. Pattern recognition algorithm flow of the stress curve.

Firstly, the daily load data set of the building stress sensors is obtained. The data set is a matrix of P samples, each with Q time point attributes. The building health monitoring system collects the sample data through fiber grating sensors installed in the building, and each sample is the daily stress load of one sensor.

Secondly, the acquired daily load data of the building stress sensors is preprocessed to obtain an initial cluster. The preprocessing includes missing value processing, data standardization, data regularization and data dimensionality reduction. After this step, the initial cluster is a dataset matrix of p samples with q attributes per sample.

Specifically, the missing value processing is to delete the samples with few valid values, and complete the missing values of the samples with many valid values. Of course, when deleting attributes with few valid values, redundant attributes can be deleted together.

In the process of deleting samples, if n samples are deleted, p samples remain, where p = P − n. There are various ways to fill in the missing values. In this paper, the mean of the existing valid values is used as the filling value. Other filling methods can be chosen without affecting the subsequent analysis.
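The missing value step can be sketched as below. The text does not state how many valid values a sample needs in order to be kept, nor whether the mean is taken per sample or per attribute, so the 0.8 threshold (`min_valid_ratio`) and the per-sample mean are assumptions made for illustration. Missing readings are represented as NaN.

```python
import math


def clean_missing(samples, min_valid_ratio=0.8):
    """Drop samples with too few valid readings and mean-fill the others.

    min_valid_ratio is a hypothetical threshold; the source gives no value.
    """
    cleaned = []
    for curve in samples:
        valid = [v for v in curve if not math.isnan(v)]
        if len(valid) / len(curve) < min_valid_ratio:
            continue  # too few valid values: delete the sample
        mean = sum(valid) / len(valid)
        cleaned.append([mean if math.isnan(v) else v for v in curve])
    return cleaned


nan = float("nan")
raw = [
    [1.0, 2.0, nan, 3.0, 2.0],  # 4 of 5 valid: kept, gap filled with the mean 2.0
    [nan, nan, nan, 1.0, nan],  # 1 of 5 valid: deleted
    [4.0, 4.0, 4.0, 4.0, 4.0],  # complete: kept unchanged
]
cleaned = clean_missing(raw)
print(len(raw) - len(cleaned), "sample(s) deleted")
```

With P = 3 input samples and n = 1 deleted, p = P − n = 2 samples remain, matching the bookkeeping in the paragraph above.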

Data normalization is to linearize the stress part data in the original data into the range of [− 1, 1]. The calculation formula of the maximum-minimum normalization is:

$$X_{\mathrm{norm}} = \frac{1}{1 - e^{-X}} \tag{8}$$

This formula realizes the proportional scaling of the original data, where Xnorm is the normalized data, and X is the original data.

Data regularization subtracts the mean of each attribute from that attribute and then divides by the attribute's standard deviation. After standardization and regularization, the data of each attribute is centered on 0 with a variance of 1, that is, the resulting sample data has zero mean and unit variance.
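A minimal sketch of this per-attribute scaling is shown below. It divides by the standard deviation, which is what yields unit variance, and uses the population standard deviation; leaving a constant attribute at 0.0 is an assumption added to avoid division by zero.

```python
import statistics


def standardize_columns(matrix):
    """Give every attribute (column) zero mean and unit variance."""
    columns = list(zip(*matrix))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return [
        # Assumption: a constant attribute (std = 0) is mapped to 0.0.
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
        for row in matrix
    ]


data = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
scaled = standardize_columns(data)
for col in zip(*scaled):
    print(round(statistics.fmean(col), 6), round(statistics.pvariance(col), 6))
```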

Data dimensionality reduction is to use PCA (Principal Component Analysis), to reduce the dimensionality of the data set, and obtain the processed dimensionality reduction cluster S. In the process of dimensionality reduction, if the number of dimensionality reduction is q, then the dimensionality reduction cluster S after dimensionality reduction is a dataset matrix with p samples and each sample has q attributes.

A clustering algorithm is used to classify the data in the data set S, so as to identify sensor groups with similar characteristics. Using the density estimation clustering algorithm based on mutual neighbors, the p samples in the dataset are divided into M classes, where M is a positive integer. Since the dimension of the sample data after dimensionality reduction is still large, in order to avoid the problem of sparse sample space, the distance between the samples is calculated by the improved cosine similarity:

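$$d(x, y) = \frac{x \cdot y}{|x|\,|y|} = \frac{\sum_{i=1}^{q} x_i y_i}{\sqrt{\sum_{i=1}^{q} x_i^2}\,\sqrt{\sum_{i=1}^{q} y_i^2}} \tag{9}$$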
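The measure in Eq. (9) can be computed directly, as in the sketch below. It implements the equation as written; any further modification implied by "improved" is not specified in the text. The `cosine_distance` helper, which turns the similarity into a dissimilarity for ranking nearest neighbors, and the handling of all-zero curves are assumptions.

```python
import math


def cosine_similarity(x, y):
    """Eq. (9): dot product divided by the product of the Euclidean norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # assumption: an all-zero curve is treated as unrelated
    return dot / (norm_x * norm_y)


def cosine_distance(x, y):
    """Assumed conversion to a dissimilarity: 0 for the same direction, 2 for opposite."""
    return 1.0 - cosine_similarity(x, y)


curve_a = [1.0, 2.0, 3.0]
curve_b = [2.0, 4.0, 6.0]   # same shape as curve_a, twice the amplitude
curve_c = [3.0, -1.0, 0.5]
print(cosine_similarity(curve_a, curve_b), cosine_similarity(curve_a, curve_c))
```

Because the measure depends only on the direction of the vectors, two stress curves with the same shape but different amplitude are treated as identical.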

The Calinski-Harabasz (CH) value is used to evaluate the clustering effect, and the result with the largest CH value is selected, and finally M classes of sensor clusters are obtained. The specific calculation process of Calinski-Harabasz (CH) is as follows:

$$CH = \frac{\operatorname{tr}B(h)/(h-1)}{\operatorname{tr}W(h)/(m-h)} \tag{10}$$

In the above formula, m represents the number of samples, h represents the number of clusters, trB(h) represents the trace of the inter-class dispersion matrix, and trW(h) represents the trace of the intra-class dispersion matrix. The larger the CH value, the closer the samples within a class and the more dispersed the samples between different classes, that is, the better the clustering result.
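The CH index can be computed from a labeled sample set as sketched below, taking m as the number of samples and h as the number of clusters, which is the reading under which the factors (h − 1) and (m − h) are the usual degrees of freedom. The sketch uses squared Euclidean distances to centroids, assumes at least two clusters and non-zero within-cluster scatter, and its function name and toy data are hypothetical.

```python
def calinski_harabasz(samples, labels):
    """CH = [tr(B) / (h - 1)] / [tr(W) / (m - h)] for m samples in h clusters."""
    m, dim = len(samples), len(samples[0])
    clusters = {}
    for x, lab in zip(samples, labels):
        clusters.setdefault(lab, []).append(x)
    h = len(clusters)

    def centroid(rows):
        return [sum(r[d] for r in rows) / len(rows) for d in range(dim)]

    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    overall = centroid(samples)
    tr_b = tr_w = 0.0
    for rows in clusters.values():
        c = centroid(rows)
        tr_b += len(rows) * sq_dist(c, overall)   # between-cluster dispersion
        tr_w += sum(sq_dist(r, c) for r in rows)  # within-cluster dispersion
    return (tr_b / (h - 1)) / (tr_w / (m - h))


samples = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = calinski_harabasz(samples, [0, 0, 1, 1])  # groups the nearby points together
bad = calinski_harabasz(samples, [0, 1, 0, 1])   # pairs up distant points
print(good, bad)
```

The partition that groups nearby points scores far higher than the one that pairs distant points, which is the property used to select the clustering result with the largest CH value.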

Experimental results

In order to verify the feasibility and superiority of the proposed algorithm, K-Means++, CBKM, HDBSCAN, FINCH, NPIR and the proposed algorithm are used to cluster the data sets with known classification labels and the actual collected data of fiber grating sensor, and then the clustering effect is analyzed.

Analysis of clustering results of known classification labeled data sets

Public standard datasets with known classification labels (which include A1, A2, Aggregation (Agg), Compound (Comp), D31, Flame, Jain, Pathbased (Pb), Spiral, Mouse, G2-2-10 (G2), R15, S1 and Vary density (Vd)) are used to evaluate the effectiveness of the algorithms. The Calinski-Harabasz (CH) metric is suited to situations where the actual classification labels are unknown. For datasets with known classification labels, the clustering effect can instead be measured by the Normalized Mutual Information (NMI), Adjusted Mutual Information (AMI), Adjusted Rand Index (ARI), Purity, Homogeneity and Completeness scores. Figures 3, 4, 5, 6, 7 and 8 show the results of the clustering experiments on datasets with classification labels.

Figure 3. Comparison of NMI scores in standard data sets.

Figure 4. Comparison of AMI scores in standard data sets.

Figure 5. Comparison of ARI scores in standard data sets.

Figure 6. Comparison of purity scores in standard data sets.

Figure 7. Comparison of homogeneity scores in standard data sets.

Figure 8. Comparison of completeness scores in standard data sets.

Analysis of clustering results of actual collected data

The experimental device platform used in this paper to collect fiber grating sensor data consists of fiber demodulator, FBG sensor, and stress generator. The experimental platform is shown in Fig. 9. The FBG sensor is placed in the stress generator device which can apply different types of force shocks to the FBG sensor through programming. The center wavelength of the FBG is demodulated through the GM8050C fiber demodulator to calculate the stress value. The GM8050C fiber demodulator has 4 channels, each of which can be connected to 16 FBG stress sensors in series.

Figure 9. Experimental platform.

In this paper, the stress curves of 4 FBG sensors are collected as data samples through the experimental platform. The collection interval is 2 min and the collection period is 24 h, so the daily load data of a stress sensor consists of 720 stress readings. Each sensor is sampled continuously for 3 months. The proposed algorithm, K-Means++, HDBSCAN, CBKM, FINCH and NPIR were used to cluster the daily load data of sensors S1, S2, S3 and S4 individually and the mixed data of the four sensors, and their clustering results were compared. The best Calinski-Harabasz (CH) index values obtained in the experiments are shown in Table 1.

Table 1. The CH results of the proposed algorithm and the other five algorithms.

| Sensor data | S1 | S2 | S3 | S4 | S1 + S2 + S3 + S4 |
|---|---|---|---|---|---|
| Proposed algorithm | 1271.57 | 1068.29 | 1156.61 | 978.39 | 1006.26 |
| HDBSCAN | 931.73 | 789.24 | 804.19 | 902.26 | 857.48 |
| K-Means++ | 765.28 | 727.53 | 786.04 | 704.24 | 752.86 |
| CBKM | 812.67 | 794.33 | 826.74 | 873.36 | 849.73 |
| FINCH | 1147.85 | 984.04 | 953.68 | 949.75 | 979.45 |
| NPIR | 998.52 | 1021.71 | 1045.64 | 930.83 | 985.72 |
| Rank (proposed algorithm) | 1 | 1 | 1 | 1 | 1 |
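The sampling scheme described for the collected data (one reading every 2 min over 24 h) fixes the length of each daily stress curve. The sketch below shows how a continuous reading stream from one sensor could be cut into such daily samples. The function name, the placeholder stream and the choice to discard a trailing partial day are assumptions, not details from the paper.

```python
SAMPLE_INTERVAL_MIN = 2
POINTS_PER_DAY = 24 * 60 // SAMPLE_INTERVAL_MIN  # 720 readings per daily curve


def split_into_daily_curves(readings):
    """Cut one sensor's continuous reading stream into complete daily curves."""
    full_days = len(readings) // POINTS_PER_DAY
    return [
        readings[d * POINTS_PER_DAY:(d + 1) * POINTS_PER_DAY]
        for d in range(full_days)
    ]


# Placeholder stream: three full days plus a partial fourth day, which is dropped.
stream = [float(i % POINTS_PER_DAY) for i in range(3 * POINTS_PER_DAY + 100)]
curves = split_into_daily_curves(stream)
print(len(curves), "daily curves of", POINTS_PER_DAY, "readings each")
```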

Result and discussion

Experimental results show that the proposed algorithm greatly outperforms the other algorithms on many datasets, such as A1, A2, Aggregation, Compound, D31, Pathbased, Mouse, R15 and S1. The performance of NPIR, HDBSCAN and CBKM on the Flame, Jain, Spiral, G2-2-10 and Vary density datasets is almost equal to that of the proposed algorithm. This is because the framework adopted by the proposed algorithm can distinguish real clusters in noisy data, even if they are connected to each other or overlap, provided that they have distinguishable densities. The clustering effect of HDBSCAN deteriorates when clusters have different densities because of the fixed neighborhood radius. Centroid-based algorithms such as K-Means++ and CBKM fail when the centroid of a cluster is closer to data points belonging to another cluster than to those of its own cluster. FINCH outputs incorrect clustering results when the merged cluster center is inappropriately located during hierarchical merging. NPIR performs poorly when the data is noisy or when clusters differ in density and overlap slightly. Calculating the MNN_k of each point in the proposed algorithm amounts to constructing a mutual neighbor graph, and the clustering is then completed by dividing it into subgraphs. The structure of the mutual nearest neighbor graph is determined only by the parameter k, and there are no hyperparameters related to the distance between points. Therefore, the algorithm still performs well when cluster densities differ or clusters are connected.

The last row of Table 1 shows the rank of the proposed algorithm among all six algorithms based on the Calinski-Harabasz (CH) index. The CH values of the proposed algorithm are greater than those of the other five algorithms in every case, which indicates that the proposed algorithm is suitable for clustering actual sensor daily load curves. Overall, the clustering effect of the proposed algorithm is better than that of the other five algorithms, and the data of the fiber grating stress sensor can be clustered with higher accuracy. However, the proposed algorithm still has two problems: (1) the k value for the nearest neighbors cannot yet be determined adaptively; (2) the time complexity of the algorithm is still unacceptably high when large samples need to be processed.

Conclusions

To address the clustering of stress curves from fiber grating stress sensors, this paper obtains an initial cluster by preprocessing the stress data of the fiber grating building stress sensors. The initial cluster is dimensionally reduced to obtain a dimensionality-reduced cluster. The dimensionality-reduced cluster is then clustered by the clustering algorithm based on mutual neighbors, a clustering effectiveness index is used to evaluate the clustering results, and finally sensor groups with similar characteristics are obtained. Cosine similarity is used to calculate the sample distance in the mutual neighbor clustering algorithm, which addresses the sparse sample space caused by the large dimension of the sample data. The experimental results show that, compared with the other five clustering algorithms, the proposed algorithm has a better pattern recognition effect.

Author contributions

Conceptualization, Y.L. and H.Q.; methodology, Y.L. and Y.W.; software, Y.W and Y.X.; validation, Y.L., H.Q., Y.W. and Y.X.; investigation, Y.L. and H.Q.; writing—original draft preparation, Y.L.; writing—review and editing, Y.L., H.Q. and Y.W.; visualization, Y.X.; project administration, Y.W.; funding acquisition, H.Q. and Y.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the 2021 Basic Scientific Research Ability Improvement Project for Young and Middle-aged Teachers of Universities in GuangXi (Grant No. 2021KY0794: Development and application of visual remote analysis system based on data acquisition of optical fiber sensor), the 2022 Basic Scientific Research Ability Improvement Project for Young and Middle-aged Teachers of Universities in GuangXi (Grant No. 2022KY0789: Research on logistics distribution route optimization based on electric vehicle), the 2023 Basic Scientific Research Ability Improvement Project for Young and Middle-aged Teachers of Universities in GuangXi (Grant No. 2023KY0814: Research on federated learning methods for non-independent homogeneous distribution scenarios), the 2021 Basic Scientific Research Ability Improvement Project for Young and Middle-aged Teachers of Universities in GuangXi (Grant No. 2021KY0797: Research on Alzheimer's disease classification algorithm based on deep learning), the 2020 Basic Scientific Research Ability Improvement Project for Young and Middle-aged Teachers of Universities in GuangXi (Grant No. 2020KY21023: Research on ambiguity and redundancy in tag recommendation system), and the school-level Scientific Research Project in Guilin University of Aerospace Technology (Grant No. XJ20KT19: Research on Optimization of electric vehicle logistics distribution).

Data availability

The datasets with known classification labels (which include A1, A2, Aggregation, Compound, D31, Flame, Jain, Pathbased, Spiral, Mouse, G2-2-10, R15, S1, Vary density) analysed during the current study are available at http://cs.joensuu.fi/sipu/datasets/ and https://elki-project.github.io/datasets/. The FBG sensor data analysed during the current study are not publicly available due to the Lab's Policy or Non-Disclosure Agreement but are available from the first author (Yisen Lin, email: linyisen@guat.edu.cn) on reasonable request.

Competing interests

The authors declare no competing interests.

Footnotes

The original online version of this Article was revised: The original Article contained errors in the main text and reference list. Modifications have been made to Algorithms 1 and 2 and in the section "Proposed method." Full information regarding the corrections made can be found in the correction for this Article.

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Change history

9/20/2023

A Correction to this paper has been published: 10.1038/s41598-023-42289-6

References

  • 1. Cheng Y, Wu CT, Liu HL. A repaired algorithm based on improved compressed sensing to repair damaged fiber bragg grating sensing signal. J. Electron. Inf. Technol. 2018;40:386–393.
  • 2. Tan QG, Hu Y, Lin P. A novel kind of multi-access interference cancellation scheme based on fiber Bragg gratings. J. Electron. Inf. Technol. 2007;29:696–698.
  • 3. Kahandawa GC, Epaarachchi J, Wang H, et al. Extraction and processing of real time strain of embedded FBG sensors using a fixed filter FBG circuit and an artificial neural network. Meas. J. Int. Meas. Confed. 2013;46(10):4045–4051. doi: 10.1016/j.measurement.2013.07.029.
  • 4. Song XG, Liu P, Cheng ZM, Wei Z, Yu JS, Huang JW, Liang DK. An algorithm of dynamic load identification based on FBG sensor and Kalman filter. Acta Optic. Sin. 2018;38:165–172.
  • 5. Zhang XL, Wang P, Liang DK, et al. A soft self-repairing for FBG sensor network in SHM system based on PSO-SVR model reconstruction. Optic. Commun. 2015;343:38–46. doi: 10.1016/j.optcom.2014.12.079.
  • 6. Hayle ST, Manie YC, Dehnaw AM, et al. Reliable self-healing FBG sensor network for improvement of multipoint strain sensing. Optic. Commun. 2021;499(1):127286. doi: 10.1016/j.optcom.2021.127286.
  • 7. Jiang SF, Qiao ZH, Li NL, Luo JB, Shen S, Wu MH, Zhang Y. Structural health monitoring system based on FBG sensing technique for Chinese ancient timber buildings. Sensors. 2019;20(1):110. doi: 10.3390/s20010110.
  • 8. Zhang J, Guo SL, Wu ZS, et al. Structural identification and damage detection through long-gauge strain measurements. Eng. Struct. 2015;99:173–183. doi: 10.1016/j.engstruct.2015.04.024.
  • 9. Sierra-Pérez J, Torres-Arredondo MA, Alvarez-Montoya J. Damage detection methodology under variable load conditions based on strain field pattern recognition using FBGs, nonlinear principal component analysis, and clustering techniques. Smart Mater. Struct. 2017. doi: 10.1088/1361-665X/aa9797.
  • 10. Luckey D, Fritz H, Legatiuk D, et al. Explainable artificial intelligence to advance structural health monitoring. Struct. Health Monitor. Based Data Sci. Techn. 2021. doi: 10.1007/978-3-030-81716-9_16.
  • 11. Lv YH, Ma TH, Tang ML, Cao J, Tian Y, Al-Dhelaan A, Al-Rodhaan M. An efficient and scalable density-based clustering algorithm for datasets with complex structures. Neurocomputing. 2016;171:9–22. doi: 10.1016/j.neucom.2015.05.109.
  • 12. Campello RJGB, Moulavi D, Zimek A, Sander J. Hierarchical density estimates for data clustering, visualization, and outlier detection. ACM Trans. Knowl. Discov. Data. 2015;10:1–51. doi: 10.1145/2733381.
  • 13. Liu L, Sun LT, Chen SP, Liu M, Zhong J. K-PRSCAN: A clustering method based on PageRank. Neurocomputing. 2016;175:65–80. doi: 10.1016/j.neucom.2015.10.020.
  • 14. Arthur, D., Vassilvitskii, S. K-Means++: The advantages of careful seeding. In Proc. of the eighteenth annual ACM-SIAM symposium on discrete algorithms, 1027–1035 (New Orleans, USA, 2007).
  • 15. Gao X, Hu ZM. Digital image clustering based on improved k-means algorithm. Chin. J. Liq. Cryst. Disp. 2020;35:173–179. doi: 10.3788/YJYXS20203502.0173.
  • 16. Rui X, Wunsch DI. Survey of clustering algorithms. IEEE Trans. Neural Netw. 2005;16:645–678. doi: 10.1109/TNN.2005.845141.
  • 17. Fränti P, Sieranoja S. How much can k-means be improved by using better initialization and repeats? Pattern Recognit. 2019;93:95–112. doi: 10.1016/j.patcog.2019.04.014.
  • 18. Ester M, Kriegel HP, Sander J, et al. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. AAAI Press; 1996.
  • 19. Gan, J.H., Tao, Y.F. DBSCAN revisited: Mis-Claim, un-fixability, and approximation. In Proc. of the 2015 ACM SIGMOD International Conference on Management of Data, 519–530 (New York, USA, 2015).
  • 20. Kumar KM, Reddy ARM. A fast DBSCAN clustering algorithm by accelerating neighbor searching using groups method. Pattern Recogn. J. Pattern Recogn. Soc. 2016;58:39–48. doi: 10.1016/j.patcog.2016.03.008.
  • 21. Cassisi C, Ferro A, Giugno R, Pigola G, Pulvirenti A. Enhancing density-based clustering: Parameter reduction and outlier detection. Inf. Syst. 2013;38:317–330. doi: 10.1016/j.is.2012.09.001.
  • 22. Schneider, J., Vlachos, M. On randomly projected hierarchical clustering with guarantees. April 24–26, 28–36 (Philadelphia, Pennsylvania, USA, 2014).
  • 23. McInnes L, Healy J, Astels S. Hdbscan: Hierarchical density based clustering. J. Open Sour. Softw. 2017;2(11):205. doi: 10.21105/joss.00205.
  • 24. Sarfraz, S., Sharma, V., Stiefelhagen, R. Efficient parameter-free clustering using first neighbor relations, In Proc. of the IEEE/CVF conference on computer vision and pattern recognition, 2019-June, 8926–8935 (IEEE, 2019).
  • 25. Sieranoja S, Fränti P. Fast and general density peaks clustering. Pattern Recogn. Lett. 2019;128:551–558. doi: 10.1016/j.patrec.2019.10.019.
  • 26. Alex R, Alessandro L. Clustering by fast search and find of density peaks. Science. 2014;344:1492–1496. doi: 10.1126/science.1242072.
  • 27. Qaddoura R, Faris H, Aljarah I. An efficient clustering algorithm based on the k-nearest neighbors with an indexing ratio. Int. J. Mach. Learn. Cybern. 2020;11(3):675–714. doi: 10.1007/s13042-019-01027-z.
  • 28. Abbas M, El-Zoghabi A, Shoukry A. DenMune: Density peak based clustering using mutual nearest neighbors. Pattern Recogn. 2021;109:107589. doi: 10.1016/j.patcog.2020.107589.
  • 29. Otair M. Approximate K-nearest neighbour based spatial clustering using K- D tree. Int. J. Database Manag. Syst. 2013;5(1):97–108. doi: 10.5121/ijdms.2013.5108.
  • 30. Wang X, Chen J, Yu J. Optimised quantisation method for approximate nearest neighbour search. Electron. Lett. 2017;53(3):156–158. doi: 10.1049/el.2016.2810.
  • 31. Lin Y, Zhang X, Liu L, Qu H. DEDIC: Density estimation clustering method using directly interconnected cores. IEEE Access. 2022;10:132031–132039. doi: 10.1109/ACCESS.2022.3229582.


Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
