Scientific Reports. 2026 Feb 27;16:11321. doi: 10.1038/s41598-026-41117-x

Gear fault diagnosis of wind turbine drivetrains using multi-source information fusion and ensemble learning in a simulation bench study

Xutao Kang 1, Luchuan Shao 1, Bing Zhao 1,
PMCID: PMC13049079  PMID: 41760719

Abstract

Traditional static filtering often fails to extract gear fault features under the variable speed conditions of wind turbines, while single-source signal analysis struggles against strong environmental noise. To address these limitations, this paper proposes a gear fault diagnosis framework integrating a synergistic closed-loop preprocessing mechanism with supervised multi-source information fusion. First, a “speed smoothing-dynamic filtering-envelope analysis” closed-loop mechanism is established to mitigate spectral ambiguity caused by non-stationary rotation. By dynamically adjusting band-pass filter parameters based on Gaussian-smoothed speed signals, this approach precisely locks onto fault feature frequency bands and significantly enhances the signal-to-noise ratio during envelope demodulation. Subsequently, a multidimensional feature system comprising time-domain statistics, frequency-domain energy, wavelet packet coefficients, and gear-specific meshing characteristics is constructed, utilizing Linear Discriminant Analysis (LDA) for supervised dimensionality reduction. Empirical analysis demonstrates that LDA utilizes fault label information more effectively than unsupervised alternatives like PCA or t-SNE, maximizing inter-class separability. Finally, an accuracy-weighted ensemble classifier is designed based on validation performance, integrating the decision-making strengths of SVM, KNN, and Random Forest models. Experimental validation on a high-fidelity wind turbine drivetrain test bench yields a diagnostic accuracy of 98.8% under complex variable speed conditions, outperforming existing single-source methods and conventional deep learning models while demonstrating superior robustness.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-026-41117-x.

Keywords: Gear fault diagnosis, Multi-source information, Feature engineering, Ensemble learning, Dimensionality reduction methods, SVM classifier

Subject terms: Energy science and technology, Engineering, Mathematics and computing

Introduction

As the core transmission components connecting the rotor and generator, wind turbine gearboxes operate under stochastic wind loads and harsh alternating stresses, rendering them highly susceptible to structural degradation. With the global wind energy sector transitioning into a post-subsidy era, the high failure rate of drivetrain components, which accounts for up to 65% of mechanical downtime [1], poses severe economic risks, necessitating precise fault diagnosis methodologies to ensure operational reliability and reduce maintenance overheads.

In the domain of rotating machinery diagnosis, significant progress has been made in signal processing and feature extraction. Classic techniques such as Minimum Entropy Deconvolution (MED) [2], Wavelet Transform [3], and various mode decomposition methods (EMD [4], VMD [5]) have been widely applied to enhance impulse features. Recent studies have further optimized these methods; for instance, Xiao et al. [6] utilized empirical wavelet transforms for combined faults, while Guan et al. [7] proposed multi-kernel denoising networks. Furthermore, systematic reviews indicate that compound faults, characterized by randomness and sequential coupling, represent a primary cause of unscheduled downtime [8]. To address the challenge of extracting weak fault features, advanced techniques integrating Enhanced MED with Adaptive Periodized Symplectic Geometry Mode Decomposition have demonstrated efficacy in separating coupled fault features [9]. Concurrently, deep learning approaches have been introduced to automate feature learning. These include Dual-source Adversarial Networks [10], optimized CNNs [11, 12], Transformer-based models [13-17], and domain adaptation schemes [18]. Comprehensive reviews on deep learning applications [19] and image-based wear particle identification methods like Mask R-CNN [20] further illustrate the trend towards intelligent diagnosis.

Despite the proliferation of multi-source information fusion frameworks, such as multi-level fusion [21], self-attention fusion networks [22], and time-domain multi-sensor fusion [23, 24], two critical deficiencies persist in the current literature and hinder their effectiveness in practical wind turbine environments. First, preprocessing mechanisms are often fragmented. Most studies [15, 25, 26] treat rotational speed and vibration signals independently or rely on static filtering that fails to track shifting fault sidebands during rapid speed fluctuations. This lack of dynamic feedback leads to spectral ambiguity when fault characteristic frequencies migrate outside fixed filter bandwidths. Second, existing feature engineering strategies largely focus on generic time-frequency statistics while neglecting physically significant gear-specific kinematic features, such as mesh frequency harmonics. Compounding this issue, widely used dimensionality reduction algorithms like PCA and t-SNE operate in an unsupervised manner, prioritizing global variance or local manifold structure while disregarding valuable fault class labels, which frequently results in blurred decision boundaries between distinct fault modes in the reduced feature space.

To overcome these limitations, this study proposes a systematic diagnostic framework for wind turbine drivetrains based on a synergistic closed-loop preprocessing mechanism and supervised multi-source fusion. Distinct from conventional open-loop processing, a “speed smoothing-dynamic filtering-envelope analysis” closed-loop module is constructed, leveraging Gaussian-smoothed speed trends to drive the real-time adjustment of band-pass filter parameters, thereby ensuring the precise capture of modulating fault sidebands under non-stationary conditions. Furthermore, a four-dimensional feature system integrating time-domain statistics, spectral energy, wavelet packet coefficients, and gear-specific meshing characteristics is established, utilizing Linear Discriminant Analysis (LDA) to maximize inter-class separability through supervised learning. Finally, an accuracy-weighted ensemble classifier aggregating SVM, KNN, and Random Forest models is developed to mitigate individual model bias and enhance generalization robustness.

Based on vibration signals and rotational speed signals of W-T planetary gear

The experimental dataset utilized in this study is the open-source “WT-Planetary Gearbox Dataset” released by Liu et al. [27]. As illustrated in Fig. 1, the experimental platform is a Drivetrain Dynamic Simulator (DDS) designed to simulate the working conditions of a wind turbine. The drivetrain consists of a driving motor, a planetary gearbox, a parallel-shaft (fixed-axis) gearbox, and a magnetic powder brake acting as the load device. The detailed geometric parameters of the planetary gearbox are listed in Table 1.

Fig. 1. Experimental prototype.

Table 1. Parameters of the planetary gearbox.

Parameter | Value
Number of sun gear teeth | 28
Number of internal (ring gear) teeth | 100
Number of planetary gear teeth (number of planetary gears) | 36 (4)
Mesh frequency | $(175/8)\,f_r$
Sun gear fault frequency | $(25/8)\,f_r$

$f_r$ denotes the rotational frequency of the sun gear.
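The two frequency ratios in Table 1 can be cross-checked against the tooth counts. The derivation below is a sketch that assumes the usual planetary arrangement with a stationary ring gear and the sun gear as input; the symbols $Z_s$, $Z_r$, $N_p$, and $f_r$ (sun gear rotational frequency) are introduced here for illustration and are not notation taken from the source.

```latex
% Assumed configuration: ring gear fixed, sun gear driving, N_p = 4 planets.
\begin{align}
  f_m &= \frac{Z_s Z_r}{Z_s + Z_r}\, f_r
       = \frac{28 \times 100}{28 + 100}\, f_r
       = \frac{175}{8}\, f_r \approx 21.9\, f_r , \\
  f_{\text{sun fault}} &= N_p \, \frac{f_m}{Z_s}
       = 4 \times \frac{175/8}{28}\, f_r
       = \frac{25}{8}\, f_r \approx 3.1\, f_r .
\end{align}
% Example: f_r = 40 Hz gives f_m = 875 Hz and a sun gear fault frequency of 125 Hz.
```

Both results agree with the tabulated values, which suggests that the 36 (4) entry refers to 36 teeth on each of 4 planet gears.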

Data Acquisition System: To capture the vibration characteristics of the gear transmission, piezoelectric accelerometers (Model: Sinocera CA-YD-1181) were employed. The sensors were mounted orthogonally (in both vertical and horizontal directions) on the housing of the planetary gearbox to acquire multi-channel vibration signals. Simultaneously, a rotary encoder was installed to record the instantaneous rotational speed. The sampling frequency for all channels was synchronized at 48 kHz.

Experimental Conditions: The experiment covers five distinct health states of the sun gear: (a) Healthy State, (b) Gear Tooth Breakage, (c) Gear Wear, (d) Tooth Root Crack, and (e) Gear Tooth Missing. The physical appearances of these fault gears are shown in Fig. 2. To verify the method’s robustness under variable speeds, data were collected under eight distinct motor input frequencies: 20 Hz, 25 Hz, 30 Hz, 35 Hz, 40 Hz, 45 Hz, 50 Hz, and 55 Hz, representing a comprehensive range of operating regimes.

Fig. 2. Gear states: (1) gear health, (2) gear tooth breakage, (3) gear wear, (4) tooth root crack, (5) gear tooth missing, (6) internal structure of the gearbox.

Signal Visualization: To intuitively demonstrate the signal characteristics, the raw time-domain waveforms of the five gear states (under 50 Hz condition) are presented in Fig. 3. As shown in the figure, distinct periodic impacts and amplitude modulation phenomena can be clearly observed in the fault signals compared to the healthy baseline.

Fig. 3. Time-domain vibration waveforms of the planetary gearbox under five different health states: (a) healthy state, (b) gear tooth breakage, (c) gear wear, (d) tooth root crack, and (e) gear tooth missing.

To provide a comprehensive overview of the proposed method, the complete flowchart of the fault diagnosis framework is illustrated in Fig. 4. As shown in the figure, the framework systematically integrates four key stages: (1) Data Acquisition: Vibration and speed signals are synchronously collected from the DDS test bench. (2) Synergistic Closed-Loop Preprocessing: A closed-loop mechanism is designed where the smoothed speed signal dynamically adjusts the band-pass filter parameters to enhance weak fault impulses. (3) Feature Fusion & Reduction: Multi-domain features (time, frequency, and wavelet) are fused and then projected into a discriminative subspace via LDA. (4) Ensemble Decision: The final diagnosis is determined by an accuracy-weighted voting strategy integrating SVM, KNN, and Random Forest classifiers.

Fig. 4. The overall flowchart of the proposed multi-source information fusion and ensemble learning diagnosis framework.
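The ensemble decision in stage (4) can be illustrated with a minimal sketch. It assumes that each base classifier's vote is weighted by its validation accuracy normalized to sum to one; the exact weighting rule is not spelled out in this section, and the accuracies, labels, and function names below are hypothetical.

```python
from collections import defaultdict


def accuracy_weights(val_accuracy):
    """Normalize validation accuracies so that the weights sum to 1 (assumed rule)."""
    total = sum(val_accuracy.values())
    return {name: acc / total for name, acc in val_accuracy.items()}


def weighted_vote(predictions, weights):
    """Return the label with the largest accumulated weight.

    predictions maps a model name to its predicted label for one sample.
    Ties are broken by label sort order so the result is deterministic.
    """
    scores = defaultdict(float)
    for name, label in predictions.items():
        scores[label] += weights[name]
    return max(sorted(scores), key=lambda label: scores[label])


# Hypothetical validation accuracies for the three base classifiers.
val_accuracy = {"svm": 0.97, "knn": 0.93, "rf": 0.95}
weights = accuracy_weights(val_accuracy)

# KNN and Random Forest agree, so their combined weight outvotes the SVM.
sample_predictions = {"svm": "wear", "knn": "crack", "rf": "crack"}
decision = weighted_vote(sample_predictions, weights)
```

When all three models disagree, this rule falls back to the single most accurate model, which is one reason to prefer accuracy weighting over plain majority voting.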

Selection of fusion parameters

A wind turbine gearbox produces many measurable parameters that are related to one another. Strongly correlated parameters carry similar fault information, so one of them is sufficient to represent the others, which avoids the slow computation and excessive energy consumption caused by redundant information. The k-Nearest Neighbor (k-NN) algorithm is therefore adopted to calculate the correlation between the X-direction vibration signals, the Y-direction vibration signals, and the rotational speed signals. The resulting correlation matrix is shown in Fig. 5. The correlation coefficient between the X-direction and Y-direction vibration signals is 0.594, and the correlation coefficients between each vibration direction and the rotational speed signal are both 0.584. The three types of parameters are thus moderately correlated, with all coefficients below 0.6: they share some fault information, but each also carries distinct fault information, so all three can be applied to fault diagnosis.

Fig. 5. Heatmap of correlation among different signals.

Signal preprocessing

In gear fault diagnosis of wind turbine gearboxes, signal standardization is a key step in the data preprocessing stage. Its core goal is to eliminate differences in dimensions, value ranges, and distribution characteristics among multi-source signals (e.g., vibration and rotational speed), thereby underpinning the robustness of subsequent feature extraction and classification models [26]. The dataset contains two types of signals, vibration signals and rotational speed signals, each with distinct characteristics. Raw vibration signals include environmental noise and measurement noise and therefore require preprocessing to enhance fault features. This study designs a three-step process of “adaptive filtering - impact extraction - anomaly correction,” with the core objective of preserving the periodic impacts and frequency correlation characteristics of gear faults.

Monitoring signals of the gearbox in the dataset (e.g., X/Y-direction vibration acceleration and rotational speed) typically differ significantly in dimensions and scales: vibration signals are mostly measured in “m/s²” or “g,” with values fluctuating within [-10, 10], whereas rotational speed signals are measured in “rpm,” with values varying within [1000, 2000]. Such differences lead to three types of problems: (1) Imbalanced feature weights: In distance metric-based classification models, features with large value ranges dominate distance calculations and mask features with small value ranges but critical physical significance, so that the model's learning of fault-sensitive features is biased. (2) Inefficient model convergence: During parameter optimization, unstandardized features give the objective function surface a “stretched” shape, increasing the oscillation of gradient descent and prolonging model training time. (3) Lack of cross-operating condition comparability: Different operating conditions cause non-linear scaling of signal amplitudes for the same fault, making it difficult for unstandardized signals to establish a unified fault discrimination benchmark. Therefore, this study adopts Z-score standardization (zero-mean standardization) to process the signals, as given in Eq. (1).

$$z = \frac{x - \mu}{\sigma} \tag{1}$$

where $x$ denotes the original data value, $\mu$ denotes the mean of the data, and $\sigma$ denotes the standard deviation of the data. It is important to note that $\mu$ and $\sigma$ are calculated exclusively from the training set and then applied to the test set. This strict separation prevents data leakage and ensures the validity of the evaluation.
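The train-only fitting of the standardization statistics can be sketched as follows. This is a minimal illustration, assuming the population standard deviation is used (the source does not say whether the population or sample form applies); the speed values and function names are made up for the example.

```python
import statistics


def fit_zscore(train_values):
    """Estimate the mean and standard deviation from the training set only."""
    mu = statistics.fmean(train_values)
    sigma = statistics.pstdev(train_values)  # population form: an assumption
    return mu, sigma


def apply_zscore(values, mu, sigma):
    """Standardize values with statistics fitted elsewhere (no refitting)."""
    return [(v - mu) / sigma for v in values]


# Hypothetical rotational speed samples in rpm.
train_rpm = [1000.0, 1200.0, 1400.0, 1600.0, 1800.0]
test_rpm = [1500.0, 2000.0]

mu, sigma = fit_zscore(train_rpm)
train_z = apply_zscore(train_rpm, mu, sigma)
test_z = apply_zscore(test_rpm, mu, sigma)  # reuses the training statistics
```

Because the test samples are scaled with the training mean and standard deviation, the standardized test set need not have zero mean or unit variance, which is expected and not a sign of error.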

Standardization brings three benefits. First, it eliminates dimensional influences and treats features fairly: vibration signals and rotational speed signals with different physical meanings are mapped to a unified scale with a mean of 0 and a standard deviation of 1, so that each feature carries equal weight during model training and fault-sensitive features such as vibration kurtosis and mesh frequency amplitude are learned effectively. Second, it retains the data distribution characteristics and preserves physical meaning: compared with methods such as Min-Max normalization, Z-score standardization does not compress the data into a fixed range; instead, it retains the probability distribution characteristics of the original signals through centering and scaling, which is crucial for feature extraction that relies on signal distribution patterns. Finally, the standardized feature space reduces the interference of extreme values on the model, making scale-sensitive algorithms such as SVM more likely to converge to the global optimal solution while enhancing the model’s adaptability under different operating conditions.

To validate the choice of standardization method, a comparative experiment was conducted under identical experimental conditions. The model using Z-score standardization achieved a final accuracy of 99.20%, whereas Min-Max normalization yielded only 91.60%. This gap of 7.60 percentage points arises because Min-Max normalization is highly sensitive to outliers (impulsive noise) in vibration signals: a single extreme value compresses the feature range of normal signals and reduces discriminability. In contrast, Z-score standardization is more robust to such outliers, ensuring stable model convergence.

Band-pass filtering

In the fault diagnosis of wind turbine gearboxes, raw vibration signals are often corrupted by complex operating conditions, so that fault features are submerged. Signal enhancement highlights fault-sensitive components and suppresses noise and irrelevant interference through targeted filtering, demodulation, and smoothing, providing a high-quality data foundation for feature extraction. This paper first applies band-pass filtering because gearbox faults excite core feature components in the vibration signal, such as the mesh frequency, its harmonics, and sidebands. The goal of band-pass filtering is to retain the signal in this frequency band and filter out low-frequency and high-frequency interference, thereby focusing on the feature frequency band. The filtering range must therefore be adjusted dynamically based on the physical parameters of the gear, avoiding the feature loss or residual noise caused by fixed filtering. The mesh frequency is directly related to the rotational frequency of the gear, as given in Eqs. (2) and (3).

$$f_r = \frac{n}{60} \tag{2}$$

$$f_m = Z f_r \tag{3}$$

where $f_r$ denotes the rotational frequency (unit: Hz), $n$ denotes the rotational speed (unit: rpm), $f_m$ denotes the mesh frequency (unit: Hz), and $Z$ denotes the number of gear teeth.

After obtaining the mesh frequency, the filtering frequency band is set as given in Eqs. (4) and (5), so that it covers the mesh frequency and its 2nd and 3rd harmonics (characteristic of the fault development stage) while avoiding spectral aliasing.

graphic file with name d33e508.gif 4
graphic file with name d33e512.gif 5

where $f_{\mathrm{low}}$ denotes the lower limit, $f_{\mathrm{high}}$ denotes the upper limit, $f_m$ denotes the mesh frequency, $f_s$ denotes the sampling frequency, and $f_{\mathrm{Nyq}} = f_s/2$ denotes the Nyquist frequency.

An 8th-order Butterworth band-pass filter is adopted, which avoids phase distortion through zero-phase filtering and ensures that the temporal characteristics of impact signals are not distorted. This method retains the frequency band where fault features are most concentrated and effectively suppresses vibration coupling interference from multiple components.
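The speed-driven band selection can be sketched in a few lines. The conversion from rpm to mesh frequency follows Eqs. (2) and (3); the band-edge coefficients (`low_factor`, `high_factor`, `nyquist_margin`) are hypothetical placeholders chosen only so that the band spans the mesh frequency and its 2nd and 3rd harmonics while staying below the Nyquist frequency, and they should not be read as the paper's values in Eqs. (4) and (5). The tooth count of 28 is borrowed from Table 1 purely for illustration.

```python
def rotational_frequency(rpm):
    """Eq. (2): rotational frequency in Hz from rotational speed in rpm."""
    return rpm / 60.0


def mesh_frequency(rpm, teeth):
    """Eq. (3): mesh frequency in Hz for a gear with the given tooth count."""
    return teeth * rotational_frequency(rpm)


def band_edges(f_mesh, fs, low_factor=0.5, high_factor=3.5, nyquist_margin=0.9):
    """Return (f_low, f_high) around the mesh frequency and its harmonics.

    All three factors are illustrative assumptions, not the paper's values.
    The upper edge is capped below the Nyquist frequency to avoid aliasing.
    """
    nyquist = fs / 2.0
    f_low = low_factor * f_mesh
    f_high = min(high_factor * f_mesh, nyquist_margin * nyquist)
    if not 0.0 < f_low < f_high:
        raise ValueError("invalid pass band for this mesh frequency")
    return f_low, f_high


FS = 48_000.0  # sampling frequency stated in the data acquisition description
bands = {}
for rpm in (1200.0, 1500.0, 1800.0):
    bands[rpm] = band_edges(mesh_frequency(rpm, teeth=28), FS)
```

In practice the returned edges would be passed to a filter design routine. With SciPy, for example, `scipy.signal.butter(..., btype="bandpass", output="sos", fs=FS)` followed by `scipy.signal.sosfiltfilt` gives a zero-phase Butterworth band-pass; note that forward-backward filtering doubles the effective filter order.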

Hilbert envelope analysis

After preprocessing, fault features are extracted. Since gear faults generate periodic impact vibrations, the measured signals are amplitude-modulated: high-frequency carriers are modulated by low-frequency fault components. Hilbert envelope analysis applies the Hilbert transform to the filtered signal to separate the low-frequency envelope, which directly reflects how the impacts vary over time and makes the fault impacts visible. First, the analytic signal of the filtered signal is constructed, as shown in Eq. (6):

$$z(t) = x(t) + j\,\hat{x}(t) \tag{6}$$

where $\hat{x}(t)$ denotes the Hilbert transform of $x(t)$ and satisfies $\hat{x}(t) = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{x(\tau)}{t - \tau}\, d\tau$. The amplitude of the analytic signal then serves as the envelope signal, $a(t) = |z(t)| = \sqrt{x^2(t) + \hat{x}^2(t)}$. This envelope signal directly reflects the temporal variation of impact intensity. Further performing the Fourier transform on the envelope signal enables clear identification of fault characteristic frequencies in the envelope spectrum, thereby enhancing the spectral discriminability of the signal.
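The envelope demodulation described above can be sketched with a frequency-domain construction of the analytic signal: negative-frequency bins are zeroed and positive-frequency bins doubled, which is the standard discrete equivalent of Eq. (6). The example uses a naive DFT and a short synthetic amplitude-modulated signal so that it stays dependency-free; the bin numbers and modulation depth are arbitrary test values, not measurements.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform, adequate for short demos."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n) for m in range(n))
        out.append(acc / n if inverse else acc)
    return out


def analytic_signal(x):
    """Build z = x + j*H{x} by keeping DC/Nyquist and doubling positive bins."""
    n = len(x)
    gain = [0.0] * n
    gain[0] = 1.0
    half = n // 2
    if n % 2 == 0:
        gain[half] = 1.0
        upper = half
    else:
        upper = half + 1
    for k in range(1, upper):
        gain[k] = 2.0
    return dft([s * g for s, g in zip(dft(x), gain)], inverse=True)


def envelope(x):
    """Envelope signal: magnitude of the analytic signal."""
    return [abs(z) for z in analytic_signal(x)]


# Synthetic AM signal: carrier at bin 16 modulated by a "fault" component at bin 2.
N, CARRIER_BIN, FAULT_BIN = 64, 16, 2
true_env = [1.0 + 0.5 * math.cos(2 * math.pi * FAULT_BIN * i / N) for i in range(N)]
signal = [e * math.cos(2 * math.pi * CARRIER_BIN * i / N) for i, e in zip(range(N), true_env)]

env = envelope(signal)
mean_env = sum(env) / N
env_spectrum = dft([e - mean_env for e in env])
peak_bin = max(range(1, N // 2), key=lambda k: abs(env_spectrum[k]))
```

The recovered envelope matches the modulating waveform, and the envelope spectrum peaks at the modulation bin rather than at the carrier, which is the property the method relies on to expose fault characteristic frequencies.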

Rotational speed signal smoothing

The rotational speed of wind turbines is non-stationary owing to wind speed variations, while the gear mesh frequency $f_m$ is directly related to the rotational speed ($f_m = Z n / 60$). Instantaneous jumps or high-frequency fluctuations in the raw rotational speed signal cause deviations in the calculated $f_m$, which in turn shift the band-pass filtering band and make the identification of characteristic frequencies in envelope analysis inaccurate. The core function of rotational speed signal smoothing is therefore to eliminate high-frequency interference while retaining the trend of the rotational speed, providing a stable benchmark for characteristic frequency calculation and ensuring the effectiveness of the two preceding enhancement steps. The operation is divided into two steps. (1) Outlier cleaning: the $3\sigma$ criterion is adopted. A data point satisfying $|n_i - \bar{n}| > 3\sigma_n$ is identified as an outlier and replaced with the rotational speed mean, where $n_i$ denotes the actual rotational speed, $\bar{n}$ denotes the rotational speed mean, and $\sigma_n$ denotes the standard deviation of the rotational speed. (2) Gaussian smoothing filtering: high-frequency fluctuations are suppressed through Gaussian convolution with a sliding window, as shown in Eqs. (7) and (8).

$$\tilde{n}(t) = \sum_{k=-L}^{L} G(k)\, n_c(t + k) \tag{7}$$

$$G(k) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{k^2}{2\sigma^2}\right) \tag{8}$$

where $\tilde{n}(t)$ denotes the smoothed rotational speed signal, $n_c(t + k)$ denotes the cleaned rotational speed value at the $(t + k)$-th moment within the window, $L = 50$ is the half-width of the window, selected to balance noise suppression and tracking latency, $G(k)$ denotes the Gaussian kernel, $k$ denotes the position index within the sliding window, and $\sigma = 10$ is the standard deviation, set to $L/5$ to prevent truncation artifacts. The standard deviation determines the “width” of the kernel function: the larger $\sigma$, the flatter the kernel function and the stronger the smoothing effect. After processing, the fitting error between the rotational speed signal and the actual operating trend is reduced, ensuring that the band-pass filter always locks onto the real characteristic frequency band and improving the accuracy of fault frequency identification in the envelope spectrum.
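The two smoothing steps can be sketched as below, using the stated half-width of 50 and standard deviation of 10. Two details are assumptions made for the sketch: the outlier statistics are computed over the whole record with the population standard deviation, and the kernel weights are renormalized over the samples actually available at the record edges (the source does not describe its edge handling). The test signal is synthetic.

```python
import math
import statistics


def clean_outliers(speed):
    """3-sigma rule: replace points far from the mean with the mean."""
    mean = statistics.fmean(speed)
    std = statistics.pstdev(speed)  # population form: an assumption
    return [mean if abs(v - mean) > 3.0 * std else v for v in speed]


def gaussian_kernel(half_width=50, sigma=10.0):
    """Unnormalized Gaussian weights for k = -half_width .. half_width."""
    return [math.exp(-(k * k) / (2.0 * sigma * sigma))
            for k in range(-half_width, half_width + 1)]


def gaussian_smooth(speed, half_width=50, sigma=10.0):
    """Sliding-window Gaussian smoothing with edge renormalization (assumed)."""
    kernel = gaussian_kernel(half_width, sigma)
    n = len(speed)
    smoothed = []
    for t in range(n):
        acc = 0.0
        norm = 0.0
        for j, w in enumerate(kernel):
            idx = t + j - half_width
            if 0 <= idx < n:  # skip window positions outside the record
                acc += w * speed[idx]
                norm += w
        smoothed.append(acc / norm)
    return smoothed


# Synthetic record: constant 1500 rpm with one spurious encoder jump.
raw_speed = [1500.0] * 500
raw_speed[250] = 3000.0

cleaned = clean_outliers(raw_speed)
smoothed = gaussian_smooth(cleaned)
```

Replacing the outlier with the global mean follows the rule stated in the text; for strongly varying speed profiles a local median would track the trend more closely, but that would be a departure from the described procedure.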

Band-pass filtering, Hilbert envelope analysis, and rotational speed signal smoothing are not isolated operations but form a synergistic “input-processing-feedback” closed loop, as shown in Fig. 6. Specifically, rotational speed signal smoothing provides a stable benchmark for characteristic frequencies to band-pass filtering, ensuring the filtering frequency band matches the actual fault features; band-pass filtering, in turn, eliminates noise interference for Hilbert envelope analysis, preventing the envelope signal from being contaminated by high-frequency noise; ultimately, Hilbert envelope analysis converts the filtered signal into directly interpretable fault impact features, achieving accurate mapping from raw signals to fault modes.

Fig. 6. Synergistic closed-loop diagram.

Fault diagnosis of wind turbine gearboxes based on feature parameter fusion

Fault diagnosis based on feature parameters of vibration signals and rotational speed signals

In the fault diagnosis of wind turbine gearboxes, vibration signals are the information carriers richest in fault features. Feature parameters extracted from vibration signals can be divided into time-domain features and frequency-domain features, covering the amplitude characteristics, fluctuation patterns, energy distribution, and multi-directional correlation characteristics of the signals. In this paper, 10 widely used time-domain features (mean, standard deviation, root mean square (RMS), peak value, peak-to-peak value, absolute mean, kurtosis, impulse factor, form factor, crest factor) and 4 frequency-domain features (total frequency-domain energy, centroid frequency, root mean square frequency, frequency bandwidth) are selected; their calculation formulas are shown in Table 2.

Table 2. Formulas of vibration feature parameters.

Name Formula
Mean Inline graphic
Standard deviation Inline graphic
Root mean square (RMS) Inline graphic
Peak value Inline graphic
Peak-to-peak value Inline graphic
Absolute mean Inline graphic
Kurtosis Inline graphic
Impulse factor Inline graphic
Form factor Inline graphic
Crest factor Inline graphic
Total frequency-domain energy Inline graphic
Centroid frequency Inline graphic
Root mean square frequency Inline graphic
Frequency bandwidth Inline graphic

In the frequency-domain formulas, $f_s$ denotes the sampling frequency, $f_{\mathrm{Nyq}}$ denotes the Nyquist frequency, and $f_k$ denotes the discrete frequency point.
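For orientation, the 14 features named in Table 2 can be computed as in the sketch below. It uses common textbook definitions (population standard deviation, non-excess kurtosis, power-weighted spectral moments over the one-sided spectrum up to the Nyquist frequency); these are assumptions and may differ in normalization from the formulas in the table. The test tone and sampling rate are arbitrary.

```python
import cmath
import math


def time_domain_features(x):
    """Ten time-domain statistics, using common textbook definitions."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    rms = math.sqrt(sum(v * v for v in x) / n)
    peak = max(abs(v) for v in x)
    abs_mean = sum(abs(v) for v in x) / n
    return {
        "mean": mean,
        "std": std,
        "rms": rms,
        "peak": peak,
        "peak_to_peak": max(x) - min(x),
        "abs_mean": abs_mean,
        "kurtosis": sum((v - mean) ** 4 for v in x) / (n * std ** 4),
        "impulse_factor": peak / abs_mean,
        "form_factor": rms / abs_mean,
        "crest_factor": peak / rms,
    }


def frequency_domain_features(x, fs):
    """Four spectral features from the one-sided power spectrum (naive DFT)."""
    n = len(x)
    freqs, power = [], []
    for k in range(n // 2 + 1):  # bins from 0 Hz up to the Nyquist frequency
        acc = sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
        freqs.append(k * fs / n)
        power.append(abs(acc) ** 2)
    total = sum(power)
    centroid = sum(f * p for f, p in zip(freqs, power)) / total
    rms_freq = math.sqrt(sum(f * f * p for f, p in zip(freqs, power)) / total)
    bandwidth = math.sqrt(sum((f - centroid) ** 2 * p for f, p in zip(freqs, power)) / total)
    return {
        "total_energy": total,
        "centroid_frequency": centroid,
        "rms_frequency": rms_freq,
        "bandwidth": bandwidth,
    }


# Arbitrary test tone: 80 Hz sine sampled at 1280 Hz for 128 samples.
FS_DEMO, N_DEMO = 1280.0, 128
tone = [math.sin(2 * math.pi * 8 * m / N_DEMO) for m in range(N_DEMO)]
features = {**time_domain_features(tone), **frequency_domain_features(tone, FS_DEMO)}
```

For a pure tone the crest factor is close to 1.414 and the kurtosis is 1.5; impulsive fault signatures push both values upward, which is why these statistics are popular fault indicators.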

Features are also extracted from the rotational speed signals: mean, standard deviation, root mean square (RMS), peak-to-peak value, kurtosis, instantaneous rotational speed change rate, and maximum value of the change rate. Most of the formulas for these features are the same as those in Table 2. The dataset consists of 5 data types, with 200 samples for each type, and is divided into a training set and a test set at a ratio of 3:1. Subsequently, the Support Vector Machine (SVM) is used for fault diagnosis on the dataset. SVM is a classic supervised learning model proposed by Vapnik et al. [12] in the 1990s. Initially applied to binary classification tasks, it has since been extended to regression, anomaly detection, and other fields. Its core idea is to classify samples by finding the “optimal hyperplane”. It performs well on small-sample problems and in high-dimensional feature spaces, and it is widely used in pattern recognition in particular.

The core objective of SVM is to find a hyperplane in the feature space that completely separates samples of different classes and maximizes the distance between the hyperplane and the nearest samples of the two classes. Consider the input data $X = \{X_1, \ldots, X_N\}$ and the learning objective $y = \{y_1, \ldots, y_N\}$. Each sample in the input data comprises several features, which jointly construct the feature space, $X_i = [x_1, \ldots, x_n] \in \chi$, and the learning objective is a binary variable, $y \in \{+1, -1\}$, denoting the positive class and the negative class. If a hyperplane serving as the decision boundary exists in the feature space where the input data resides, which separates the learning objective into the positive class and the negative class, and the distance from any sample point to the hyperplane is greater than or equal to 1 [14], then the decision boundary is given by Eq. (9) and the point-to-hyperplane distance condition is given by Eq. (10).

$$w^{T} X + b = 0 \tag{9}$$

$$y_i \left( w^{T} X_i + b \right) \ge 1 \tag{10}$$

The parameters $w$ and $b$ denote the normal vector and the intercept of the hyperplane, respectively. Decision boundaries satisfying this condition essentially construct two parallel hyperplanes as margin boundaries to determine the classification of samples, as given by Eqs. (11) and (12).

$$w^{T} X_i + b \ge +1 \;\Rightarrow\; y_i = +1 \tag{11}$$

$$w^{T} X_i + b \le -1 \;\Rightarrow\; y_i = -1 \tag{12}$$

All samples above the upper margin boundary belong to the positive class, and those below the lower margin boundary belong to the negative class. The positive and negative class samples located on the margin boundaries are referred to as support vectors.
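The margin conditions can be checked numerically for a given hyperplane. The sketch below does not train an SVM; it assumes a hand-picked normal vector and intercept together with four toy samples, evaluates the constraint of Eq. (10), and picks out the samples lying exactly on the margin boundaries, which are the support vectors.

```python
import math


def decision_value(w, b, x):
    """Left-hand side of Eq. (9) evaluated at sample x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def functional_margins(w, b, samples, labels):
    """y_i * (w^T x_i + b); every value must be >= 1 for a valid separator."""
    return [y * decision_value(w, b, x) for x, y in zip(samples, labels)]


def support_vectors(w, b, samples, labels, tol=1e-9):
    """Samples whose functional margin equals 1, i.e. on a margin boundary."""
    margins = functional_margins(w, b, samples, labels)
    return [x for x, m in zip(samples, margins) if abs(m - 1.0) <= tol]


# Toy data separated by the vertical line x1 = 0 (hand-picked, not fitted).
w, b = [1.0, 0.0], 0.0
samples = [[1.0, 0.5], [2.0, -1.0], [-1.0, 0.3], [-3.0, 2.0]]
labels = [1, 1, -1, -1]

margins = functional_margins(w, b, samples, labels)
on_margin = support_vectors(w, b, samples, labels)
margin_width = 2.0 / math.sqrt(sum(wi * wi for wi in w))  # gap between Eqs. (11) and (12)
```

Maximizing `margin_width` is equivalent to minimizing the norm of `w` subject to Eq. (10), which is the optimization problem an SVM solver handles.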

To comprehensively characterize the fault status under variable speed conditions, this study constructs a high-dimensional feature vector. In addition to standard time-domain and frequency-domain statistics, Wavelet Packet Energy and Gear-specific Kinematic Features are explicitly defined as follows to capture non-stationary and physical fault characteristics.

(1) Wavelet Packet Energy Features. The vibration signal $x(t)$ is decomposed into $2^J$ sub-bands using Wavelet Packet Decomposition (WPD) at level $J$. The energy of the $i$-th terminal node at depth $J$, denoted as $E_{J,i}$, serves as a feature to quantify the energy distribution changes caused by faults. The calculation formula is:

$$E_{J,i} = \sum_{k=1}^{N} \left| d_{J,i}(k) \right|^2$$

where $d_{J,i}(k)$ represents the wavelet packet coefficients of the $i$-th node, and $N$ is the signal length. In this study, we select $J = 3$, resulting in 8 energy features ($E_{3,0}$ to $E_{3,7}$).
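A three-level wavelet packet energy computation can be sketched as follows. The section does not name the mother wavelet, so the orthonormal Haar wavelet is used here purely as a stand-in, and the nodes are returned in natural (filter-bank) order rather than sorted by frequency. The test signal is synthetic.

```python
import math


def haar_step(x):
    """One orthonormal Haar analysis step: (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(len(x) // 2)
    approx = [(x[2 * i] + x[2 * i + 1]) * s for i in pairs]
    detail = [(x[2 * i] - x[2 * i + 1]) * s for i in pairs]
    return approx, detail


def wavelet_packet_nodes(x, level):
    """Full packet tree: split every node at every level (2**level leaves)."""
    nodes = [list(x)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def node_energies(x, level=3):
    """Energy of each terminal node: sum of squared packet coefficients."""
    return [sum(c * c for c in node) for node in wavelet_packet_nodes(x, level)]


# Synthetic 1024-point segment: low-frequency tone plus a weaker high-frequency tone.
segment = [math.sin(2 * math.pi * 5 * i / 1024) + 0.3 * math.sin(2 * math.pi * 300 * i / 1024)
           for i in range(1024)]
energies = node_energies(segment, level=3)
signal_energy = sum(v * v for v in segment)
flat_energies = node_energies([1.0] * 1024, level=3)
```

Because the Haar transform is orthonormal, the eight node energies add up to the energy of the original segment, so dividing each by the total gives a scale-free energy distribution if relative rather than absolute energies are preferred.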

(2) Gear-specific Kinematic Features. To capture the modulation effects typical of gear failures, we define the gear-specific meshing feature $F_{\mathrm{mesh}}$ based on the Hilbert envelope spectrum. It is calculated as the sum of spectral amplitudes at the fundamental mesh frequency $f_m$ and its first $M$ harmonics:

$$F_{\mathrm{mesh}} = \sum_{m=1}^{M} A(m f_m)$$

where $A(f)$ denotes the amplitude at frequency $f$ in the envelope spectrum, and $f_m$ is the theoretical mesh frequency derived from the real-time rotational speed. We set $M$ to cover the primary harmonics.
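Reading the meshing feature off a discrete envelope spectrum requires mapping each harmonic to a frequency bin. The sketch below assumes nearest-bin lookup and sums the amplitudes at the mesh frequency and its multiples; the number of harmonics, the mesh frequency, and the spectrum values are hypothetical, and the frequency resolution corresponds to 1024-point segments sampled at 48 kHz.

```python
def mesh_feature(amplitudes, freq_resolution, f_mesh, n_harmonics):
    """Sum envelope-spectrum amplitudes at f_mesh, 2*f_mesh, ..., M*f_mesh.

    amplitudes[k] is the spectral amplitude at k * freq_resolution Hz.
    Each harmonic is mapped to its nearest bin (an assumption); harmonics
    beyond the end of the spectrum are ignored.
    """
    total = 0.0
    for m in range(1, n_harmonics + 1):
        k = round(m * f_mesh / freq_resolution)
        if k < len(amplitudes):
            total += amplitudes[k]
    return total


# One-sided spectrum of a 1024-point segment at 48 kHz: 513 bins, 46.875 Hz apart.
RESOLUTION = 48_000.0 / 1024
spectrum = [0.0] * 513
spectrum[15] = 2.0   # near 700 Hz  (hypothetical mesh frequency)
spectrum[30] = 1.0   # near 1400 Hz (2nd multiple)
spectrum[45] = 0.5   # near 2100 Hz (3rd multiple)
spectrum[20] = 9.0   # unrelated component that must not be counted

f_gear = mesh_feature(spectrum, RESOLUTION, f_mesh=700.0, n_harmonics=3)
```

Because the mesh frequency is recomputed from the smoothed speed for every segment, the bins that are summed move with the operating condition, which keeps the feature comparable across the eight speed settings.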

(3) Dynamic Feature Screening Algorithm. To eliminate redundant and irrelevant features from the constructed high-dimensional feature set $F$, a Dynamic Feature Screening strategy based on the Pearson Correlation Coefficient (PCC) is implemented. The screening procedure consists of two steps. Step 1 (Relevance Filtering): Calculate the correlation coefficient $r_{xy}$ between each feature $x$ and the fault class label $y$. Features with weak relevance ($|r_{xy}| < \theta_1$) are discarded. Step 2 (Redundancy Removal): For the remaining features, calculate the pairwise correlation $r_{ij}$ between feature $i$ and feature $j$. If $|r_{ij}| > \theta_2$ (indicating high redundancy), the feature with the lower relevance to the label is removed. In this study, the thresholds $\theta_1$ and $\theta_2$ are set empirically. This process dynamically selects the most discriminative feature subset for input into the fusion model.
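The two-step screening can be sketched as a small greedy procedure. The threshold values (0.2 and 0.9), feature names, and data below are hypothetical and serve only to exercise both steps; processing candidates in descending order of relevance ensures that, within a redundant pair, the feature with lower relevance to the label is the one removed.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def screen_features(features, labels, min_relevance, max_redundancy):
    """Step 1: drop weakly relevant features. Step 2: drop redundant ones."""
    relevance = {name: abs(pearson(col, labels)) for name, col in features.items()}
    candidates = [name for name in features if relevance[name] >= min_relevance]
    candidates.sort(key=lambda name: relevance[name], reverse=True)
    selected = []
    for name in candidates:
        redundant = any(abs(pearson(features[name], features[kept])) > max_redundancy
                        for kept in selected)
        if not redundant:
            selected.append(name)
    return selected


# Hypothetical feature columns for six samples from three classes.
labels = [0, 0, 1, 1, 2, 2]
features = {
    "rms": [1.0, 1.0, 2.0, 2.0, 3.0, 3.0],        # strongly relevant
    "peak": [2.1, 1.9, 4.2, 3.8, 6.1, 5.9],       # nearly a copy of rms
    "kurtosis": [3.0, 3.4, 3.1, 4.0, 3.9, 5.0],   # moderately relevant
    "noise": [0.5, -0.5, 0.5, -0.5, 0.5, -0.5],   # unrelated to the label
}
selected = screen_features(features, labels, min_relevance=0.2, max_redundancy=0.9)
```

Correlating features with integer class codes treats the labels as ordinal, which is a simplification; the result can depend on how the five health states are numbered.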

Comparison of basic experiments

To ensure the reliability of the experimental results and the credibility of the comparative conclusions, the experiments fix the random seeds, unify the feature dimensions, and control other settings. First, the signals are segmented into samples of 1024 sampling points. Comparative tests indicated that shorter lengths (e.g., 512) resulted in insufficient spectral resolution, while longer lengths (e.g., 2048) introduced non-stationary blurring due to speed fluctuations. Thus, 1024 sampling points satisfy the Nyquist sampling theorem for frequency-domain analysis while balancing computational efficiency and the completeness of feature representation. Non-overlapping segmentation is adopted to avoid information redundancy between samples and ensure the independence of each sample. Second, a combination of time-domain and frequency-domain features is employed. Time-domain features reflect the amplitude statistics of the signals, while frequency-domain features reflect their energy distribution; together they characterize fault modes comprehensively. The time-domain feature set includes mean, standard deviation, peak-to-peak value, kurtosis, skewness, etc., and the frequency-domain feature set includes centroid frequency, root mean square frequency, etc., as detailed in Table 2.

Finally, for the model and data division, the SVM model is configured as follows: the kernel function is the radial basis function (RBF), which can handle nonlinear classification problems, and the penalty coefficient is set to 10. To determine the optimal hyperparameters (penalty factor C and kernel scale σ) rigorously, a grid search was performed. Figure 7 illustrates the accuracy heatmap obtained during the optimization process. The highest accuracy is achieved when C = 10 and σ = 10, and this configuration was selected to balance the model’s fitting ability and generalization ability. The training set accounts for 75% of the total data and the test set for 25%, following the 3:1 division ratio commonly used in machine learning. This ratio ensures that the training set is sufficient to support model training, while the test set can effectively evaluate generalization performance. The dataset includes five categories, covering common fault types and the normal condition of rotating machinery. After the core parameters are configured, fault classification is performed on the signals of the different categories.
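The segmentation and data division can be sketched as follows. At the 48 kHz sampling frequency, a 1024-point segment spans about 21.3 ms and gives a frequency resolution of 48000/1024 = 46.875 Hz. The sketch assumes the 3:1 split is stratified by class with a fixed seed; the source states the ratio and the use of fixed seeds but not whether the split is stratified, and the helper names are illustrative.

```python
import random


def segment(signal, length=1024):
    """Cut a signal into non-overlapping segments; a short tail is dropped."""
    n_segments = len(signal) // length
    return [signal[i * length:(i + 1) * length] for i in range(n_segments)]


def stratified_split(labels, train_fraction=0.75, seed=0):
    """Return (train_indices, test_indices) with the same ratio in every class."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return train_idx, test_idx


# Placeholder record of 5000 points: four full segments, 904 points discarded.
segments = segment(list(range(5000)))

# Five health states with 200 samples each, as in the dataset description.
labels = [state for state in range(5) for _ in range(200)]
train_idx, test_idx = stratified_split(labels)
freq_resolution_hz = 48_000 / 1024
```

Stratifying keeps 150 training and 50 test samples per health state, so per-class accuracies in the confusion matrix are computed over equal-sized test groups.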

Fig. 7.

Hyperparameter optimization heatmap of the SVM model based on Grid Search.

First, for the X-direction vibration signal alone, the objective is to assess how well a single vibration direction can represent the faults. The data are divided into samples of 1024 points each. There are 5 categories with 200 samples per category, giving 1000 samples in total. A combined time-domain and frequency-domain feature set is adopted to ensure comprehensive and discriminative features, as detailed in Table 1. The features are then extracted to construct a global feature matrix: for the 200 samples of each category, the 1 × 14 feature vectors are stacked row-wise to form an intra-class feature matrix of dimension 200 × 14. The intra-class feature matrices of the 5 categories are then vertically concatenated to obtain the global feature matrix of the X-direction vibration signal, with a dimension of 1000 × 14 (1000 samples × 14 features). The dimension of this matrix is verified to rule out lost or anomalous samples. Next, the SVM model is trained on the training set. Finally, the test set is used to evaluate the trained model, and a confusion matrix is obtained as the final result.
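The matrix construction, dimension check, and 3:1 division described above might look like the sketch below. The random placeholder features, the function names, and the use of a per-class (stratified) split are assumptions made for illustration; the text does not state how the split was drawn.

```python
import numpy as np

N_CLASSES, N_PER_CLASS, N_FEATURES = 5, 200, 14  # sizes stated in the text


def build_global_matrix(class_matrices, expected_shape):
    """Vertically stack per-class feature matrices and build the label vector."""
    features = np.vstack(class_matrices)
    labels = np.concatenate(
        [np.full(len(m), k) for k, m in enumerate(class_matrices)]
    )
    # Dimension check that guards against lost or anomalous samples.
    if features.shape != expected_shape or not np.isfinite(features).all():
        raise ValueError(f"unexpected feature matrix: shape {features.shape}")
    return features, labels


def stratified_split(labels, train_fraction=0.75, seed=0):
    """Return train/test indices that keep the class proportions."""
    rng = np.random.default_rng(seed)  # fixed seed for repeatability
    train_idx, test_idx = [], []
    for k in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == k))
        cut = int(round(train_fraction * idx.size))
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return np.array(train_idx), np.array(test_idx)


rng = np.random.default_rng(0)
# Placeholder random features stand in for the real 200 x 14 class matrices.
class_matrices = [rng.normal(size=(N_PER_CLASS, N_FEATURES))
                  for _ in range(N_CLASSES)]
X, y = build_global_matrix(class_matrices, expected_shape=(1000, 14))
train_idx, test_idx = stratified_split(y)
print(X.shape, train_idx.size, test_idx.size)  # (1000, 14) 750 250
```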

Second, when a system malfunctions, vibration energy is transmitted through the structure to different measurement directions. A single X-direction vibration signal may miss key feature information because fault excitation is directional. The objective of fusing X- and Y-direction vibration features is to construct a more comprehensive fault representation by combining the vibration responses from different directions, thereby improving the classifier's ability to discriminate complex fault modes. The fusion adopts feature-level horizontal concatenation: for the X-direction and Y-direction vibration signals of the same sample, structurally identical time-domain and frequency-domain feature vectors are extracted and then concatenated horizontally, which expands the feature dimension and fuses the information. This scheme retains the integrity of the single-direction features and avoids the distortion that direct superposition at the signal level would cause, so that the features of each direction remain independently represented while acting together. The global joint feature matrix is constructed in the same way as before: for each of the 5 categories (200 samples per category), the 1 × 28 joint feature vectors are stacked row-wise to form an intra-class feature matrix of dimension 200 × 28, and the 5 intra-class matrices are vertically concatenated to obtain the global X + Y joint feature matrix of dimension 1000 × 28. The concatenation is complementary rather than redundant: the X- and Y-direction features characterize the fault response in different directions, so the fused representation carries more information than either direction alone (a "1 + 1 > 2" effect). The feature dimension grows from 14 to 28, which enriches the fault representation while remaining low enough to avoid the curse of dimensionality and keep model training efficient.
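As a small illustration of feature-level horizontal concatenation, the sketch below joins two per-direction feature matrices sample by sample. The random placeholder matrices and the helper name `fuse_features` are assumptions for the example only.

```python
import numpy as np


def fuse_features(*feature_blocks):
    """Horizontally concatenate per-source feature matrices sample by sample."""
    n_samples = {block.shape[0] for block in feature_blocks}
    if len(n_samples) != 1:
        raise ValueError("all sources must describe the same samples")
    return np.hstack(feature_blocks)


rng = np.random.default_rng(0)
x_features = rng.normal(size=(1000, 14))  # placeholder X-direction features
y_features = rng.normal(size=(1000, 14))  # placeholder Y-direction features
joint = fuse_features(x_features, y_features)
print(joint.shape)  # (1000, 28)
```

Because each row keeps the X-direction columns followed by the Y-direction columns, the single-direction features remain recoverable from the joint matrix.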

Finally, when processing rotational speed signals, the adopted method is basically consistent with the aforementioned one: the feature matrix is concatenated and then input into the model for training.

In this paper, three feature sets derived from different signals are considered. The extracted feature parameters are classified using the SVM, and the parameter values are determined based on the classification accuracy on the training set. Specifically, the features extracted from the vibration signals and the rotational speed signals are combined into feature sets, which are then input into the SVM for fault diagnosis. The fault recognition accuracy of the SVM is 80.4% with single-direction vibration features, 89.6% with dual-direction vibration features, and 92.8% with rotational speed features. The corresponding confusion matrices are shown in Fig. 8. The results show that single-direction vibration features allow the SVM to recognize some faults accurately, but other faults cannot be identified reliably from these features alone, which leads to a relatively low overall recognition rate. Fusing vibration information from the two directions improves the accuracy by 9.2 percentage points, which indicates that multi-source information can enhance the accuracy of fault diagnosis.
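For reference, the confusion matrix and the overall accuracy reported above can be computed from true and predicted labels as in the following sketch. The helper names and the toy labels are illustrative assumptions, not data from the study.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns the predicted class."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(matrix, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return matrix


def overall_accuracy(matrix):
    """Fraction of samples that lie on the diagonal."""
    return np.trace(matrix) / matrix.sum()


def per_class_recall(matrix):
    """Diagonal divided by row sums; assumes every class occurs in y_true."""
    return np.diag(matrix) / matrix.sum(axis=1)


# Toy labels for three classes, purely for illustration.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
print(cm)
print(overall_accuracy(cm), per_class_recall(cm))
```

Per-class recall makes it visible which fault types drag the overall rate down, which the single accuracy figure hides.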

Fig. 8.

Comparison of confusion matrices for three feature sets.

Fault diagnosis based on feature fusion

The results above show that the SVM achieves lower fault recognition accuracy with features from a single signal source than with multi-source features. The multi-source features are therefore fused, and the fused features are reduced in dimensionality before classification. Three dimensionality reduction algorithms are compared in this paper: Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and t-distributed Stochastic Neighbor Embedding (t-SNE).

As shown in Fig. 9, the silhouette coefficient is 0.193 for PCA, 0.791 for LDA, and 0.541 for t-SNE, so LDA achieves the best dimensionality reduction effect among the three. After PCA, the features remain mixed, with unclear class boundaries and overlapping fault types. With t-SNE, a preliminary separation of the fault features is achieved; however, the wear fault, the healthy state, and the tooth-missing fault are clearly merged together, so the separation is unsatisfactory.
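The silhouette coefficient used for this comparison can be computed as in the sketch below, which uses Euclidean distances on the reduced features. Whether the paper evaluated it on the 2D embeddings or with a different distance is not stated, so that choice, the function name, and the toy points are assumptions.

```python
import numpy as np


def silhouette_score(points, labels):
    """Mean silhouette coefficient using Euclidean distances."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distance matrix, shape (n, n).
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    classes = np.unique(labels)
    scores = np.zeros(len(points))
    for i in range(len(points)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            continue  # convention: a singleton cluster scores 0
        # Mean distance to the other members of the same cluster.
        a = dist[i, same].sum() / (n_same - 1)
        # Smallest mean distance to any other cluster.
        b = min(dist[i, labels == c].mean() for c in classes if c != labels[i])
        denom = max(a, b)
        scores[i] = (b - a) / denom if denom > 0 else 0.0
    return scores.mean()


# Two compact, well-separated toy clusters.
toy_points = [[0, 0], [0, 1], [10, 0], [10, 1]]
good = silhouette_score(toy_points, [0, 0, 1, 1])
bad = silhouette_score(toy_points, [0, 1, 0, 1])
print(good, bad)
```

Values near 1 indicate compact, well-separated classes, while values near 0 or below indicate overlapping classes, which is how the 0.791 (LDA) and 0.193 (PCA) figures should be read.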

Fig. 9.

Dimensionality reduction effects of three dimensionality reduction methods.

In contrast, when LDA is applied to the fused rotational speed and vibration features, faults of the same type are concentrated, as shown in Fig. 9. Although some visual proximity remains between the Healthy, Wear, and Tooth-Missing classes in the 2D projection, the subsequent quantitative evaluation (Fig. 10) confirms that their decision boundaries are sufficiently distinct for accurate classification.

Fig. 10.

Sample distribution after LDA dimensionality reduction.

While Principal Component Analysis (PCA) and t-distributed Stochastic Neighbor Embedding (t-SNE) are widely utilized for feature reduction, their unsupervised nature limits their efficacy in fault diagnosis scenarios where class labels are available. PCA maximizes the global variance of the projected data, aiming to preserve signal energy; however, the directions of maximum variance often coincide with high-amplitude noise rather than fault-induced variations, leading to suboptimal separability. Similarly, t-SNE focuses on preserving local manifold structures for visualization, often distorting global cluster distances which complicates the definition of decision boundaries.

In contrast, Linear Discriminant Analysis (LDA) is a supervised technique that explicitly designs a projection matrix $W$ to maximize the ratio of the between-class scatter matrix $S_b$ to the within-class scatter matrix $S_w$. By maximizing the objective function $J(W) = \frac{|W^{T} S_b W|}{|W^{T} S_w W|}$, LDA clusters samples of the same fault type tightly while pushing different fault clusters apart. This mechanism ensures that the reduced subspace is optimized specifically for classification discriminability rather than mere information reconstruction, which explains the superior separation observed in Fig. 9 compared with the overlapping clusters yielded by PCA and t-SNE.
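To make the scatter-ratio criterion concrete, the following sketch builds the within-class and between-class scatter matrices and takes the leading eigenvectors of the product of the (pseudo-)inverse within-class scatter and the between-class scatter as the projection. It is a textbook formulation under stated assumptions (synthetic data, the function name `lda_fit`, and a pseudo-inverse for numerical safety), not the authors' code.

```python
import numpy as np


def lda_fit(X, y, n_components):
    """Return a projection matrix W for Fisher's discriminant criterion."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    n_features = X.shape[1]
    S_w = np.zeros((n_features, n_features))  # within-class scatter
    S_b = np.zeros((n_features, n_features))  # between-class scatter
    for k in np.unique(y):
        X_k = X[y == k]
        mean_k = X_k.mean(axis=0)
        centered = X_k - mean_k
        S_w += centered.T @ centered
        diff = (mean_k - overall_mean)[:, None]
        S_b += len(X_k) * (diff @ diff.T)
    # Eigenvectors of pinv(S_w) @ S_b; pinv tolerates a singular S_w.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return eigvecs[:, order].real


# Synthetic data: three classes that differ only along the last feature.
rng = np.random.default_rng(0)
offsets = [0.0, 5.0, 10.0]
X_demo = np.vstack([
    rng.normal(size=(100, 6)) + np.array([0, 0, 0, 0, 0, m]) for m in offsets
])
y_demo = np.repeat([0, 1, 2], 100)
W = lda_fit(X_demo, y_demo, n_components=2)
Z = X_demo @ W  # projected samples, shape (300, 2)
print(W.shape, Z.shape)  # (6, 2) (300, 2)
```

With C classes the between-class scatter has rank at most C - 1, so for the five categories in this study at most four discriminant directions carry information.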

The dataset constructed by fusing and reducing the dimensionality of feature parameters via the LDA algorithm is input into the SVM fault diagnosis model for validation experiments. The results show that the SVM recognition method based on fused feature parameters achieves an accuracy of 99.2% on the test set, with the corresponding confusion matrix shown in Fig. 11. Compared with the recognition results using only vibration feature parameters, the recognition accuracy of this method is significantly improved. This result confirms that dimensionality reduction after fusing different signal features can effectively enhance the discriminability of various fault modes. Compared with the dimensionality reduction effect of single signal features, the aforementioned fusion and dimensionality reduction strategy exhibits superior performance, which provides strong support for the improvement of diagnostic accuracy and fully verifies the effectiveness and feasibility of the proposed feature fusion method.

Fig. 11.

Confusion matrix of the SVM model.

Ensemble learning

The SVM model trained on the fused, LDA-reduced features reaches a test-set accuracy of 99.2%, which is 6.4 percentage points higher than the 92.8% achieved by the SVM using rotational speed features alone. However, a single model is prone to overfitting, that is, to learning the noise in the training data. For instance, the SVM model in this study achieves 98% accuracy on the training set, but its misclassification rate rises sharply when it is faced with new samples affected by wind speed fluctuations. To mitigate this risk, this study designs an ensemble learning framework that aggregates the predictions of multiple base models through the Voting Method to output a final decision. The Voting Method leverages the diversity among base models: different models learn from the data from distinct perspectives, and collective decision-making reduces the risk of overfitting or underfitting in individual learners. This study therefore combines the K-Nearest Neighbors (KNN), Random Forest, and Naive Bayes algorithms with the SVM in an ensemble.

This study adopts "Soft Voting" within the Voting Method. The final result is determined by the weighted average of the class probabilities predicted by each model, which can further improve the overall accuracy while reducing the influence of overfitting or underfitting in individual models. Each model is trained and tested separately, and the corresponding confusion matrices for the test set are shown in Fig. 12. The KNN classifier achieves an accuracy of 88.8%, the Random Forest classifier 98.4%, and the Naive Bayes classifier 95.2%.

Fig. 12.

Top-left: confusion matrix of the KNN algorithm; top-right: confusion matrix of the random forest algorithm; bottom-left: confusion matrix of the naive Bayes algorithm; bottom-right: confusion matrix of the SVM algorithm.

To comprehensively utilize the decision-making capabilities of different models, an accuracy-weighted strategy is adopted. The weight $w_i$ assigned to the $i$-th base classifier is proportional to its classification accuracy $Acc_i$ on the validation set, formulated as:

$$w_i = \frac{Acc_i}{\sum_{j=1}^{N} Acc_j}$$

where $N$ is the number of base classifiers.

Based on the experimental results, the accuracies are: KNN (88.8%), Random Forest (98.4%), Naive Bayes (95.2%), and SVM (99.2%). Following the formula, the normalized weights are assigned as follows: KNN: 0.233, Random Forest: 0.258, Naive Bayes: 0.249, and SVM: 0.260. The final ensemble prediction is obtained by the weighted summation of the probability vectors from these four classifiers.
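The weight normalization and the weighted summation of probability vectors can be checked with a short sketch. The accuracies are those reported in the text; the function names and the two-class probability vectors in the voting example are hypothetical and serve only to show the mechanics.

```python
import numpy as np

# Accuracies reported in the text for the four base classifiers.
accuracies = {"KNN": 0.888, "Random Forest": 0.984,
              "Naive Bayes": 0.952, "SVM": 0.992}


def accuracy_weights(acc):
    """Normalize accuracies so that the weights sum to one."""
    total = sum(acc.values())
    return {name: value / total for name, value in acc.items()}


def weighted_soft_vote(probabilities, weights):
    """Combine per-model probability arrays of shape (n_samples, n_classes)."""
    combined = sum(weights[name] * np.asarray(p)
                   for name, p in probabilities.items())
    return combined.argmax(axis=1), combined


weights = accuracy_weights(accuracies)
print({name: round(w, 3) for name, w in weights.items()})
# {'KNN': 0.233, 'Random Forest': 0.258, 'Naive Bayes': 0.249, 'SVM': 0.26}

# Hypothetical probabilities for one sample and two classes:
# KNN favors class 0, the other three models favor class 1.
probabilities = {
    "KNN": [[0.9, 0.1]],
    "Random Forest": [[0.2, 0.8]],
    "Naive Bayes": [[0.2, 0.8]],
    "SVM": [[0.2, 0.8]],
}
predicted, combined = weighted_soft_vote(probabilities, weights)
print(predicted, combined)
```

Because the four accuracies are close to one another, the normalized weights are nearly uniform, so the ensemble behaves much like an unweighted soft vote.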

Ensemble learning is then implemented with these weights. The results and the corresponding confusion matrix are shown in Fig. 13. The accuracy of the ensemble learning model reaches 98.8% after integration.

Fig. 13.

Confusion matrix of ensemble learning.

Comparative analysis with deep learning methods

To further verify the superiority of the proposed method in extracting features under variable speed conditions, we compared it with two mainstream deep learning models: a 1D Convolutional Neural Network (1D-CNN) [12] and a Bi-directional Long Short-Term Memory (Bi-LSTM) network [16]. The 1D-CNN model consists of two convolutional layers and two pooling layers, while the Bi-LSTM model utilizes bidirectional dependencies to capture temporal features. All models were trained and tested using the same dataset partition. The comparative results are presented in Table 3.

Table 3.

Comparison with deep learning methods.

Method             Accuracy    Training time
1D-CNN             97.20%      10.76 s
Bi-LSTM            50.00%      29.47 s
Proposed method    98.80%      45.20 s

As shown in Table 3, the proposed ensemble framework achieves the highest accuracy of 98.80%, outperforming the 1D-CNN (97.20%) and significantly surpassing the Bi-LSTM (50.00%). While the 1D-CNN demonstrates high efficiency, it operates as a “black box” with limited interpretability. The Bi-LSTM performs poorly due to the vanishing gradient problem inherent in processing long signal sequences (1024 points) with limited sample sizes. In contrast, our proposed method not only achieves superior accuracy but also provides clear physical interpretability through the extracted gear-specific features, making it more reliable for industrial applications.

Ablation study

To quantify the contribution of each core module to the final diagnostic performance, an ablation study was conducted. We evaluated the model’s accuracy by progressively adding the key components: Closed-loop Preprocessing, Multi-source Fusion, LDA Dimensionality Reduction, and Ensemble Learning. The results are summarized in Table 4.

Table 4.

Ablation study results.

Experiment setting                   Accuracy    Improvement
Baseline (raw data + SVM)            76.7%       -
+ Closed-loop preprocessing          80.4%       +3.7%
+ Multi-source fusion                95.7%       +15.3%
+ LDA dimensionality reduction       96.1%       +0.4%
+ Ensemble learning (proposed)       98.8%       +2.7%

The results quantify the contribution of each step. The closed-loop preprocessing raises the baseline accuracy by 3.7 percentage points by suppressing noise. Multi-source fusion provides the largest single-step improvement (+15.3 percentage points), confirming that combining vibration and speed signals provides complementary fault information that a single source cannot offer. LDA dimensionality reduction adds a further 0.4 percentage points, and Ensemble Learning refines the decision boundary, raising the accuracy to 98.8%.

Summary

  1. The vibration signals and rotational speed signals of wind turbines are subjected to dimensionality reduction via three methods. The results show that after dimensionality reduction of the two types of parameters (vibration and rotational speed features) using LDA, fault samples of the same type are closely clustered with small distances between them, while different fault types exhibit distinct separability. This enables effective discrimination of various faults and further verifies the feasibility of the LDA-based feature fusion and dimensionality reduction method.

  2. Initially, the SVM fault diagnosis model established based on a single vibration signal achieves an accuracy of 80.4%, that based on dual-direction vibration signals reaches 89.6%, and that based on rotational speed signals attains 92.8%. After fusing different feature signals and reducing their dimensionality via the LDA algorithm, the accuracy of the SVM-based fault diagnosis model reaches 99.2%, which is significantly improved and enables effective identification of fault types.

  3. Finally, an ensemble fault diagnosis model is constructed by integrating four distinct algorithms: KNN, SVM, Random Forest, and Naive Bayes. The fault diagnosis accuracy of the ensemble model reaches 98.8%. This is 18.4 percentage points higher than the SVM fault diagnosis model based on a single vibration signal and 6.0 percentage points higher than the SVM fault diagnosis model based on rotational speed signals. This indicates that the ensemble model mitigates the overfitting of individual models and can effectively identify fault types.

  4. Limitations and Future Work: While the proposed method demonstrates strong generalization ability across widely varying rotational speeds, its universality across different physical test benches has not yet been verified. Future work will focus on validating the ‘Synergistic Closed-Loop’ framework on public datasets to assess its cross-domain transferability.

Author contributions

Xutao Kang (Author 1): Conceptualization, Methodology, Software, Investigation, Formal Analysis, Data Curation, Visualization, Writing - Original Draft. Luchuan Shao (Author 2): Resources, Supervision, Software, Validation, Visualization, Writing - Review & Editing. Bing Zhao (Author 3, Corresponding Author): Conceptualization, Funding Acquisition, Resources, Supervision, Writing - Review & Editing.

Data availability

The data supporting the findings of this study are openly available to ensure reproducibility. The raw vibration and rotational speed signals of wind turbine gearboxes (serving as both raw and processed data) are from the research team of Liu Dongdong at Beijing University of Technology, and the download URL for these data is: https://github.com/Liudd-BJUT/WT-planetary-gearbox-dataset/tree/master. For requests to access the data or materials related to this study, please contact the corresponding author: Dongdong Liu (Email: liudd@bjut.edu.cn).

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Wenxiu, Z. & Xinfang, W. Research on condition monitoring and fault diagnosis technology of wind turbines. Electr. Mach. Control Appl. 41(02), 50–56 (2014).
  • 2. Jiang, R. et al. The weak fault diagnosis and condition monitoring of rolling element bearing using minimum entropy deconvolution and envelop spectrum. Proc. Inst. Mech. Eng. Part C J. Mech. Eng. Sci. 227(5), 1116–1129 (2013).
  • 3. Jitong, S., Long, X., Zhanfei, H., Zhongliang, L. & Shicun, D. Fault detection method of wind turbines based on wavelet transform. Water Power 49(06), 99–104 (2023).
  • 4. Yuanjie, Y., Guangyong, Y., Ting, Y., Tiangi, X. & Yihang, G. Fault extraction of rolling bearing based on CEEMD and MOMEDA. Electron. Meas. Technol. 44(22), 96–101. 10.19651/j.cnki.emt.2107226 (2021).
  • 5. Qiumei, W., Qiang, L. & Yan, Y. Research on fault diagnosis of three-stage planetary gearbox based on variational mode decomposition. Mach. Des. Res. 38(02), 101–104. 10.13952/j.cnki.jofmdr.2022.0027 (2022).
  • 6. Xiao, Y. et al. Low-pass filtering empirical wavelet transform machine learning based fault diagnosis for combined fault of wind turbines. Entropy 23(8), 975 (2021).
  • 7. Guan, S., Li, J. & Shi, M. Research on wind turbine gearbox fault diagnosis method based on multi-kernel wavelet denoising network and contrastive learning. Measurement 118982 (2025).
  • 8. Li, S. et al. A Systematic Review on Diagnosis Methods for Rolling Bearing Compound Fault: Research Status, Challenges, and Future Prospects (Measurement Science and Technology, 2024).
  • 9. Li, S. et al. Compound fault diagnosis method for rolling bearings based on enhanced MED and adaptive periodized symplectic geometry mode decomposition. Struct. Health Monit. 14759217251314703 (2025).
  • 10. Ronghua, Z., Lichunlong, M., Kai, W., Hanqiu, L. & Xujun, Z. Cross-domain fault identification method of gearbox based on DFCAN under unsupervised conditions. J. Mech. Electr. Eng. https://link.cnki.net/urlid/33.1088.TH.20250825.1051.009 (2025).
  • 11. Xu, C. et al. Fault diagnosis method of wind turbine planetary gearbox based on improved CNN. Sci. Rep. 15(1), 32481 (2025).
  • 12. Wang, L. et al. A CNN-SA-GRU model with focal loss for fault diagnosis of wind turbine gearboxes. Energies 18(14), 3696 (2025).
  • 13. Lu, J. et al. Fault diagnosis method of wind turbine bearing based on time graph fusion. Trans. Inst. Meas. Control 01423312251364606 (2025).
  • 14. Zhang Binqiao, L. J. & Gang, W. Fault diagnosis of wind turbine gearbox based on MTF-swin transformer. Renew. Energy Resour. 42(05), 627–633. 10.13941/j.cnki.21-1469/tk.2024.05.001 (2024).
  • 15. Baoyue, L. et al. Diagnosis method for typical faults of diesel engines based on multi-source information fusion. Chin. Internal Combus. Eng. Eng. 46(01), 73–79 + 90. 10.13949/j.cnki.nrjgc.2025.01.009 (2025).
  • 16. Yuan, W. Fault diagnosis of wind turbine bearings based on improved dung beetle optimizer optimized LSTM. Eng. Res. Exp.
  • 17. LiJunging, Z. et al. Electric power science and engineering. Electr. Power Sci. Eng. 39(02), 64–71 (2023).
  • 18. Zhu, Y. et al. A partial domain adaptation scheme based on weighted adversarial nets with improved CBAM for fault diagnosis of wind turbine gearbox. Eng. Appl. Artif. Intell. 125, 106674 (2023).
  • 19. Dongdong, L., Cui, L. & Cheng, W. A review on deep learning in planetary gearbox health state recognition: Methods, applications, and dataset publication. Meas. Sci. Technol. 10.1088/1361-6501/acf390 (2025).
  • 20. Zhihong, Y., Shizhong, H., Wei, F., Qiuqiu, L. & Weichu, H. Intelligent identification of wear particles based on mask R-CNN network and application. Tribology 41(01), 105–114. 10.16078/j.tribology.2020020 (2021).
  • 21. Deng, X. et al. A multi-level fusion framework for bearing fault diagnosis using multi-source information. Processes 13(8), 2657 (2025).
  • 22. Yang, Q. et al. Self-attention parallel fusion network for wind turbine gearboxes fault diagnosis. IEEE Sens. J. 23(19), 23210–23220 (2023).
  • 23. Long, X. et al. A CBA-KELM-based recognition method for fault diagnosis of wind turbines with time-domain analysis and multisensor data fusion. Shock Vib. 2019(1), 7490750 (2019).
  • 24. Ding, S. et al. Fault diagnosis of wind turbine rotating bearing based on multi-mode signal enhancement and fusion. Entropy 27(9), 951 (2025).
  • 25. Zhang, X., Jia, F. & Chen, Y. Fault diagnosis of wind turbine gearbox based on Mel spectrogram and improved ResNeXt50 model. Appl. Sci. 15(15), 8563 (2025).
  • 26. Zhihua, Z. Machine learning. Airport J. 2018(02), 94 (2018).
  • 27. Vapnik, V. N. & Lerner, A. Y. Recognition of patterns with help of generalized portraits. Avtomat Telemekh. 24(6), 774–780 (1963).

Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
