Scientific Reports. 2024 Jun 26;14:14692. doi: 10.1038/s41598-024-65372-y

Mitigating sub-synchronous oscillation using intelligent damping control of DFIG based on improved TD3 algorithm with knowledge fusion

Ge Liu 1,2, Jun Liu 1, Andong Liu 1
PMCID: PMC11208430  PMID: 38926443

Abstract

The occurrence of sub-synchronous oscillation (SSO) phenomenon in doubly-fed induction generators (DFIGs)-based wind turbines threatens the secure and stable operation of the power grid. Conventional sub-synchronous damping controllers encounter challenges in adapting to the dynamic operating conditions of power systems. This paper introduces an Intelligent Sub-Synchronous Damping Controller (I-SSDC) for DFIGs that integrates deep reinforcement learning (DRL) and knowledge to address the limitations of conventional methods for SSO mitigation. The initial step involves formulating a framework for I-SSDC using the improved twin delayed deep deterministic policy gradient (TD3) algorithm incorporating Softmax. Following this, a surrogate model is constructed, employing Weighted Linear Regression and regularization. This model is designed to identify the predominant influencing factors of SSO, focusing on the selection of the output signal (installation position) to optimize decision-making in I-SSDC. The objective is to enhance the controller’s environmental adaptability and interpretability. Moreover, knowledge and experience related to SSOs are integrated into agent training to improve the exploration efficiency of the agent. Case studies under various operating conditions of the test power system validate the efficacy of the proposed I-SSDC in suppressing SSOs.

Keywords: Sub-synchronous oscillation, Intelligent damping controller, S-TD3 algorithm, Adaptive output signal selection, Knowledge fusion

Subject terms: Energy science and technology, Engineering, Computer science

Introduction

Motivation and incitement

Given the increasing global demand for green energy, wind power has emerged as a crucial player in power generation due to its abundant and renewable nature. Grid-connected wind power systems, particularly those utilizing doubly fed wind turbines with series capacitor compensation lines for transmission, are experiencing substantial growth within current power systems. However, this expansion brings about significant challenges, notably in the form of SSOs, which impact the system’s frequency and voltage stability1,2. For instance, an SSO event occurred at a wind farm in Texas in October 2009, primarily due to the interaction between power electronics related to Doubly Fed Induction Generators (DFIGs) and the adjacent series-compensated transmission line3. Similar SSO incidents have also posed challenges to power systems operations in Buffalo Ridge, Canada4, and Guyuan, China5. Operating conditions of the system can change at any time, and the frequency characteristics of triggered SSO may vary for different events in the same system. The characteristics of the triggered oscillation are influenced by factors such as wind speed, the number of online wind turbine generators (WTGs), and the degree of series compensation. Therefore, SSO can lead to undesirable consequences, including equipment damage, generation loss, and other power quality issues6,7. It is crucial to adopt a damping control strategy that can effectively mitigate SSO under the complexity of wind power systems and the dynamic operating environment8.

Literature review

Currently, studies have explored strategies to suppress SSOs in DFIG-based wind farm grid-connected systems. The Sub-Synchronous Damping Controller (SSDC) method proves more suitable for engineering applications due to its low control cost, clear mechanism, and fast response. In Reference9, a comparison of additional damping control on the rotor-side converter (RSC) and grid-side converter (GSC) of DFIGs revealed superior mitigation performance on the RSC. However, due to the nonlinear relationship between controller parameters and system performance, it is challenging to determine the optimal parameters for this method. Reference10 introduced an improved sub-synchronous resonance damping controller and its corresponding control strategy, determining optimal gain coefficients using a particle swarm optimization algorithm. This method optimizes control parameters for different SSO frequencies but is only applicable for SSO suppression within a fixed frequency band. Traditional SSDC, based on the phase compensation principle, faces challenges in meeting oscillation suppression requirements under various operating conditions in grid-connected systems.

To overcome this challenge, various advanced control methods, including feedback linearization, sliding mode control, and robust control, have been proposed11. Reference12 proposed a robust control method to suppress SSOs that is adaptable to changes in the output of multiple wind farms. Although robust control is applied to address changes in system states, quantitatively specifying the upper bound of uncertainties and disturbances a priori remains a formidable task. In Reference13, an innovative method for SSO analysis in wind power networks using linear optimal control was introduced. The approach involves developing a comprehensive mathematical model for wind turbines and applying linear optimal control theory to mitigate SSOs within wind power networks. Although this improves the design of the SSDC and simplifies parameter tuning, limitations persist because the method relies on precise dynamic modeling of wind power systems, which prevents online adaptive adjustment. Reference14 introduced an SSDC strategy employing an adaptive bandstop filter. However, the integration of spectral estimation and a recursive least squares algorithm in this approach results in numerous design parameters, which increases the complexity and duration of the design. Notably, all the aforementioned SSDC parameter designs adopt a model-based approach. Such methods are susceptible to unmodeled dynamics and other uncertain factors15, which may reduce the efficacy of damping control. Moreover, the oscillation frequency of the SSO depends on the power grid environment, so the efficacy of these methods under frequency variations requires further validation. In contrast, Reference16 identified the nonlinear dynamic characteristics and uncertainty issues inherent in DFIGs. To address these challenges, they proposed a novel robust disturbance observer fractional-order sliding mode controller, aimed at maximizing power extraction and enhancing fault ride-through capabilities. Additionally, Reference17 emphasized the significance of fault ride-through as a crucial performance metric for DFIGs. Their proposed solution involves enhancing this capability through a voltage compensator and introducing a dynamic voltage restorer to mitigate voltage fluctuations. Furthermore, Reference18 devised an effective Fractional-Order Sliding Mode Controller to precisely regulate the active and reactive power of DFIGs while mitigating system uncertainties and reducing chatter amplitude. Lastly, Reference19 presented a novel inertial control strategy enabling DFIGs to absorb or release kinetic energy via active power control, thereby facilitating participation in system frequency control.

In recent years, attention has shifted to Deep Reinforcement Learning (DRL) as a data-driven control method with robust nonlinear approximation capabilities and adaptability to complex environments20. DRL learns the mapping relationship from states to actions, enabling the acquisition of optimal strategies through continuous interaction with the environment and facilitating adaptive handling of variations in complex systems21. DRL-based damping controller design methods eliminate the need to study the system’s internal structure and mechanism or the controlled system’s mathematical model. They achieve rapid and effective online decision-making through offline learning and online applications. Reference22 employed the GrHDP algorithm to design a damping controller based on VSC-HVDC, effectively suppressing multi-mode low-frequency oscillations by adjusting the active power output of the inverter. The time delay caused by the transmission of remote signals will result in performance degradation of the controller. Reference23 utilized reinforcement learning technology to overcome communication delays and other nonlinear problems in wide-area damping control. Damping control methods based on reinforcement learning are often employed for transient oscillation suppression and low-frequency oscillation suppression within a small oscillation frequency range. However, they face challenges such as inadequate adaptation to operating conditions and the lack of real-time verification of control.

Contribution and paper organization

This paper introduces an innovative supplementary damping control method specifically designed for SSOs. The main contributions of this work are as follows:

  • An Intelligent Sub-Synchronous Damping Controller (I-SSDC) based on the improved Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm is developed for DFIGs to mitigate SSOs. The inclusion of the softmax operation addresses the underestimation bias in the TD3 algorithm, enhancing its efficacy in the damping control process. A training method employing multiple samples is adopted, tailored for the suppression of time-varying and operationally diverse SSOs.

  • A surrogate model is constructed using weighted linear regression and regularization, enabling the selection of the installation position for I-SSDC by explaining the regression model of participation factors. Compared with purely data-driven models, I-SSDC has better interpretability.

  • Improvement strategies based on knowledge fusion are proposed to address the low training efficiency of current purely data-driven methods for intelligent agents. This strategy significantly accelerates the training convergence speed, which is beneficial for practical engineering applications.

  • The performance of the I-SSDC is compared with traditional SSDC (T-SSDC) considering multiple operating conditions, including wind speed, active/reactive power output of wind farms, number of WTs, and line series compensation degree.

This paper is organized as follows: “Methods Principle of Proposed I-SSDC” presents the system’s model with I-SSDC and the mitigation principle. Then, a data-driven SSO mitigation method using an intelligent sub-synchronous damping controller is proposed in the section “Design of I-SSDC based on improved TD3 algorithm with knowledge fusion”. The simulation results under multiple operating conditions are presented in section “Case study and experimental design”. Finally, major conclusions and potential directions for further investigation are given in Section “Conclusions”.

Methods principle of proposed I-SSDC

The equivalent model for SSO damping in DFIG-based wind farms with I-SSDC is shown in Fig. 1. The DFIG model comprises sections for wind turbines (WTs), DFIGs, inverter DCs, and rotor-side and grid-side converter controllers. The transmission system section, featuring series compensation, consists of a 220 kV line and a 500 kV line, with series capacitor compensation linked to the 500 kV line. Tables 1 to 3 in Appendix A meticulously delineate the parameters for each WTG, as well as the transmission line and transformer parameters.

Figure 1. Equivalent model of I-SSDC suppression principle.

Modelling of the DFIG-based WT and its conversion system

The WT is the primary link of energy conversion in the wind power generation system. Mechanical output power and torque generated by the WT can be expressed as follows24.

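$$P_{mec} = \frac{1}{2}\rho\pi R^{2} V_w^{3}\, c_p(\lambda,\beta) \tag{1}$$
$$T_{mec} = \frac{1}{2}\rho\pi R^{2} V_w^{3}\, c_p(\lambda,\beta)/\lambda \tag{2}$$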

Here, ρ represents air density, Vw denotes wind speed, R signifies the radius of the WT rotor, β indicates the pitch angle in the variable pitch system, λ refers to the tip-speed ratio of the rotor, and cp represents the WT’s power coefficient. λ and cp can be expressed by the following equations:

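$$\lambda = \frac{R\,\omega_r}{V_w} \tag{3}$$
$$c_p(\lambda,\beta) = 0.5176\left(\frac{116}{\lambda_i} - 0.4\beta - 5\right)e^{-21/\lambda_i} + 0.0068\lambda \tag{4}$$
$$\frac{1}{\lambda_i} = \frac{1}{\lambda + 0.08\beta} - \frac{0.035}{\beta^{3} + 1} \tag{5}$$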
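As an illustration of Eqs. (1) and (3)-(5), the following minimal Python sketch evaluates the power coefficient and the mechanical power. The function names and the numerical values in the usage note (air density, rotor radius, wind speed) are hypothetical choices for demonstration and are not taken from the test system.

```python
import math


def tip_speed_ratio(radius, omega_r, v_w):
    """Eq. (3): lambda = R * omega_r / V_w."""
    return radius * omega_r / v_w


def power_coefficient(lam, beta):
    """Eqs. (4)-(5): power coefficient c_p(lambda, beta)."""
    inv_lam_i = 1.0 / (lam + 0.08 * beta) - 0.035 / (beta ** 3 + 1.0)
    return (0.5176 * (116.0 * inv_lam_i - 0.4 * beta - 5.0)
            * math.exp(-21.0 * inv_lam_i) + 0.0068 * lam)


def mechanical_power(rho, radius, v_w, lam, beta):
    """Eq. (1): P_mec = 0.5 * rho * pi * R^2 * V_w^3 * c_p(lambda, beta)."""
    return 0.5 * rho * math.pi * radius ** 2 * v_w ** 3 * power_coefficient(lam, beta)
```

For example, `power_coefficient(8.1, 0.0)` evaluates the coefficient at zero pitch angle, and `mechanical_power(1.225, 40.0, 10.0, 8.1, 0.0)` gives the corresponding power in watts for the assumed rotor.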

The DFIG is a complex system with high order, multivariable, nonlinearity, and strong coupling. In DFIG, the stator is directly connected to the grid, and its rotor is connected to the grid via a back-to-back converter for AC excitation. The stator and rotor voltage equations in the d–q reference frame can be illustrated as follows25:

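$$u_{ds} = R_s i_{ds} + \frac{d\psi_{ds}}{dt} - \omega_s\psi_{qs} \tag{6}$$
$$u_{qs} = R_s i_{qs} + \frac{d\psi_{qs}}{dt} + \omega_s\psi_{ds} \tag{7}$$
$$u_{dr} = R_r i_{dr} + \frac{d\psi_{dr}}{dt} - (\omega_s - \omega_r)\psi_{qr} \tag{8}$$
$$u_{qr} = R_r i_{qr} + \frac{d\psi_{qr}}{dt} + (\omega_s - \omega_r)\psi_{dr} \tag{9}$$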

Here, uds, uqs, udr, and uqr represent the d-axis and q-axis components of the stator and rotor voltages, respectively; ωs is the synchronous magnetic field rotation angular velocity; ωr denotes the rotor angular velocity; ψds, ψqs, ψdr, and ψqr represent the d-axis and q-axis components of the stator and rotor magnetic fluxes.

The electromagnetic torque equation is as Eq. (10), where np is the number of pole pairs:

$$T_e = \frac{3}{2} n_p\left(\psi_{qr} i_{dr} - \psi_{dr} i_{qr}\right) \tag{10}$$

Rotor-side converter control consists of current inner loop control and power outer loop control. The reference value of the inner current loop depends on the maximum power point tracking (MPPT) curve (Te_ref) and reactive power control of the outer power loop (Qe_ref), respectively. The difference between the reference value and the rotor current feedback (idr,iqr) is sent to the PI controller, and the udr and uqr of the rotor voltage control are obtained. Through the conversion of d–q reference to a-b-c reference and PWM signal modulation, the power decoupling control of the rotor-side is realized.

The grid-side converter also employs double closed-loop decoupling control. The reference value of the current inner loop is obtained by the outer-loop PI controller from the deviations of the DC voltage (Vdc) and the reactive power (Qg). The difference between the current reference value of the converter and the feedback quantity (idg, iqg) is input into the inner loop PI controller to obtain the voltage control signal (udg, uqg) of the converter on the grid-side.

Mitigation principle of I-SSDC for DFIGs

The mitigation principle of the proposed I-SSDC for DFIGs resembles that of traditional SSDC, as illustrated in Fig. 1. Electrical signals with significant sub-synchronous components are selected from the measured electrical quantities of the grid-connected nodes as the output signal y(t) of the controlled system. This signal is then fed into the I-SSDC, producing a control output signal u(t). SSOs in the DFIG-based wind farm grid-connected system primarily stem from the interaction between the controller of the RSC and the series capacitor compensation circuit26. Consequently, the output signal u(t) is injected into the control loops of the RSC as a supplementary control signal, thereby generating damping torque/power and providing positive damping for the system. In contrast to traditional SSDC, this paper introduces I-SSDC to enhance adaptability to continuously changing environments and operational conditions. DRL, known for its learning and adjustment capabilities, is advantageous for complex and dynamically changing systems like wind farms, as it does not rely on precise system models. TD3 is a DRL algorithm designed for deterministic policies, making it well-suited for decision-making tasks involving continuous action spaces. Given that the environmental state variables of DFIG-based wind farms are continuous, I-SSDC adopts an improved TD3 algorithm based on measurement data to intelligently adjust damping control parameters through reinforcement learning, effectively mitigating SSOs.

Principle of I-SSDC Based on DRL

Reinforcement learning involves a continuous interaction process between an agent and its environment to determine an optimal policy that maximizes the expected return. The key components include the environment, the agent, a set of states (s) representing the environment, a set of actions (a) representing the agent’s actions, and rewards (r) given to the agent. This interactive process is depicted in Fig. 2. In the context of I-SSDC based on DRL, the DFIG-based wind farm grid-connected system serves as the environment, with measured electrical quantities acting as the state for the agent. The agent determines the optimal action policy based on the state and received reward, i.e., the additional damping control signal.

  1. The state, which is the perceptual information provided by the environment to the agent in the SSO suppression problem considered in this paper, is the input control signal to I-SSDC. Common input signals for additional damping controllers include rotor speed, terminal voltage of DFIGs, rotor current, etc. The input control signal for SSDC should possess characteristics that facilitate easy acquisition and fast transmission to minimize signal acquisition delays. The state set is defined as the oscillation amplitude of electrical quantities of grid-connected DFIG-based WTs:
    $$s_t = [\Delta u_{gd}(t), \Delta u_{gq}(t), \Delta i_{gd}(t), \Delta i_{gq}(t)] \tag{11}$$
  2. The action space comprises relevant decision variables in the optimization model. To suppress SSOs, the controller of RSC can be enhanced by adding additional damping control, thus injecting an extra damping control signal into the control system. The selected controller output signals include the inner and outer loop control output signals of the DFIG’s RSC. Dual-loop control is advantageous for suppressing SSOs. The action set can be defined as the injected additional damping control signals into the control system:
    $$a_t = [T_{eAdd}(t), Q_{eAdd}(t)] \vee [i_{drAdd}(t), i_{qrAdd}(t)] \vee [u_{drAdd}(t), u_{qrAdd}(t)] \tag{12}$$
  3. The reward function serves as a crucial driving signal for the intelligent agent to explore the optimal action strategy. The oscillation amplitudes of active and reactive power at the grid-connected node are critical for oscillation suppression. Therefore, the reward function for the agent is designed as Eq. (13):
    $$r_t = -\left(\lambda_1 \Delta P(t) + \lambda_2 \Delta Q_e(t)\right) \tag{13}$$

Figure 2. Interaction process between agent and environment.

Here, λ1 and λ2 are the weighting coefficients of the reward function in Eq. (13); they are tuned through repeated experimentation during training.
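The state and reward definitions of Eqs. (11) and (13) can be sketched in a few lines of Python. This is a minimal illustration only: the function names are hypothetical, the default weights are the values 0.7 and 0.3 reported in the experimental configuration, and taking the absolute value of the deviations is an assumption made here so that the reward is never positive.

```python
def build_state(du_gd, du_gq, di_gd, di_gq):
    """Eq. (11): state vector of grid-side voltage and current deviations."""
    return [du_gd, du_gq, di_gd, di_gq]


def reward(delta_p, delta_q, lam1=0.7, lam2=0.3):
    """Eq. (13): negative weighted sum of active/reactive power oscillation amplitudes.

    abs() is an assumption: the amplitudes are treated as non-negative magnitudes.
    """
    return -(lam1 * abs(delta_p) + lam2 * abs(delta_q))
```

With this shape, the reward reaches its maximum of zero only when both power oscillation amplitudes vanish, which is the behaviour the agent is driven toward.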

Design of I-SSDC based on the improved TD3 algorithm with knowledge fusion

Framework of I-SSDC

Figure 3 illustrates the overall schematic diagram of I-SSDC. The left side depicts the power system environment requiring additional damping, while the DRL agent on the right uses the S-TD3 algorithm, with the controller design formulated as a Markov decision process (MDP). The S-TD3 algorithm, an enhancement of TD3, incorporates the Softmax operation to control the gap between the value function and the optimal value, resulting in improved decision-making. A regression model is established between the electrical quantities of the rotor-side control loops and the participation factors so that the controller's suppression effect can adapt to diverse scenarios. On this basis, a surrogate model extracts key characteristics from the electrical quantities to determine the optimal installation position for I-SSDC. The framework supports the integration of knowledge and experience related to the SSO parameters, which restricts the agent’s exploration space. This is achieved by dividing the experience replay into a successful experience replay and a failure experience replay, followed by mixed sampling for agent training. In each episode, the agent attempts actions based on the input states and performs one time-domain simulation. The episode ends if the system’s oscillation amplitude is less than ϵ; otherwise, the search continues. Using the samples generated by the interaction between the agent and the environment, the agent is trained to obtain the optimal action strategy for suppressing SSO through as few action attempts as possible.

Figure 3. Overall schematic diagram of I-SSDC.

Adaptive output signal selection of I-SSDC based on a surrogate model

When deploying an additional damping controller to mitigate SSO, the controller’s output signal may impact its control efficacy. Moreover, the configuration of damping values often requires adjustments based on the operational state of the system27,28. Potential output signals available for the SSDC for control over RSC encompass the power control loops, d-axis and q-axis current control loops, and d-axis and q-axis voltage control loops. Employing traditional model-based observability and controllability indicators for generating output signals necessitates extensive online calculations. In this paper, the extraction of critical features from measured data of the rotor-side is employed to discern the optimal output for the controller, thereby enhancing the model’s performance and interpretability.

Exploring the relationship between electrical quantities and SSO is imperative to identify effective data features. The linearized state-space equations for the DFIG-based grid-connected system are presented below:

$$\frac{d}{dt}\Delta X = A\Delta X + B\Delta V \tag{14}$$
$$\Delta Y = C\Delta X + D\Delta V \tag{15}$$

Here, ΔX denotes the system’s state variables, ΔV represents the grid-side input voltage at the WT connection point, and ΔY=[ΔP,ΔQ] signifies the active and reactive power injected by the WT into the power system. By employing the conventional modal analysis method29, the small disturbance stability of the system is characterized by the eigenvalues λ of the system state matrix A. Each pair of complex conjugate eigenvalues corresponds to an oscillatory mode. The participation factors (Pki) are utilized to depict the influence of various system state variables on each oscillatory mode, as illustrated in Eq. (16).

$$P_{ki} = \frac{\nu_{ki} w_{ki}}{\sum_{i=1}^{n} \nu_{ki} w_{ki}} \tag{16}$$

In Eq. (16), νki and wki represent the elements in the k-th row and i-th column of the left and right eigenvector matrices corresponding to the eigenvalue λi, respectively. Meanwhile, Pki signifies the correlation between the i-th mode (associated with the eigenvalue λi) and the k-th state variable. A larger Pki indicates a more significant influence of the state variable on this mode30.
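The modal analysis step can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the function name is hypothetical, magnitudes of the eigenvector elements are used, and each mode's factors are normalised to sum to one over the state variables, which is a common convention that may differ in detail from the normalisation written in Eq. (16).

```python
import numpy as np


def participation_factors(A):
    """Return eigenvalues and a matrix P with P[k, i] = participation of state k in mode i."""
    eigvals, right = np.linalg.eig(A)   # columns of `right` are right eigenvectors
    left = np.linalg.inv(right)         # rows of `left` are the matching left eigenvectors
    raw = np.abs(right) * np.abs(left.T)  # raw[k, i] = |right[k, i]| * |left[i, k]|
    # Normalise each mode (column) so its factors sum to one (assumed convention).
    return eigvals, raw / raw.sum(axis=0, keepdims=True)
```

For a state matrix A obtained from Eq. (14), the column of the returned matrix that belongs to the sub-synchronous mode ranks the state variables by their influence on that mode.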

With the characteristic matrix A of the closed-loop system given, the participation factors Pki can be determined, i.e., Pki = g(A), where g(·) denotes the mapping relationship function between the participation factor and the system characteristic matrix. The characteristic matrix A of the closed-loop system is dependent on the system’s operating point M. Sub-synchronous control interaction (SSCI) is primarily caused by the interaction between the RSC control and the series capacitor compensation circuit. When analyzing the SSCI problem, the influences of the grid-side converter, filters, DC capacitor, and phase-locked loop can be disregarded. Therefore, M is expressed as M = [Te, Qe, idr, iqr, udr, uqr], where Te, Qe, idr, iqr, udr, and uqr represent the electrical quantities of the rotor-side control system. Consequently, the relationship between the participation factors and the system’s electrical quantities can be expressed as Eq. (17):

$$P_{ki} = g(A) = g(h(M)) = l(M) \tag{17}$$

In Eq. (17), g(·) denotes the mapping relationship function between the participation factors and the system characteristic matrix; h(·) represents the mapping relationship function between the system characteristic matrix and the system operating point; l(·) signifies the mapping relationship function between the participation factors and the system operating point.

A regression model for the dominant participation factors under the primary SSO mode is established using a neural network method31 to estimate various operational scenarios. Building upon the regression model, this section employs a local surrogate model to extract critical features from multiple input characteristics under the currently studied sample, thereby determining the controller’s installation position selection.

The linear surrogate model g(z) approximates the original model f(x)32, and its form is as Eq. (18).

$$g(z) = w_0 + \sum_{i=1}^{n} w_i z_i \approx f(x) \tag{18}$$

Here, x represents the input of the sample, i.e., the electrical quantities of the rotor-side. z represents n important variables in x that significantly impact SSO. By utilizing machine learning to obtain the parameters wi of the surrogate model g(z) as interpretative results, the electrical quantity with the highest weight is chosen as the injection position for the output of I-SSDC.

Based on the input variables of the original model, the training data of the surrogate model are sampled around the decision-making data (x0, y0), and the sampled data are labeled with the estimates of the original model. A linear model g′(x) is then constructed using weighted linear regression with L1 regularization and trained on the sampled data. The sparsity of the parameters in g′(x) highlights the important state variables z that influence SSO. Since the regularization penalty in g′(x) is relatively strong, it leads to a larger parameter bias in the model solution. Therefore, using z as input, a surrogate model g(z) is constructed using weighted linear regression with L2 regularization and trained so that g(z) ≈ f(x). The objective function of the linear surrogate model is described as Equation (19):

$$\min_{w}\left[L(w) + \sum_{i=1}^{n} \rho(z_i)\left(y_i - g(z_i)\right)^{2}\right] \tag{19}$$

In Equation (19), L(w) denotes the regularization term, and ρ(zi) refers to the weight coefficient of the sampled data. L(w) is either the L1 or the L2 regularization term, expressed as follows:

$$L_1(w) = \lambda\sum_{i=1}^{n} |w_i| \tag{20}$$
$$L_2(w) = \lambda\sum_{i=1}^{n} w_i^{2} \tag{21}$$

The weighting coefficients of the samples can be determined using an exponential (Gaussian) kernel function, as shown in Equation (22):

$$\rho(z_i) = e^{-\|z_i - x_0\|_2^{2}/(2\sigma^{2})} \tag{22}$$

Here, the closer the sampled data is to x0 during training, the larger the weight. σ is a free parameter, and the smaller the value of 2σ, the smaller the fitting neighborhood range of the linear surrogate model for x.
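The second stage of the surrogate fit, Eqs. (19), (21), and (22), can be sketched with NumPy as below. This is a simplified illustration: the function names are hypothetical, the L1 screening stage is omitted, leaving the intercept unpenalised is an assumption, and the defaults σ = 3 and λ = 0.4 are the values reported in the experimental configuration.

```python
import numpy as np


def kernel_weights(Z, x0, sigma=3.0):
    """Eq. (22): samples closer to the decision point x0 receive larger weights."""
    d2 = np.sum((Z - x0) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def fit_weighted_ridge(Z, y, rho, lam=0.4):
    """Minimise Eq. (19) with the L2 term of Eq. (21); returns [w0, w1, ..., wn]."""
    n_samples, n_features = Z.shape
    X = np.hstack([np.ones((n_samples, 1)), Z])  # prepend a column for the intercept w0
    penalty = lam * np.eye(n_features + 1)
    penalty[0, 0] = 0.0                          # assumption: intercept is not penalised
    XtW = X.T * rho                              # equivalent to X.T @ diag(rho)
    return np.linalg.solve(XtW @ X + penalty, XtW @ y)
```

Given the fitted weights, the index of the largest entry of `np.abs(w[1:])` would point to the electrical quantity with the highest weight, which the text selects as the injection position.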

S-TD3

The TD3 algorithm, structured as an Actor-Critic system as depicted in Fig. 3, engages in continuous interaction with the power system environment. This interaction yields optimal values for the parameters of the six neural networks (the Actor network, the two critic networks, and their respective target networks), and thereby an optimal configuration for the damping controller. This process is commonly referred to as offline training. The TD3 algorithm represents an enhancement of DDPG, introducing features such as clipped double-Q learning, delayed policy updates, and target policy smoothing33. Throughout the training process, the parameters θ and w of the Actor network (πθ) and critic network (Qw) are updated through gradient descent to minimize their respective loss functions.

The objective of the Actor network is to maximize the value function, utilizing a gradient ascent approach to optimize the parameters θ:

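$$\nabla_\theta J(\pi_\theta) = \frac{1}{n}\sum_{i=1}^{n} \nabla_a Q_w(s_t, a_t; w)\big|_{a=\pi(s_t)}\, \nabla_\theta \pi_\theta(s_t) \tag{23}$$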

In Eq. (23), n represents the number of training samples extracted from the experience replay, and st and at denote the state and action at time t, respectively. Following the deterministic policy gradient, the parameters θ of πθ are updated as θ = θ + μθ∇θJ(θ), where μθ is the learning rate of the Actor network. Simultaneously, the parameters θ′ of the Actor target network are updated as θ′ = τθ + (1 − τ)θ′, with τ as the soft update coefficient.

The critic network optimizes parameter w by minimizing the loss function Loss(w), defined as Eq. (24):

$$Loss(w) = \frac{1}{n}\sum_{i=1}^{n}\left(y_t - Q_w(s_t, a_t)\right)^{2} \tag{24}$$

Here, yt signifies the target Q value at time t.

The TD3 algorithm simultaneously learns two critic target networks (Q'w1 and Q'w2) and selects the minimum value for policy updates. While this clipped double Q-learning mechanism prevents Q value overestimation, it may introduce an underestimation bias on Q values, which degrades performance. To address this drawback, this paper introduces the S-TD3 algorithm, which uses the Softmax function to estimate the value function. The softmax function can regulate the gap between the value function and the optimal value, reduce the frequency of converging to local optimal solutions, and decrease the sensitivity of the algorithm to its initialization parameters34. The target Q value (yt) can be expressed as:

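$$y_t = r(s_t, a_t) + \gamma\,\mathrm{softmax}\left(Q(s_{t+1}, a_{t+1})\right) \tag{25}$$
$$\mathrm{softmax}\left(Q(s_{t+1}, a_{t+1})\right) = \frac{E\left[\exp\left(\beta Q(s_{t+1}, a_{t+1})\right) Q(s_{t+1}, a_{t+1}) / p(a_{t+1})\right]}{E\left[\exp\left(\beta Q(s_{t+1}, a_{t+1})\right) / p(a_{t+1})\right]} \tag{26}$$
$$Q(s_{t+1}, a_{t+1}) = \min\left(Q'_{w_1}, Q'_{w_2}\right) \tag{27}$$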

Here, Q(st+1, at+1) represents the minimum value of Q'w obtained from the two critic target networks; p(·) is the probability density function of the Gaussian distribution; β is the parameter in the softmax function; γ is the reward discount factor.

The action at+1 is calculated by the Actor target network πθ′ as at+1 = πθ′(st+1) + ε, where ε is added noise drawn from a normal distribution. The parameters w of Qw are updated according to the gradient rule w = w − μw∇wLoss(w), where μw is the learning rate of the critic network. The parameters w′ of the critic target networks are updated as w′ = τw + (1 − τ)w′.
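The target computation of Eqs. (25)-(27) and the soft target update can be sketched in plain Python. This is a minimal sketch under stated assumptions: the expectations in Eq. (26) are replaced by averages over a finite set of sampled next actions, the function names are hypothetical, and the default softmax parameter β = 1 is an arbitrary illustrative value. The defaults γ = 0.9 and τ = 0.001 follow the reported configuration.

```python
import math


def softmax_value(q_values, densities, beta):
    """Sample-based estimate of Eq. (26) over a set of sampled next actions."""
    q_max = max(q_values)  # shift for numerical stability; it cancels in the ratio
    weights = [math.exp(beta * (q - q_max)) / p for q, p in zip(q_values, densities)]
    return sum(w * q for w, q in zip(weights, q_values)) / sum(weights)


def td_target(reward, q1_next, q2_next, densities, gamma=0.9, beta=1.0):
    """Eqs. (25) and (27): clipped double-Q minimum followed by the softmax operator."""
    q_min = [min(a, b) for a, b in zip(q1_next, q2_next)]
    return reward + gamma * softmax_value(q_min, densities, beta)


def soft_update(target, online, tau=0.001):
    """Target parameters move slowly toward the online ones: w' = tau*w + (1 - tau)*w'."""
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(online, target)]
```

With β = 0 and equal densities the operator reduces to a plain average of the sampled Q values, while large β approaches their maximum, which is how β regulates the gap between the estimated and the optimal value.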

Agent training based on knowledge fusion

Knowledge fusion involves merging prior domain expertise with deep learning methodologies to enhance model performance and interpretability. This section presents an improved strategy that integrates relevant knowledge into the training of an agent within a data-driven approach, employing knowledge constraints to guide the agent’s exploration space.

The S-TD3 algorithm, a form of DRL, involves the agent interacting with a simulation environment, generating samples subsequently placed into an experience replay for training35,36. However, during the initial training phase, the agent is randomly initialized, posing a challenge for the agent to produce high-quality samples during interaction. This challenge results in a slow convergence of the agent toward an approximately optimal decision. To expedite the convergence speed of the algorithm, as depicted in Figure 3, knowledge rules related to SSO analysis are integrated into the decision-making process of the intelligent agent.

In instances of SSOs, the model of the actually collected measurement signals can be represented as Equation (28)37:

$$x(n) = A_1\cos(2\pi f_1 n + \phi_1) + \sum_{i=1}^{t} A_{is}\, e^{\zeta_{is} n}\cos(2\pi f_{is} n + \phi_{is}) \tag{28}$$

In Eq. (28), x(n) consists of the fundamental frequency component and the SSO component. A1, f1, and ϕ1 represent the amplitude, frequency, and phase of the fundamental frequency component, while Ais, fis, and ϕis represent the amplitude, frequency, and phase of the SSO component, and ζis is the damping ratio of the SSO component.
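To make the signal model of Eq. (28) concrete, the following plain-Python sketch synthesises a measurement signal from a fundamental component and a list of SSO components. The function name and the sampling frequency `fs` are hypothetical; converting the sample index n to time as n / fs is an assumption, since the text writes the model directly in terms of n.

```python
import math


def sso_signal(n_samples, fs, a1, f1, phi1, modes):
    """Synthesise Eq. (28).

    modes: list of (A_is, f_is, phi_is, zeta_is) tuples, one per SSO component.
    Assumption: sample index n corresponds to time n / fs seconds.
    """
    signal = []
    for n in range(n_samples):
        t = n / fs
        x = a1 * math.cos(2.0 * math.pi * f1 * t + phi1)  # fundamental component
        for amp, freq, phase, zeta in modes:
            # exponentially weighted sub-synchronous component
            x += amp * math.exp(zeta * t) * math.cos(2.0 * math.pi * freq * t + phase)
        signal.append(x)
    return signal
```

A signal generated this way can serve as a known-answer input when checking an identification routine such as the Prony algorithm.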

The theory of SSO analysis in power systems indicates that the damping ratio quantitatively reflects the stability of the system. During SSOs, the damping ratio is negative, and the system’s stability diminishes as the damping ratio decreases. The objective of damping control is to maximize the damping ratio of the system, with the objective function defined as Equation (29):

$$J = \min(\zeta_{is}), \qquad \max J \tag{29}$$

The Prony algorithm is employed to identify sub-synchronous frequency components at grid-connected nodes following SSCI. The computational steps of the Prony identification algorithm are detailed in the literature38, offering insights into SSO parameters such as amplitude, frequency, phase, and damping ratio.

This paper categorizes the experience replay into two sets: Msuccess for successful experiences and Mfailure for failure experiences. Experiences from both successful and failed control actions are recorded using the identified damping ratio as a reference. Throughout the time-domain simulation process, the damping ratio under action at is compared with that under action at-1. If ζ(at) < ζ(at-1), the action at is not conducive to stability and is assigned to the failure experience replay. Conversely, if ζ(at) is greater than ζ(at-1), the action at contributes to system stability and is assigned to the success experience replay. The sampling quantities for the success and failure experience replays are given by Eq. (30):

$$M_{failure} = \beta m, \qquad M_{success} = (1-\beta)\, m \tag{30}$$

In Eq. (30), m represents the total number of training samples extracted from the experience replay; Mfailure and Msuccess denote the sampling quantities of failure and success experience samples, respectively; β is the sampling rate for the failure experience replay.
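The routing rule and the split of Eq. (30) can be sketched as a small two-buffer container. The class name `DualReplay`, the default capacity, the rounding of βm to an integer, and the handling of a tie ζ(a_t) = ζ(a_{t-1}) as a failure are assumptions of this example; the text does not specify them.

```python
import random
from collections import deque

class DualReplay:
    """Two replay buffers filled according to the change in damping ratio."""

    def __init__(self, capacity=64, beta=0.8, seed=None):
        self.success = deque(maxlen=capacity)
        self.failure = deque(maxlen=capacity)
        self.beta = beta                      # sampling rate of the failure buffer
        self.rng = random.Random(seed)

    def store(self, transition, zeta_now, zeta_prev):
        # A larger damping ratio than at the previous step counts as success.
        # Treating a tie as failure is an assumption of this sketch.
        target = self.success if zeta_now > zeta_prev else self.failure
        target.append(transition)

    def batch_sizes(self, m):
        # Eq. (30): beta*m failure samples, the remainder from the success buffer.
        n_fail = round(self.beta * m)
        return n_fail, m - n_fail

    def sample(self, m):
        n_fail, n_succ = self.batch_sizes(m)
        # Never request more samples than a buffer currently holds.
        fail = self.rng.sample(list(self.failure), min(n_fail, len(self.failure)))
        succ = self.rng.sample(list(self.success), min(n_succ, len(self.success)))
        return fail, succ
```

With β = 0.8 and m = 10, `batch_sizes` returns 8 failure samples and 2 success samples. The success part is drawn uniformly here for brevity.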

In Msuccess, the goal is to extract experiences of higher value, hence the implementation of prioritized experience replay. The data priority, denoted by p, is assessed through the TD error. The priority of the k-th sample is determined by Eq. (31):

$$p_k = \left|\delta_k\right| = \left|y_t^{k} - Q\left(s_t^{k}, a_t^{k}\right)\right| \tag{31}$$

The probability of sampling the k-th sample is determined by Equation (32):

$$P(k) = \frac{p_k}{\sum_{n=1}^{M_{success}} p_n} \tag{32}$$
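A minimal sketch of the prioritized sampling in Eqs. (31) and (32) follows: the priority of a sample is the magnitude of its TD error, and the sampling probability is that priority normalized over the success buffer. The function names and the uniform fallback when every TD error is zero are assumptions of this example.

```python
import random

def sampling_probabilities(td_errors):
    """Eq. (31)-(32): priority = |TD error|, probability = priority / sum."""
    priorities = [abs(d) for d in td_errors]
    total = sum(priorities)
    if total == 0:
        # Assumption: fall back to uniform sampling if all TD errors vanish.
        return [1.0 / len(priorities)] * len(priorities)
    return [p / total for p in priorities]

def sample_indices(td_errors, k, rng=random):
    """Draw k indices (with replacement) according to the probabilities above."""
    probs = sampling_probabilities(td_errors)
    return rng.choices(range(len(td_errors)), weights=probs, k=k)
```

For TD errors of 1.0 and −3.0, the two samples are drawn with probabilities 0.25 and 0.75, so the transition with the larger error is replayed three times as often on average.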

Case study and experimental design

This paper focuses on the system model established in reference39 and constructs an equivalent system simulation model using the SIMULINK platform, as illustrated in Figure 1.

Each training episode lasts 5 seconds, with each step set to 0.0001 seconds in the configured improved TD3 agent. Diverse samples of operating conditions are selected for training to enhance the agent’s adaptability to the operating conditions of the DFIG-based wind farm grid-connected system. The system’s dynamic nature is captured by incorporating uncertain external parameters affecting SSOs, such as wind farm wind speed, control parameters, the number of WTs, active/reactive power output, series compensation degree, and transmission line parameters. The agent is trained using multiple samples generated by varying these parameters.

The configuration of the improved TD3 algorithm is detailed as follows: the critic network consists of three hidden layers with 128, 200, and 200 neurons, respectively. The Actor network comprises two hidden layers with 128 and 200 neurons each. Throughout the training process, hyperparameters are continuously fine-tuned, and the final selections are a critic network learning rate μω of 0.001, an Actor network learning rate μθ of 0.0001, a successful experience replay (Msuccess) capacity of 64, a failure experience replay (Mfailure) capacity of 64, a discount factor γ of 0.9, a soft update coefficient of 0.001, a sampling rate for the failure experience replay β of 0.8, and a noise standard deviation ε of 0.3. The reward index coefficients are set as follows: λ1 is 0.7, and λ2 is 0.3. The surrogate model employs a kernel function with a parameter σ of 3, and the regularization coefficient λ is set to 0.4.
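For reference, the reported settings can be gathered in one configuration object. This is only a convenience sketch: the class and field names are hypothetical, while the values are the ones listed above. The derived step count assumes that each 0.0001 s step is one step of a 5 s episode.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ImprovedTD3Config:
    critic_hidden: tuple = (128, 200, 200)   # critic hidden-layer sizes
    actor_hidden: tuple = (128, 200)         # actor hidden-layer sizes
    critic_lr: float = 1e-3                  # learning rate of the critic
    actor_lr: float = 1e-4                   # learning rate of the actor
    success_capacity: int = 64               # capacity of Msuccess
    failure_capacity: int = 64               # capacity of Mfailure
    gamma: float = 0.9                       # discount factor
    tau: float = 1e-3                        # soft update coefficient
    beta: float = 0.8                        # failure-replay sampling rate
    noise_std: float = 0.3                   # exploration noise standard deviation
    lambda1: float = 0.7                     # reward index coefficient 1
    lambda2: float = 0.3                     # reward index coefficient 2
    kernel_sigma: float = 3.0                # surrogate-model kernel parameter
    ridge_lambda: float = 0.4                # surrogate-model regularization
    episode_seconds: float = 5.0
    step_seconds: float = 1e-4

    @property
    def steps_per_episode(self) -> int:
        # 5 s / 0.0001 s, rounded to absorb floating-point error.
        return round(self.episode_seconds / self.step_seconds)
```

Under that reading, one episode spans 50,000 steps.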

Results

Performance of I-SSDC under various operating conditions

During the operation of the wind power transmission system, uncertain operating parameters undergo continuous random variations. It is imperative that the I-SSDC ensures real-time stability under these conditions.

Utilizing the acquired system participation factors, a regression model is developed. A local surrogate model is fitted to the samples in its current operational neighborhood, offering insights into the significance of each feature under the prevailing conditions, as depicted in Figure 4. The sub-synchronous oscillation mode exhibits a pronounced correlation with the stator power of the DFIG-based WT. Consequently, the power loops of the RSC control are selected for sub-synchronous oscillation suppression.

Figure 4. Influence weights of RSC control electrical quantities.

To assess the real-time dynamic control efficacy of I-SSDC under varying system operating conditions, the initial operating state of the wind farm is characterized by a wind speed (vwind) of 9 m/s, DFIG output active power (P) of 0.5 pu, and DFIG output reactive power (Qe) of 0. At t = 0.2 s, the series compensation degree (Kc) is increased to 40%, and at t = 1 s, the reactive power (Qe) changes to −0.5 pu. Figure 5 illustrates the time-domain waveform of the wind power plant’s active power output.

Figure 5. Suppression effect of the I-SSDC as the operating state changes over time.

As depicted in Fig. 5a, the active power oscillation diverges after 0.2 s because of the increased series compensation degree. The oscillation amplitude is further exacerbated by the change in output reactive power after 1 s, which leads to the superposition of oscillation frequencies. Figure 5b illustrates that, with the I-SSDC involved in the control process, the amplitude of the active power oscillation decreases substantially following the increase in series compensation degree and remains relatively stable after 0.45 s. When the output reactive power changes after 1 s, the I-SSDC adaptively updates according to system variations, and the active power returns to its previous value once the oscillation subsides after 1.4 s. The oscillation frequency of the active power fluctuates after 1.2 s.

When the operating state undergoes a sudden change, the adaptive updating capability of the I-SSDC controller leads to a rapid decrease in the amplitude of the generator’s output active power oscillations. Once the oscillations are suppressed, the output active power returns to its previous value. This ensures the stable operation of the system under varying conditions. As the system operating point continues to change, the I-SSDC demonstrates excellent dynamic tracking control capability, successfully suppressing each disturbance-triggered SSCI consistently.

Analysis of I-SSDC adaptability across a wide operating range

To assess the enhanced adaptability of I-SSDC, a comparative analysis with the traditional SSDC (T-SSDC), which is based on the phase compensation principle9,10, is conducted across a broad operating range. The controllers' efficacy is evaluated under diverse operating conditions defined by wind speed (Vwind), DFIG active power output (P), DFIG reactive power output (Qe), controller parameter (Id-kp), and series compensation degree (Kc). A stable state is taken as the baseline operating point A, which serves as the reference for the parameter adjustments that yield operating points B to I. Each operating point represents a different level of system stability. Points B to E modify a single operating parameter, points F to H alter multiple operating parameters concurrently, and point I is an operating point not used in training. These points encompass a wide range of SSO frequencies due to substantial variations in operating parameters, posing a considerable challenge to the controller’s suppression capabilities. The operating conditions for each point are detailed in Table 1. The comparative evaluation between I-SSDC and T-SSDC at points B to I is presented in Fig. 6.

Table 1.

Operating conditions of each operating point.

Operating point   Vwind (m/s)   P (pu)   Qe (pu)   Id-kp   Kc
A                 11            0.74     0         0.01    10%
B                 7             0.29     0         0.01    10%
C                 11            0.74     0         0.01    90%
D                 11            0.74     0         0.1     10%
E                 11            0.74     −0.5      0.01    10%
F                 7             0.29     −0.3      0.01    10%
G                 9             0.50     0         0.01    50%
H                 11            0.74     −0.3      0.05    50%
I                 12            0.88     −0.3      0.01    50%
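The grouping of operating points into single-parameter and multi-parameter changes can be checked against Table 1 with a short sketch. The dictionary keys and the helper `changed_parameters` are hypothetical names. The active power P is ignored when counting changes, on the assumption that it simply follows the wind speed; in Table 1 it changes only when Vwind does.

```python
# Table 1 encoded as plain data; Kc is stored as a fraction (10% -> 0.10).
OPERATING_POINTS = {
    "A": {"v_wind": 11, "p": 0.74, "q": 0.0, "id_kp": 0.01, "kc": 0.10},
    "B": {"v_wind": 7, "p": 0.29, "q": 0.0, "id_kp": 0.01, "kc": 0.10},
    "C": {"v_wind": 11, "p": 0.74, "q": 0.0, "id_kp": 0.01, "kc": 0.90},
    "D": {"v_wind": 11, "p": 0.74, "q": 0.0, "id_kp": 0.10, "kc": 0.10},
    "E": {"v_wind": 11, "p": 0.74, "q": -0.5, "id_kp": 0.01, "kc": 0.10},
    "F": {"v_wind": 7, "p": 0.29, "q": -0.3, "id_kp": 0.01, "kc": 0.10},
    "G": {"v_wind": 9, "p": 0.50, "q": 0.0, "id_kp": 0.01, "kc": 0.50},
    "H": {"v_wind": 11, "p": 0.74, "q": -0.3, "id_kp": 0.05, "kc": 0.50},
    "I": {"v_wind": 12, "p": 0.88, "q": -0.3, "id_kp": 0.01, "kc": 0.50},
}

def changed_parameters(point, baseline="A", ignore=("p",)):
    """List the parameters in which `point` differs from the baseline.

    `ignore` defaults to the active power, assumed here to depend on wind speed.
    """
    base = OPERATING_POINTS[baseline]
    current = OPERATING_POINTS[point]
    return [k for k in current if k not in ignore and current[k] != base[k]]
```

Under that assumption, points B to E each differ from A in exactly one parameter, while F, G, and H differ in two or three.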

Figure 6. Suppression effect of the I-SSDC across a wide operating range.

As shown in Fig. 6a, operating points B–E alter the wind speed, series compensation degree, controller parameter, and output reactive power of the system, respectively. After 0.4 s, the system oscillates at a single frequency. Operating points F–H exhibit multiple superimposed oscillation frequencies due to changes in multiple operating parameters, resulting in higher oscillation amplitudes. From Fig. 6b, it is evident that after the system begins to oscillate, the oscillation amplitude at operating points B–E starts to decrease due to the involvement of the T-SSDC in the control. A significant reduction in oscillation amplitude is observed after 0.2 s of control participation, with the system returning to stability after 1.2 s. The speed of oscillation damping varies slightly among operating points B–E due to differing initial amplitudes. For operating points F–H, the system state changes after 0.4 s, and the oscillation amplitude decreases only gradually. Even after 2 s, the oscillation is not entirely quelled, with the worst suppression effect at operating point H. As depicted in Fig. 6c, under I-SSDC control the oscillations at operating points B–H subside within 0.6 s of the system state change, with a faster decrease in amplitude.

While T-SSDC succeeds in restoring system stability at operating points B to E, the suppression process is time-consuming, and there is a notable initial oscillation amplitude during control. At operating points F to H, however, SSCI is inadequately suppressed, indicating the limited adaptability of an SSDC with fixed parameters.

I-SSDC adeptly suppresses SSO at operating points B to I, exhibiting faster convergence and less overshoot compared to T-SSDC. Notably, the suppression strategy remains effective even at operating point I, which was not part of the training samples, showcasing the robustness of I-SSDC. In summary, the comparison across diverse operating conditions demonstrates the superior adaptability of the proposed I-SSDC relative to T-SSDC.

Conclusion

This paper addresses SSO in a DFIG-based wind farm grid-connected system by introducing an intelligent damping controller that amalgamates knowledge with improved TD3, ensuring the secure and stable operation of the power grid. Deep deterministic policies govern decisions regarding the control variable, particularly additional damping control, for action exploration. The Softmax operation enhances the accuracy of the trained model. A surrogate model, constructed using weighted linear regression, is employed. Relevant rules are synthesized by thoroughly examining the intricate and variable operating conditions. The selection principles for the output signal of I-SSDC are delineated, enhancing engineering practicality and interpretability. To further enhance the agent’s performance, the damping ratio, a key stability indicator, is introduced through knowledge- and experience-guided strategies, which improves decision-making efficiency and reduces the number of decision iterations. Experimental results demonstrate the method’s robust adaptability to various operating conditions, effectively suppressing SSOs across variable and wide operating ranges. It holds promise as an advanced control strategy for ensuring the safe and efficient operation of DFIGs.

Future research efforts will focus on refining the proposed DRL-based SSO damping control framework, integrating more specific operational experience to guide intelligent agent training, and leveraging increased computational power for validation in large-scale practical power grids.

Abbreviations

DFIG: Doubly fed induction generator
SSO: Sub-synchronous oscillation
I-SSDC: Intelligent sub-synchronous damping controller
SSDC: Sub-synchronous damping controller
DRL: Deep reinforcement learning
TD3: Twin delayed deep deterministic policy gradient
SSR: Sub-synchronous resonance
WT: Wind turbine
WTG: Wind turbine generator
RSC: Rotor side converter
GSC: Grid side converter
MPPT: Maximum power point tracking
PI: Proportional-integral
SSCI: Sub-synchronous control interaction
VSC-HVDC: Voltage source converter high voltage direct current
RL: Reinforcement learning
MDP: Markov decision process
LSTM: Long short-term memory
Prony: Prony’s method

Author contributions

Ge Liu, Jun Liu, and Andong Liu contributed to the conception and design of the study. Ge Liu organized the database. Jun Liu performed the statistical analysis. Andong Liu wrote the first draft of the manuscript. Ge Liu, Jun Liu, and Andong Liu wrote sections of the manuscript. All authors contributed to manuscript revision, and read and approved the submitted version.

Data availability

All data generated or analysed during this study are included in this published article and its supplementary information files.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-024-65372-y.

References

1. Meng F, Sun D, Zhou K, et al. A sub-synchronous oscillation suppression strategy for doubly fed wind power generation system. IEEE Access. 2021;9:83482–83498. doi: 10.1109/ACCESS.2021.3087638.
2. Tan A, Tang Z, Sun X, et al. Genetic algorithm-based analysis of the effects of an additional damping controller for a doubly fed induction generator. J. Electr. Eng. Technol. 2020;15(4):1585–1593. doi: 10.1007/s42835-020-00440-7.
3. Adams, J. Carter, C. & Huang, S. H. ERCOT experience with sub-synchronous control interaction and proposed remediation. In IEEE Power Eng. Soc. Trans. and Dist. Conf. and Expo.. 10.1109/TDC.2012.6281678 (2012).
4. Narendra, K. Fedirchuk, D. Midence, R. et al. New microprocessor based relay to monitor and protect power systems against sub-harmonics. In IEEE Proc. of Elect. Power Energy and Conf. (EPEC), Winnipeg, 438–443, 10.1109/EPEC.2011.6070241 (2011).
5. Fan L, Miao Z. Modeling and Analysis of Doubly Fed Induction Generator wind Energy Systems. Academic Press; 2015.
6. Leon AE, Solsona JA. Sub-synchronous interaction damping control for DFIG wind turbines. IEEE Trans. Power Syst. 2014;30(1):419–428. doi: 10.1109/TPWRS.2014.2327197.
7. Xie X, Zhang X, Liu H, et al. Characteristic analysis of subsynchronous resonance in practical wind farms connected to series-compensated transmissions. IEEE Trans. on Energy Conv. 2017;32(3):1117–1126. doi: 10.1109/TEC.2017.2676024.
8. Shair J, Xie X, Yan G. Mitigating subsynchronous control interaction in wind power systems: Existing techniques and open challenges. Renew. Sust. Energy Rev. 2019;108:330–346. doi: 10.1016/j.rser.2019.04.003.
9. Abdeen M, Li H, Mohamed MAEH, Kamel S, et al. Sub-synchronous interaction damping controller for a series-compensated DFIG-based wind farm. IET Renew. Power Gen. 2022;16(5):933–944. doi: 10.1049/rpg2.12400.
10. Yao J, Wang X, Li J, Liu R, Zhang H. Sub-synchronous resonance damping control for series-compensated DFIG-based wind farm with improved particle swarm optimization algorithm. IEEE Trans. Energy Conv. 2018;34(2):849–859. doi: 10.1109/TEC.2018.2872841.
11. Perera U, Oo AMT, Zamora R. Sub synchronous oscillations under high penetration of renewables—A review of existing monitoring and damping methods, challenges, and research prospects. Energies. 2022;15(22):8477. doi: 10.3390/en15228477.
12. Wang T, Yang J, Padhee M, et al. Robust, coordinated control of SSO in wind-integrated power system. IET Renew. Power Gen. 2020;14(6):1031–1043. doi: 10.1049/iet-rpg.2019.0410.
13. Saleem B, Badar R, Manzoor A, et al. Fully adaptive recurrent Neuro-fuzzy control for power system stability enhancement in Multi Machine System. IEEE Access. 2022;10:36464–36476. doi: 10.1109/ACCESS.2022.3164455.
14. Yang H, Xu S, Wu X, et al. Sub-synchronous oscillation mitigation strategy based on adaptive band-stop filter in DFIG-based wind farms. J. Phys. Conf. Ser. 2020;1639(1):012086. doi: 10.1088/1742-6596/1639/1/012086.
15. Slootweg JG, Kling WL. The impact of large scale wind power generation on power system oscillations. Elect. Power Syst. Res. 2003;67(1):9–20. doi: 10.1016/S0378-7796(03)00089-0.
16. Falehi AD. An innovative optimal RPO-FOSMC based on multi-objective grasshopper optimization algorithm for DFIG-based wind turbine to augment MPPT and FRT capabilities. Chaos Solitons Fractals. 2020;130:109407. doi: 10.1016/j.chaos.2019.109407.
17. Darvish Falehi A, Rafiee M. Optimal control of novel fuel cell-based DVR using ANFISC-MOSSA to increase FRT capability of DFIG-wind turbine. Soft Comput. 2019;23:6633–6655. doi: 10.1007/s00500-018-3312-9.
18. Darvish FA. Optimal power tracking of DFIG-based wind turbine using MOGWO-based fractional-order sliding mode controller. J. Solar Energy Eng. 2020;142(3):031004. doi: 10.1115/1.4044977.
19. Darvish FA. An innovative OANF–IPFC based on MOGWO to enhance participation of DFIG-based wind turbine in interconnected reconstructed power system: An innovative OANF–IPFC based on MOGWO to enhance participation of DFIG-based wind turbine. Soft Comput. 2019;23(23):12911–12927. doi: 10.1007/s00500-019-03848-0.
20. Mnih V, Kavukcuoglu K, Silver D, et al. Human-level control through deep reinforcement learning. Nature. 2015;518(7540):529–533. doi: 10.1038/nature14236.
21. Mocanu E, Mocanu DC, Nguyen PH, et al. On-line building energy optimization using deep reinforcement learning. IEEE Trans. Smart Grid. 2019;10(4):3698–3708. doi: 10.1109/TSG.2018.2834219.
22. Shen Y, Yao W, Wen J, He H, Jiang L. Resilient wide-area damping control using GRHDP to tolerate communication failures. IEEE Trans. Smart Grid. 2019;10(3):2547–2557. doi: 10.1109/TSG.2018.2803822.
23. Hashmy Y, Yu Z, Shi D, et al. Wide area measurement system-based low frequency oscillation damping control through reinforcement learning. IEEE Trans. Smart Grid. 2020;11(6):5072–5083. doi: 10.1109/TSG.2020.3008364.
24. Moheb AM, et al. Consolidation of LVFRT capabilities of microgrids using energy storage devices. Sci. Rep. 2023;13:22294. doi: 10.1038/s41598-023-49659-0.
25. Jin JX, et al. Combined low voltage ride through and power smoothing control for DFIG/PMSG hybrid wind energy conversion system employing a SMES-based AC–DC unified power quality conditioner. Int. J. Electr. Power Energy Syst. 2021;128:106733. doi: 10.1016/j.ijepes.2020.106733.
26. Miao Z. Impedance-model-based SSR analysis for type 3 wind generator and series-compensated network. IEEE Trans. Energy Conv. 2012;27(4):984–991. doi: 10.1109/TEC.2012.2211019.
27. Ma Y, Zhu D, Zou X, et al. Transient characteristics and quantitative analysis of electromotive force for DFIG-based wind turbines during grid faults. Chin. J. Electr. Eng. 2022;8(2):3–12. doi: 10.23919/CJEE.2022.000010.
28. Alam MS, Al-Ismail FS, Salem A, et al. High-level penetration of renewable energy sources into grid utility: Challenges and solutions. IEEE Access. 2020;8:190277–190299. doi: 10.1109/ACCESS.2020.3031481.
29. Fan LL, Zhu CX, Miao ZX, et al. Modal analysis of a DFIG-based wind farm interfaced with a series compensated network. IEEE Trans. Energy Conv. 2011;26(4):1010–1020. doi: 10.1109/TEC.2011.2160995.
30. Du W, Chen X, Fu Q, Wang H, Littler T. Sub-synchronous oscillations caused by grid-connected PMSGs under the condition of near strong open-loop modal resonance. Electr. Power Comput. Syst. 2019;47(19–20):1731–1743. doi: 10.1080/15325008.2020.1731866.
31. Du W, Chen X, Wang H. PLL-induced modal resonance of grid-connected PMSGs with the power system electromechanical oscillation modes. IEEE Trans. Sustain. Energy. 2017;8(4):1581–1591. doi: 10.1109/TSTE.2017.2695563.
32. Uthayopas K, et al. TSMDA: Target and symptom-based computational model for miRNA-disease-association prediction. Mol. Ther. Nucl. Acids. 2021;26:536–546. doi: 10.1016/j.omtn.2021.08.016.
33. Ali S, Khan A, Shah K, Alqudah MA, Abdeljawad T. On computational analysis of highly nonlinear model addressing real world applications. Results Phys. 2022;36:105431. doi: 10.1016/j.rinp.2022.105431.
34. Zheng Z, Cao Z, Deng H, et al. Searching for double-line spectroscopic binaries in the LAMOST medium-resolution spectroscopic survey with deep learning. Astrophys. J. Suppl. Ser. 2023;266(2):18. doi: 10.3847/1538-4365/acc94e.
35. Hu Q, Xiong Y, Liu C, et al. Transient Stability analysis of direct drive wind turbine in DC-link voltage control timescale during grid fault. Processes. 2022;10(4):774. doi: 10.3390/pr10040774.
36. Sun D, Meng F, Shen W. Study on suppression strategy for broadband sub-synchronous oscillation in doubly-fed wind power generation system. Appl. Sci. 2022;12(16):8344. doi: 10.3390/app12168344.
37. Yang X, Zhang J, Xie X, et al. Interpolated DFT-based identification of sub-synchronous oscillation parameters using synchrophasor data. IEEE Trans. Smart Grid. 2019;11(3):2662–2675. doi: 10.1109/TSG.2019.2959811.
38. Wang Y, Zhang H, Lu F. Capacitive power transfer with series-parallel compensation for step-up voltage output. IEEE Trans. Ind. Electron. 2021;69(6):5604–5614. doi: 10.1109/TIE.2021.3091925.
39. Jiang H, Song R, Du N, et al. Application of UPFC to mitigate SSR in series-compensated wind farms. J. Eng. 2019;2019(16):2505–2509. doi: 10.1049/joe.2018.8533.


Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
