Sensors (Basel, Switzerland). 2026 Feb 28;26(5):1529. doi: 10.3390/s26051529

Adaptive Predefined-Time Tracking Control for Robotic Manipulator Based on Actor-Critic Reinforcement Learning

Yong Qin 1, Yuan Sun 2,*, Jun Huang 2, Yankai Li 3,*
Editors: Sungkeun Yoo, Taewon Seo, Taegyun Kim
PMCID: PMC12987290  PMID: 41829491

Abstract

This paper proposes a novel predefined-time adaptive neural tracking control method for uncertain manipulator systems based on an Actor-Critic reinforcement learning framework. The proposed control scheme integrates the advantages of predefined-time stability theory and reinforcement learning to achieve fast convergence with guaranteed settling time bounds while handling unknown system dynamics. An Actor neural network is designed to approximate the unknown nonlinear functions and generate control inputs, while a Critic neural network evaluates the cost-to-go function to guide the learning process. The predefined-time convergence is ensured by incorporating specially designed terms into both the control law and the neural network weight update laws. The upper bound of the settling time can be explicitly preset by a single design parameter, independent of initial conditions and system parameters. Rigorous stability analysis based on Lyapunov theory proves that all closed-loop signals are bounded and that the tracking error converges to a small neighborhood of the origin within the predefined time. Simulation results on a single-link manipulator system demonstrate the effectiveness and superiority of the proposed control scheme compared with conventional PID control.

Keywords: predefined-time control, actor-critic reinforcement learning, adaptive neural network control, backstepping control

1. Introduction

Robotic manipulators have been extensively deployed in industrial manufacturing, medical surgery, space exploration, and military applications due to their high flexibility, precision, repeatability, and efficiency [1,2]. In these applications, the control system must achieve accurate trajectory tracking while adapting to varying operating conditions and task requirements. However, the control of robotic manipulators remains challenging due to their inherent nonlinearities arising from trigonometric functions in the dynamic equations, strong coupling effects between joints, and inevitable uncertainties stemming from unmodeled dynamics, parameter variations, friction, and external disturbances [3]. Therefore, developing advanced control strategies that simultaneously guarantee tracking performance, fast convergence, and robustness against uncertainties has become a critical research topic in the field of robotics and control engineering.

To address the challenges posed by system uncertainties, numerous advanced control strategies have been developed over the past decades. Adaptive control provides an effective approach to handle parametric uncertainties through online parameter estimation, enabling the controller to adjust its parameters in real-time based on system behavior [4,5]. Neural network (NN)-based control has gained significant attention for its universal approximation capability, which allows it to deal with unknown nonlinear functions without requiring explicit mathematical models [6]. The combination of adaptive control and neural networks, known as adaptive neural network control, has demonstrated excellent performance in handling both parametric and functional uncertainties, and has been successfully applied to various robotic systems [7,8]. Despite these advances, most existing adaptive neural control methods only guarantee asymptotic or exponential convergence, where the settling time depends on initial conditions and system parameters, which may not satisfy the strict timing requirements in practical applications.

In practical robotic applications, fast convergence is often a critical requirement, particularly in time-critical tasks such as assembly operations, surgical procedures, and emergency response scenarios. To achieve convergence in finite time, finite-time control and fixed-time control have been developed based on nonsmooth Lyapunov analysis [9,10]. Finite-time control ensures that the system states converge to the equilibrium within a finite settling time, but this settling time depends on initial conditions, making it difficult to predict or prescribe in advance. Fixed-time control addresses this limitation by ensuring that the settling time is bounded regardless of initial conditions [3]. However, the relationship between the settling time bound and control parameters in fixed-time control is implicit and complex, typically involving multiple design parameters in a nonlinear manner, which complicates the controller tuning process for achieving desired convergence speed.

Recently, predefined-time control has emerged as a promising approach that allows designers to explicitly preset the upper bound of the settling time through a single design parameter [11,12,13]. This feature is particularly attractive for applications with strict timing requirements, as the maximum convergence time can be directly specified according to task demands without complex parameter calculations. Several predefined-time control schemes have been proposed for various systems including rigid spacecraft attitude stabilization [14] and robotic manipulators [15]. However, most existing predefined-time control methods require accurate system models or assume that the system uncertainties are bounded with known bounds, which significantly limits their practical applicability to real-world robotic systems where model parameters are often unknown or time-varying.

On the other hand, reinforcement learning (RL) has shown great potential in control applications due to its ability to learn optimal control policies through interaction with the environment without requiring explicit system models [16,17]. Among various RL architectures, the Actor-Critic (AC) framework is particularly well-suited for continuous control problems, where the Actor network generates control actions and the Critic network evaluates the performance by estimating the value function or cost-to-go [18,19]. The combination of Actor-Critic reinforcement learning and neural network approximation has been successfully applied to various robotic control problems, demonstrating improved adaptability and optimality compared to conventional methods [20,21]. The Actor-Critic structure offers several advantages: the Critic provides a global performance metric for guiding the Actor’s learning, the dual-network architecture separates policy evaluation from policy improvement for enhanced learning efficiency, and the framework naturally accommodates online learning in real-time control scenarios.

Despite the significant progress in each individual area, there remains a gap in the literature regarding the unified treatment of predefined-time convergence, adaptive learning capability, and optimal control for uncertain robotic systems. Most existing predefined-time control methods lack the ability to handle unknown nonlinearities adaptively, while conventional adaptive neural control schemes cannot guarantee predefined-time convergence. The integration of predefined-time stability with Actor-Critic reinforcement learning presents unique theoretical challenges: the predefined-time convergence mechanism must be incorporated into both the control law and the neural network weight update laws in a compatible manner, and the stability analysis must account for the coupled dynamics of tracking errors and weight estimation errors within the predefined-time framework. To the best of the authors’ knowledge, the problem of predefined-time adaptive neural control using Actor-Critic reinforcement learning for robotic manipulators has not been adequately addressed in the existing literature.

Motivated by the above observations, this paper proposes a novel predefined-time adaptive neural tracking control scheme for uncertain single-link manipulator systems based on the Actor-Critic reinforcement learning framework. The main contributions of this paper are summarized as follows:

  • A novel control framework that synergistically integrates predefined-time stability theory with Actor-Critic reinforcement learning is proposed. The Actor neural network approximates unknown system dynamics and generates control inputs, while the Critic neural network evaluates the cost-to-go function to guide the learning process, achieving both guaranteed convergence time and online learning capability.

  • Predefined-time neural network weight update laws are designed with specially constructed terms that incorporate the predefined-time convergence mechanism. These update laws ensure the convergence of both tracking errors and weight estimation errors within the predefined time while maintaining the learning and approximation capabilities of the neural networks.

  • The upper bound of the settling time can be explicitly preset by a single design parameter $T_c$, satisfying $T_P < \sqrt{2}T_c$, which is independent of initial conditions and system parameters. This explicit relationship between the design parameter and settling time bound greatly simplifies the controller design process for applications with specific timing requirements.

The remainder of this paper is organized as follows. Section 2 presents the single-link manipulator system model, introduces the Actor-Critic neural network framework, and provides necessary mathematical preliminaries including the predefined-time stability lemma. Section 3 details the controller design procedure, including the predefined-time virtual controller, the Actor-Critic reinforcement learning controller, and the predefined-time weight update laws. Section 4 provides the rigorous stability analysis based on Lyapunov theory. Section 5 presents comprehensive simulation results to validate the effectiveness and superiority of the proposed control scheme. Finally, Section 6 concludes the paper and discusses future research directions.

2. Preliminaries and Problem Formulation

2.1. System Model

Consider a single-link robotic manipulator system described by the following dynamic equation:

$I\ddot{q} + c_0\dot{q} + mgl\cos(q) = \tau + d$ (1)

where $q \in \mathbb{R}$ denotes the joint angle, $\dot{q} \in \mathbb{R}$ is the angular velocity, $\ddot{q} \in \mathbb{R}$ is the angular acceleration, $\tau \in \mathbb{R}$ represents the control torque, $I = \frac{4}{3}ml^2$ is the moment of inertia, $m$ is the link mass, $l$ is the link length, $c_0$ is the viscous friction coefficient, $g$ is the gravitational acceleration, and $d$ represents the bounded external disturbance satisfying $|d| \le \bar{d}$ with $\bar{d}$ being a known positive constant.

Define the state variables $x_1 = q$ and $x_2 = \dot{q}$. The system (1) can be rewritten in the following state-space form:

$\dot{x}_1 = x_2,\qquad \dot{x}_2 = f(x) + g_1u,\qquad y = x_1$ (2)

where $u = \tau$ is the control input, $y = x_1$ is the system output, $g_1 = 1/I$ is a known positive constant, and

$f(x) = \frac{1}{I}\left(-c_0x_2 - mgl\cos(x_1) + d\right)$ (3)

is an uncertain nonlinear function.
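Under the numerical values used later in Section 5 ($m = 1$ kg, $l = 0.5$ m, $c_0 = 1$, $g = 9.8$ m/s$^2$), the model (1)-(3) can be sketched as a pair of small functions; the function names are illustrative only:

```python
import math

# Nominal parameters taken from the simulation section (Table 1).
m, l, c0, g = 1.0, 0.5, 1.0, 9.8
I = 4.0 / 3.0 * m * l**2          # moment of inertia I = (4/3) m l^2
g1 = 1.0 / I                      # known input gain g_1 = 1/I

def f(x1, x2, d=0.0):
    """Uncertain drift term f(x) = (1/I)(-c0*x2 - m*g*l*cos(x1) + d), Eq. (3)."""
    return (-c0 * x2 - m * g * l * math.cos(x1) + d) / I

def dynamics(x1, x2, u, d=0.0):
    """State-space form (2): returns (x1_dot, x2_dot)."""
    return x2, f(x1, x2, d) + g1 * u
```

At the upright-adjacent configuration $x_1 = \pi/2$ the gravity torque vanishes, which gives a quick sanity check on the sign conventions.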

Assumption 1.

The desired reference trajectory $y_d$ and its derivatives $\dot{y}_d$, $\ddot{y}_d$ are continuous and bounded, i.e., there exist positive constants $\bar{y}_d$, $\bar{y}_{d1}$, $\bar{y}_{d2}$ such that $|y_d| \le \bar{y}_d$, $|\dot{y}_d| \le \bar{y}_{d1}$, $|\ddot{y}_d| \le \bar{y}_{d2}$.

2.2. Control Objective

The control objective is to design an adaptive neural tracking controller based on Actor-Critic reinforcement learning such that:

  • (i)

    The joint angle $x_1$ tracks the desired trajectory $y_d$ with the tracking error converging to a small neighborhood of the origin within a predefined time $T_P < \sqrt{2}T_c$, where $T_c$ is a preset design parameter.

  • (ii)

    All signals in the closed-loop system remain bounded within the predefined time.

  • (iii)

    The Actor-Critic neural networks learn to compensate for the unknown system dynamics online.

2.3. Actor-Critic Neural Network Framework

To handle the unknown nonlinear functions in the system and achieve adaptive optimal control, this paper employs an Actor-Critic reinforcement learning framework. This framework consists of two cooperatively working neural networks: the Actor network is responsible for approximating unknown dynamics and generating control policies, while the Critic network evaluates the control performance and guides the Actor’s learning process.

2.3.1. RBF Basis Function

Both neural networks adopt Radial Basis Functions (RBFs) as basis functions due to their universal approximation capability. For a continuous function $h(Z): \mathbb{R}^n \to \mathbb{R}$ defined on a compact set $\Omega_Z \subset \mathbb{R}^n$, it can be approximated by an RBF neural network as:

$h(Z) = W^{*T}S(Z) + \epsilon(Z)$ (4)

where $Z = [Z_1, Z_2, \ldots, Z_n]^T \in \Omega_Z$ is the input vector, $W^* = [w_1^*, w_2^*, \ldots, w_l^*]^T \in \mathbb{R}^l$ is the ideal weight vector, $l$ is the number of neural network nodes, $S(Z) = [s_1(Z), s_2(Z), \ldots, s_l(Z)]^T$ is the basis function vector, and $\epsilon(Z)$ is the approximation error satisfying $|\epsilon(Z)| \le \bar{\epsilon}$.

The Gaussian function is employed as the basis function:

$s_i(Z) = \exp\left(-\frac{\|Z - \mu_i\|^2}{\eta^2}\right),\quad i = 1, 2, \ldots, l$ (5)

where $\mu_i = [\mu_{i1}, \mu_{i2}, \ldots, \mu_{in}]^T$ is the center of the $i$-th basis function, and $\eta > 0$ is the width of the Gaussian function.
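The basis vector (5) is straightforward to evaluate; a minimal NumPy sketch (with toy centers, not the grids used in the paper's simulations):

```python
import numpy as np

def rbf_basis(Z, centers, eta):
    """Gaussian RBF vector S(Z), Eq. (5): s_i = exp(-||Z - mu_i||^2 / eta^2)."""
    Z = np.asarray(Z, dtype=float)
    diffs = centers - Z                      # broadcast over the l nodes
    return np.exp(-np.sum(diffs**2, axis=1) / eta**2)

# Toy usage: 5 centers on a line, scalar input at the middle center.
centers = np.linspace(-2.0, 2.0, 5).reshape(-1, 1)
S = rbf_basis([0.0], centers, eta=1.0)
```

The node whose center coincides with the input evaluates to 1, and symmetric centers produce symmetric activations.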

2.3.2. Critic Network Structure

The Critic network is designed to evaluate the long-term performance of the current control policy. The long-term cost function is defined as:

$I(t) = \int_t^{\infty} e^{-\frac{m-t}{\psi}}\phi(m)\,dm$ (6)

where ψ>0 is the discount factor, and the instantaneous cost function is defined as:

$\phi(t) = z_1^TDz_1 + \tau^TR\tau$ (7)

where D>0 and R>0 are positive definite weight matrices that penalize the tracking error and control effort, respectively.

Using the RBF neural network to approximate the cost function:

$I = W_c^{*T}S_c(Z_c) + \epsilon_c$ (8)
$\hat{I} = \hat{W}_c^TS_c(Z_c)$ (9)

where $Z_c = z_1$ is the Critic network input, $W_c^* \in \mathbb{R}^{l_c}$ is the ideal weight vector, $\hat{W}_c$ is the estimated weight vector, $S_c(Z_c)$ is the basis function vector, and $\epsilon_c$ satisfies $|\epsilon_c| \le \bar{\epsilon}_c$.

When $\psi \to \infty$, based on the Bellman equation, the temporal difference (TD) error can be expressed as:

$\gamma(t) = \phi(t) + \dot{\hat{I}}(t) = \phi(t) + \hat{W}_c^T\Lambda$ (10)

where $\Lambda = \frac{S_c}{\psi} + \frac{\partial S_c}{\partial Z_c}\dot{Z}_c$. The learning objective of the Critic network is to minimize the TD error.

2.3.3. Actor Network Structure

The Actor network is designed to approximate the unknown nonlinear functions in the system and assist in generating control inputs. Define the composite unknown function:

$\varphi = f(x) - \dot{\alpha}_1 + z_1 + \frac{1}{2}z_2$ (11)

where $f(x)$ is the unknown nonlinear term of the system, and $\dot{\alpha}_1$ is the derivative of the virtual control.

Using the RBF neural network to approximate φ:

$\varphi = W_a^{*T}S_a(Z_a) + \epsilon_a$ (12)
$\hat{\varphi} = \hat{W}_a^TS_a(Z_a)$ (13)

where $Z_a = [x_1, x_2, y_d, \dot{y}_d, \ddot{y}_d]^T$ is the Actor network input, $W_a^* \in \mathbb{R}^{l_a}$ is the ideal weight vector, $\hat{W}_a$ is the estimated weight vector, $S_a(Z_a)$ is the basis function vector, and $\epsilon_a$ satisfies $|\epsilon_a| \le \bar{\epsilon}_a$.

2.3.4. Actor-Critic Cooperative Learning Mechanism

The cooperative learning mechanism of the Actor-Critic framework operates as follows:

  • (1)

    Critic evaluates policy performance: The Critic network computes the estimated cost function $\hat{I}$ based on the current state and control input, evaluating the quality of the Actor’s current policy. A larger $\hat{I}$ indicates poorer policy performance that requires improvement.

  • (2)

    Actor improves control policy: The Actor network utilizes the evaluation information $\hat{I}$ provided by the Critic as feedback to adjust its weights $\hat{W}_a$, thereby improving the control policy to minimize the long-term cost.

  • (3)

    Online cooperative update: The weights of both networks are updated in real-time during the control process. Through continuous “evaluation-improvement” cycles, the control performance is progressively optimized.

Define the weight estimation errors as:

$\tilde{W}_c = W_c^* - \hat{W}_c,\qquad \tilde{W}_a = W_a^* - \hat{W}_a$ (14)

The specific weight update laws for the Actor-Critic networks will be designed in Section 3, incorporating the predefined-time stability requirements.

Remark 1.

Compared with traditional single neural network adaptive control, the Actor-Critic framework possesses the following advantages: (i) The value function evaluation provided by the Critic offers a global performance metric for the Actor, rather than relying solely on local error information; (ii) The dual-network structure separates policy evaluation from policy improvement, enhancing learning efficiency and stability; (iii) This framework is naturally suited for integration with predefined-time control, allowing the predefined-time convergence mechanism to be incorporated into the weight update laws of both networks.

2.4. Technical Lemmas

Lemma 1

([22]). For any $\xi \in \mathbb{R}$ and $\rho > 0$, the following inequality holds:

$0 \le |\xi| - \xi\tanh\left(\frac{\xi}{\rho}\right) \le \delta\rho$ (15)

where δ=0.2785.

Lemma 2

([23]). For $x_i \ge 0$ $(i = 1, 2, \ldots, n)$ and $\gamma > 0$, the following inequalities hold:

$\left(\sum_{i=1}^{n}x_i\right)^{\gamma} \le \sum_{i=1}^{n}x_i^{\gamma},\quad 0 < \gamma < 1$ (16)
$\left(\sum_{i=1}^{n}x_i\right)^{\gamma} \le n^{\gamma-1}\sum_{i=1}^{n}x_i^{\gamma},\quad \gamma > 1$ (17)

Lemma 3.

For $y \ge x$ and $v > 1$, the following inequality holds:

$x(y - x)^v \le \frac{v}{1+v}\left(y^{1+v} - x^{1+v}\right)$ (18)

Lemma 4

([24]). (Predefined-Time Stability) Consider the system $\dot{x} = f(t, x)$. If there exist a continuous positive definite function $V(x)$ and parameters $0 < \beta < 1$, $T_c > 0$, $0 < \sigma < \infty$ such that

$\dot{V}(x) \le -\frac{\pi}{\beta T_c}\left(V^{1-\frac{\beta}{2}} + V^{1+\frac{\beta}{2}}\right) + \sigma$ (19)

then the system is practically predefined-time stable (PPTS), and the convergence region is

$\lim_{t \to T_P}x \in \left\{V \le \min\left\{\left(\frac{2\beta T_c\sigma}{\pi}\right)^{\frac{2}{2-\beta}},\ \left(\frac{2\beta T_c\sigma}{\pi}\right)^{\frac{2}{2+\beta}}\right\}\right\}$ (20)

where $T_P$ is the settling time satisfying $T_P < T_{\max} = \sqrt{2}T_c$.

Remark 2.

Lemma 4 is fundamental to predefined-time stability theory. The key feature is that the upper bound of the settling time $T_{\max} = \sqrt{2}T_c$ can be explicitly preset through the parameter $T_c$, independent of the initial conditions and system parameters. This is in contrast to finite-time control, where the settling time depends on initial conditions, and fixed-time control, where the settling time bound is implicitly determined by multiple parameters.

Lemma 5

([25]). For any $z \in \mathbb{R}$ and $\kappa > 0$:

$0 \le |z| - \frac{z^2}{\sqrt{z^2 + \kappa^2}} < \kappa$ (21)

Lemma 6.

(Power Function Inequality) For any x>0 and 0<β<1, the following inequality holds:

$x^2 - x^{2-\beta} \ge -C_\beta$ (22)

where $C_\beta = \frac{\beta}{2}\left(\frac{2-\beta}{2}\right)^{\frac{2-\beta}{\beta}}$ is a positive constant depending only on $\beta$.

Proof. 

Define $f(x) = x^2 - x^{2-\beta}$ for $x > 0$. Taking the derivative:

$f'(x) = 2x - (2-\beta)x^{1-\beta} = x^{1-\beta}\left(2x^{\beta} - (2-\beta)\right)$

Setting $f'(x) = 0$ yields the critical point $x^* = \left(\frac{2-\beta}{2}\right)^{\frac{1}{\beta}}$. Since $f''(x^*) > 0$, this is a minimum point. The minimum value is:

$f(x^*) = -\frac{\beta}{2}\left(\frac{2-\beta}{2}\right)^{\frac{2-\beta}{\beta}} = -C_\beta$

Therefore, $f(x) \ge -C_\beta$, which completes the proof. □
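The constant $C_\beta$ and the inequality (22) are easy to check numerically; for instance with $\beta = 0.6$, the value used in Section 5:

```python
import numpy as np

def C_beta(beta):
    """C_beta = (beta/2) * ((2-beta)/2)**((2-beta)/beta), Lemma 6."""
    return beta / 2.0 * ((2.0 - beta) / 2.0) ** ((2.0 - beta) / beta)

# Check x**2 - x**(2-beta) >= -C_beta on a grid; equality is attained
# at the critical point x* = ((2-beta)/2)**(1/beta).
beta = 0.6
xs = np.linspace(1e-6, 5.0, 100000)
fmin = np.min(xs**2 - xs**(2.0 - beta))
x_star = ((2.0 - beta) / 2.0) ** (1.0 / beta)
```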

3. Actor-Critic Predefined-Time Controller Design

In this section, we present the design of the predefined-time adaptive neural tracking controller based on the Actor-Critic reinforcement learning framework. The control system architecture is illustrated in Figure 1. The Actor network receives system states and reference signals, outputs the control signal to compensate for unknown dynamics. The Critic network evaluates the cost-to-go and provides feedback to guide the Actor’s learning process. Both networks are updated using predefined-time weight update laws.

Figure 1. Block diagram of the Actor-Critic predefined-time control system.

3.1. Predefined-Time Virtual Controller Design

Define the tracking error variables as:

$z_1 = x_1 - y_d$ (23)
$z_2 = x_2 - \alpha_1$ (24)

where α1 is the virtual control law to be designed.

The time derivative of z1 is:

$\dot{z}_1 = \dot{x}_1 - \dot{y}_d = x_2 - \dot{y}_d = z_2 + \alpha_1 - \dot{y}_d$ (25)

Design the predefined-time virtual controller as:

$\alpha_1 = -\frac{z_1\check{\alpha}_1^2}{\sqrt{z_1^2\check{\alpha}_1^2 + \varepsilon_1^2}} + \dot{y}_d$ (26)

where ε1>0 is a small positive constant, and

$\check{\alpha}_1 = \frac{\pi}{\beta T_c}\left(2^{\beta}\left(\frac{1}{2}\right)^{1+\frac{\beta}{2}}\mathrm{sig}^{1+\beta}(z_1) + \left(\frac{1}{2}\right)^{1-\frac{\beta}{2}}\mathrm{sig}^{1-\beta}(z_1)\right)$ (27)

with $0 < \beta < 1$, $T_c > 0$ being the predefined time parameter, and $\mathrm{sig}^{\gamma}(a) = |a|^{\gamma}\mathrm{sgn}(a)$.

Remark 3.

The virtual controller (26) is specifically designed to achieve predefined-time convergence. The structure $-\frac{z_1\check{\alpha}_1^2}{\sqrt{z_1^2\check{\alpha}_1^2 + \varepsilon_1^2}}$ ensures that the derivative $\dot{\alpha}_1$ remains bounded even when $z_1$ approaches zero, thus avoiding the singularity issue that commonly arises in traditional finite-time control designs, where terms like $\mathrm{sig}^{1-\beta}(z_1)$ with $0 < \beta < 1$ would cause unbounded derivatives.
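A minimal sketch of (26)-(27), using the coefficient form reconstructed above and the parameter values from Table 1 ($\beta = 0.6$, $T_c = 2$, $\varepsilon_1 = 10^{-4}$); the helper names are illustrative, not from the paper:

```python
import numpy as np

def sig(a, gamma):
    """sig^gamma(a) = |a|^gamma * sgn(a)."""
    return np.abs(a) ** gamma * np.sign(a)

def alpha_check_1(z1, beta=0.6, Tc=2.0):
    """Predefined-time term (27) (reconstructed coefficient form)."""
    k = np.pi / (beta * Tc)
    return k * (2.0**beta * 0.5**(1.0 + beta / 2.0) * sig(z1, 1.0 + beta)
                + 0.5**(1.0 - beta / 2.0) * sig(z1, 1.0 - beta))

def alpha_1(z1, yd_dot, eps1=1e-4, beta=0.6, Tc=2.0):
    """Singularity-free virtual control (26): the eps1 regularization keeps
    the derivative bounded as z1 -> 0."""
    ac = alpha_check_1(z1, beta, Tc)
    return -z1 * ac**2 / np.sqrt(z1**2 * ac**2 + eps1**2) + yd_dot
```

At $z_1 = 0$ the regularized fraction vanishes and $\alpha_1$ reduces to the feedforward $\dot{y}_d$, illustrating the singularity avoidance discussed in Remark 3.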

Consider the Lyapunov function candidate:

$V_1 = \frac{1}{2}z_1^2$ (28)

Taking the time derivative of V1 and substituting (25) and (26):

$\dot{V}_1 = z_1\dot{z}_1 = z_1(z_2 + \alpha_1 - \dot{y}_d) = z_1z_2 - \frac{z_1^2\check{\alpha}_1^2}{\sqrt{z_1^2\check{\alpha}_1^2 + \varepsilon_1^2}}$ (29)

Applying Lemma 5:

$-\frac{z_1^2\check{\alpha}_1^2}{\sqrt{z_1^2\check{\alpha}_1^2 + \varepsilon_1^2}} < \varepsilon_1 - |z_1\check{\alpha}_1| \le \varepsilon_1 - \frac{\pi}{\beta T_c}\left(\frac{z_1^2}{2}\right)^{1-\frac{\beta}{2}} - \frac{\pi 2^{\beta}}{\beta T_c}\left(\frac{z_1^2}{2}\right)^{1+\frac{\beta}{2}}$ (30)

Therefore:

$\dot{V}_1 < -\frac{\pi}{\beta T_c}\left(\frac{z_1^2}{2}\right)^{1-\frac{\beta}{2}} - \frac{\pi 2^{\beta}}{\beta T_c}\left(\frac{z_1^2}{2}\right)^{1+\frac{\beta}{2}} + z_1z_2 + \varepsilon_1$ (31)

3.2. Actor-Critic Reinforcement Learning Controller Design

The time derivative of z2 is:

$\dot{z}_2 = \dot{x}_2 - \dot{\alpha}_1 = f(x) + g_1u - \dot{\alpha}_1$ (32)

Define the unknown nonlinear function:

$\varphi = f(x) - \dot{\alpha}_1 + z_1 + \frac{1}{2}z_2$ (33)

3.2.1. Critic Network Design

The Critic network is designed to approximate the cost-to-go function and evaluate the control performance. Define the long-term cost function:

$I(t) = \int_t^{\infty} e^{-\frac{m-t}{\psi}}\phi(m)\,dm$ (34)

where ψ>0 is a discount factor, and the instantaneous cost function is defined as:

$\phi(t) = z_1^TDz_1 + \tau^TR\tau$ (35)

with D>0 and R>0 being positive definite weight matrices.

The cost-to-go function is approximated by the Critic neural network:

$I = W_c^{*T}S_c(Z_c) + \epsilon_c$ (36)
$\hat{I} = \hat{W}_c^TS_c(Z_c)$ (37)

where $Z_c = z_1$ is the Critic network input, $W_c^*$ is the ideal weight vector, $\hat{W}_c$ is the estimated weight vector, $S_c(Z_c)$ is the basis function vector, and $\epsilon_c$ is the approximation error.

When $\psi \to \infty$, the temporal difference error can be expressed as:

$\gamma(t) = \phi(t) + \dot{\hat{I}}(t) = \phi(t) + \frac{\partial\hat{I}(t)}{\partial Z_c}\dot{Z}_c$ (38)

The predefined-time Critic network weight update law is designed as:

$\dot{\hat{W}}_c = -\sigma_c\left(\phi(t) + \hat{W}_c^T\Lambda\right)\Lambda - \varsigma_c\hat{W}_c - c_c\|\hat{W}_c\|^{\beta}\hat{W}_c$ (39)

where $\Lambda = \frac{S_c}{\psi} + \frac{\partial S_c}{\partial Z_c}\dot{Z}_c$, $\sigma_c > 0$ is the learning rate, $\varsigma_c = \frac{\pi\cdot 2^{\frac{\beta}{2}}\cdot r_c^{\frac{\beta}{2}}}{\beta T_c}$, $c_c = \frac{\pi(2+\beta)}{2\beta T_c(1+\beta)r_c^{\frac{\beta}{2}}}$, and $r_c > 0$ is a design parameter.

3.2.2. Actor Network Design

The Actor network is designed to approximate the unknown function φ and generate control inputs. Using RBFNN approximation:

$\varphi = W_a^{*T}S_a(Z_a) + \epsilon_a$ (40)

where $Z_a = [x_1, x_2, y_d, \dot{y}_d, \ddot{y}_d]^T$ is the Actor network input, $W_a^*$ is the ideal weight vector, $S_a(Z_a)$ is the basis function vector, and $|\epsilon_a| \le \bar{\epsilon}_a$.

The predefined-time Actor network weight update law is designed as:

$\dot{\hat{W}}_a = r_az_2g_1S_a\tanh\left(\frac{z_2g_1S_a}{\rho}\right) - \varsigma_a\hat{W}_a - c_a\|\hat{W}_a\|^{\beta}\hat{W}_a + K_I\hat{I}S_a$ (41)

where $r_a > 0$ is the learning rate, $\varsigma_a = \frac{\pi\cdot 2^{\frac{\beta}{2}}\cdot r_a^{\frac{\beta}{2}}}{\beta T_c}$, $c_a = \frac{\pi(2+\beta)}{2\beta T_c(1+\beta)r_a^{\frac{\beta}{2}}}$, $K_I > 0$ is the Critic feedback gain, and $\rho > 0$ is a design parameter.

Remark 4.

The weight update laws (39) and (41) ensure both learning capability and predefined-time convergence by incorporating three essential terms: the first is the standard gradient descent term, which minimizes the temporal difference or approximation error; the second term, $-\varsigma\hat{W}$, introduces a damping effect to prevent weight drift; and the third term, $-c\|\hat{W}\|^{\beta}\hat{W}$, acts as the predefined-time convergence component, guaranteeing that the weight estimation errors converge within the specified time frame.
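The right-hand sides of (39) and (41) can be sketched as follows. The gain formulas follow the expressions for $\varsigma$ and $c$ as reconstructed above; $r_c = 50$ is an arbitrary illustrative value, since Table 1 lists only the learning rates, and $\beta = 0.6$, $T_c = 2$ are taken from Table 1:

```python
import numpy as np

BETA, TC = 0.6, 2.0   # beta and Tc from Table 1

def pt_gains(r, beta=BETA, Tc=TC):
    """varsigma = pi*2^(beta/2)*r^(beta/2)/(beta*Tc) and
    c = pi*(2+beta)/(2*beta*Tc*(1+beta)*r^(beta/2)), as in (39)/(41)."""
    vs = np.pi * 2.0**(beta / 2) * r**(beta / 2) / (beta * Tc)
    c = np.pi * (2 + beta) / (2 * beta * Tc * (1 + beta) * r**(beta / 2))
    return vs, c

def critic_rhs(Wc, phi, Lam, sigma_c=50.0, rc=50.0):
    """Right-hand side of (39): TD gradient + damping + predefined-time term."""
    vs, cc = pt_gains(rc)
    td = phi + Wc @ Lam                          # TD error, Eq. (10)
    return -sigma_c * td * Lam - vs * Wc - cc * np.linalg.norm(Wc)**BETA * Wc

def actor_rhs(Wa, z2, g1, Sa, I_hat, ra=100.0, rho=0.05, KI=2.0):
    """Right-hand side of (41), including the Critic feedback K_I*I_hat*S_a."""
    vs, ca = pt_gains(ra)
    grad = ra * z2 * g1 * Sa * np.tanh(z2 * g1 * Sa / rho)
    return grad - vs * Wa - ca * np.linalg.norm(Wa)**BETA * Wa + KI * I_hat * Sa
```

Note how each right-hand side mirrors Remark 4: a gradient term, a linear damping term, and the fractional-power predefined-time term.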

3.2.3. Predefined-Time Actual Controller

The actual control law is designed as:

$u = \frac{\alpha_2}{g_1}$ (42)

where

$\alpha_2 = -\frac{z_2\check{\alpha}_2^2}{\sqrt{z_2^2\check{\alpha}_2^2 + \varepsilon_2^2}}$ (43)

with ε2>0 and

$\check{\alpha}_2 = \frac{\pi}{\beta T_c}\left(2^{\beta}\left(\frac{1}{2}\right)^{1+\frac{\beta}{2}}\mathrm{sig}^{1+\beta}(z_2) + \left(\frac{1}{2}\right)^{1-\frac{\beta}{2}}\mathrm{sig}^{1-\beta}(z_2)\right)$ (44)
$\qquad\quad +\ \hat{W}_a^TS_a\tanh\left(\frac{z_2g_1S_a}{\rho}\right) + K_2z_2 + z_1$ (45)

where K2>0 is a feedback gain.

Remark 5.

The control law defined in Equations (42)–(45) comprises three key components: a predefined-time convergence term that ensures the tracking error converges within the preset time bound; a neural network compensation term, $\hat{W}_a^TS_a\tanh(\cdot)$, which provides online compensation for unknown system dynamics; and stabilizing feedback terms, $K_2z_2 + z_1$, designed to enhance closed-loop stability.
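A sketch of the actual control law (42)-(45); the RBF vector $S_a$ and weight estimate $\hat{W}_a$ are supplied by the caller, the gains follow Table 1, and the helper names are illustrative:

```python
import numpy as np

def sig(a, gamma):
    """sig^gamma(a) = |a|^gamma * sgn(a)."""
    return np.abs(a) ** gamma * np.sign(a)

def control(z1, z2, Wa_hat, Sa, g1, K2=100.0, rho=0.05, eps2=1e-4,
            beta=0.6, Tc=2.0):
    """Actual control u = alpha_2 / g1, Eqs. (42)-(45)."""
    k = np.pi / (beta * Tc)
    ac2 = (k * (2.0**beta * 0.5**(1 + beta / 2) * sig(z2, 1 + beta)
                + 0.5**(1 - beta / 2) * sig(z2, 1 - beta))      # PT term (44)
           + Wa_hat @ (Sa * np.tanh(z2 * g1 * Sa / rho))        # NN compensation
           + K2 * z2 + z1)                                      # feedback (45)
    alpha2 = -z2 * ac2**2 / np.sqrt(z2**2 * ac2**2 + eps2**2)   # (43)
    return alpha2 / g1                                          # (42)
```

The same $\varepsilon$-regularized fraction as in (26) appears again, so the control effort stays finite through $z_2 = 0$.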

Remark 6.

The proposed Actor-Critic framework is rooted in the Adaptive Dynamic Programming (ADP) paradigm [16,17,18]. Specifically, the Critic network approximates the value function I(t) associated with the Hamilton–Jacobi–Bellman equation:

$0 = \phi(t) + \left(\frac{\partial I}{\partial x}\right)^T\dot{x}$ (46)

where $\phi(t) = z_1^TDz_1 + \tau^TR\tau$ is the instantaneous cost that penalizes both tracking error and control effort. The TD error $\gamma(t) = \phi(t) + \dot{\hat{I}}(t)$ measures the discrepancy between the current value estimate and the Bellman optimality condition. Minimizing $\gamma^2$ drives the Critic toward the true value function.

The term $K_I\hat{I}S_a$ in the Actor update law (41) can be interpreted as an approximate policy gradient step: it adjusts the Actor weights in a direction that reduces the estimated long-term cost $\hat{I}$, analogous to the policy improvement step in policy iteration methods. Together with the error-driven gradient term $r_az_2g_1S_a\tanh(\cdot)$, the Actor update simultaneously ensures Lyapunov stability (via error reduction) and approximate optimality (via cost minimization).

It should be noted that due to the integration with predefined-time stability requirements, the damping term $-\varsigma_a\hat{W}_a$ and the predefined-time term $-c_a\|\hat{W}_a\|^{\beta}\hat{W}_a$ modify the pure policy gradient direction. Therefore, the optimality guarantee is approximate rather than exact, representing a meaningful design trade-off between guaranteed predefined-time convergence and strict optimality. This is consistent with the ADP literature, where stability-constrained policy optimization yields near-optimal rather than globally optimal policies.

4. Stability Analysis

Theorem 1.

Consider the single-link manipulator system (2) satisfying Assumption 1. Under the virtual controller (26), the actual controller (42), and the Actor-Critic neural network weight update laws (39) and (41), if the design parameters satisfy:

  • 0<β<1, Tc>0

  • ε1,ε2,ρ>0

  • σc,ra,KI,K2>0

then the closed-loop system is practically predefined-time stable (PPTS). Specifically:

  • (i) 

    The error signals $\kappa = [z_1, z_2, \tilde{W}_a^T, \tilde{W}_c^T]^T$ converge to a compact set within the predefined time $T_P < T_{\max} = \sqrt{2}T_c$.

  • (ii) 

    All signals in the closed-loop system remain bounded.

  • (iii) 
    The convergence region is given by:
    $\Delta = \left\{\lim_{t \to T_P}\kappa \,\middle|\, V \le \min\left\{\left(\frac{2\beta T_c\sigma}{\pi}\right)^{\frac{2}{2-\beta}},\ \left(\frac{2\beta T_c\sigma}{\pi}\right)^{\frac{2}{2+\beta}}\right\}\right\}$ (47)

Proof. 

Consider the following Lyapunov function candidate:

$V = \frac{1}{2}z_1^2 + \frac{1}{2}z_2^2 + \frac{1}{2r_a}\tilde{W}_a^T\tilde{W}_a + \frac{1}{2r_c}\tilde{W}_c^T\tilde{W}_c$ (48)

where $\tilde{W}_a = W_a^* - \hat{W}_a$ and $\tilde{W}_c = W_c^* - \hat{W}_c$ are the weight estimation errors.

From (31), we have:

$\dot{V}_1 < -\frac{\pi}{\beta T_c}\left(\frac{z_1^2}{2}\right)^{1-\frac{\beta}{2}} - \frac{\pi 2^{\beta}}{\beta T_c}\left(\frac{z_1^2}{2}\right)^{1+\frac{\beta}{2}} + z_1z_2 + \varepsilon_1$ (49)

Taking the derivative of $V_2 = \frac{1}{2}z_2^2$:

$\dot{V}_2 = z_2\dot{z}_2 = z_2(f - \dot{\alpha}_1 + g_1u) = z_2\left(\varphi - z_1 - \frac{1}{2}z_2 + g_1u\right)$ (50)

Using the neural network approximation (40):

$z_2\varphi = z_2W_a^{*T}S_a + z_2\epsilon_a$ (51)

Applying Lemma 1:

$z_2g_1W_a^{*T}S_a \le |z_2|g_1\|W_a^*\|\|S_a\| \le g_1\|W_a^*\|\left(\delta\rho + z_2S_a^T\tanh\left(\frac{z_2g_1S_a}{\rho}\right)\right)$ (52)

Define θa=Wa*g1. Substituting the control law (42):

$z_2g_1u = z_2\alpha_2 = -\frac{z_2^2\check{\alpha}_2^2}{\sqrt{z_2^2\check{\alpha}_2^2 + \varepsilon_2^2}}$ (53)

By Lemma 5:

$z_2g_1u < \varepsilon_2 - |z_2\check{\alpha}_2|$ (54)

Expanding αˇ2 and combining terms:

$\dot{V}_2 < -\frac{\pi}{\beta T_c}\left(\frac{z_2^2}{2}\right)^{1-\frac{\beta}{2}} - \frac{\pi 2^{\beta}}{\beta T_c}\left(\frac{z_2^2}{2}\right)^{1+\frac{\beta}{2}} - z_1z_2 + z_2g_1\tilde{W}_a^TS_a\tanh\left(\frac{z_2g_1S_a}{\rho}\right) + \sigma_2$ (55)

where $\sigma_2 = \varepsilon_2 + \delta\theta_a\rho + \frac{1}{2}\bar{\epsilon}_a^2$.

Taking the derivative of $V_a = \frac{1}{2r_a}\tilde{W}_a^T\tilde{W}_a$:

$\dot{V}_a = -\frac{1}{r_a}\tilde{W}_a^T\dot{\hat{W}}_a$ (56)

Substituting the Actor weight update law (41):

$\dot{V}_a = -z_2g_1\tilde{W}_a^TS_a\tanh\left(\frac{z_2g_1S_a}{\rho}\right) + \frac{\varsigma_a}{r_a}\tilde{W}_a^T\hat{W}_a + \frac{c_a}{r_a}\tilde{W}_a^T\|\hat{W}_a\|^{\beta}\hat{W}_a - \frac{K_I}{r_a}\tilde{W}_a^T\hat{I}S_a$ (57)

Using Young’s inequality for $\tilde{W}_a^T\hat{W}_a = \tilde{W}_a^T(W_a^* - \tilde{W}_a)$:

$\tilde{W}_a^T\hat{W}_a \le -\frac{1}{2}\|\tilde{W}_a\|^2 + \frac{1}{2}\|W_a^*\|^2$ (58)

Using Lemma 3 for $\tilde{W}_a^T\|\hat{W}_a\|^{\beta}\hat{W}_a$:

$\tilde{W}_a^T\|\hat{W}_a\|^{\beta}\hat{W}_a \le \frac{1+\beta}{2+\beta}\|W_a^*\|^{2+\beta} - \frac{1+\beta}{2+\beta}\|\tilde{W}_a\|^{2+\beta}$ (59)

Therefore:

$\dot{V}_a \le -z_2g_1\tilde{W}_a^TS_a\tanh(\cdot) - \frac{\varsigma_a}{2r_a}\|\tilde{W}_a\|^2 - \frac{c_a(1+\beta)}{r_a(2+\beta)}\|\tilde{W}_a\|^{2+\beta} + \sigma_{a0}$ (60)

where $\sigma_{a0}$ collects the constant terms arising from (58) and (59) together with a bound on the Critic-feedback term.

From the definition $V_a = \frac{1}{2r_a}\|\tilde{W}_a\|^2$, we have:

$\|\tilde{W}_a\|^{2+\beta} = (2r_aV_a)^{1+\frac{\beta}{2}} = (2r_a)^{1+\frac{\beta}{2}}V_a^{1+\frac{\beta}{2}}$ (61)

Substituting into the third term of (60):

$\frac{c_a(1+\beta)}{r_a(2+\beta)}\|\tilde{W}_a\|^{2+\beta} = \frac{c_a(1+\beta)}{r_a(2+\beta)}(2r_a)^{1+\frac{\beta}{2}}V_a^{1+\frac{\beta}{2}} = \frac{c_a(1+\beta)\cdot 2^{1+\frac{\beta}{2}}\cdot r_a^{\frac{\beta}{2}}}{2+\beta}V_a^{1+\frac{\beta}{2}}$ (62)

To achieve the target form $-\frac{\pi\,2^{\frac{\beta}{2}}}{\beta T_c}V_a^{1+\frac{\beta}{2}}$, we require:

$\frac{c_a(1+\beta)\cdot 2^{1+\frac{\beta}{2}}\cdot r_a^{\frac{\beta}{2}}}{2+\beta} = \frac{\pi\,2^{\frac{\beta}{2}}}{\beta T_c}$ (63)

Solving for ca:

$c_a = \frac{\pi(2+\beta)}{2\beta T_c(1+\beta)r_a^{\frac{\beta}{2}}}$ (64)

With this choice of ca, we obtain:

$\frac{c_a(1+\beta)}{r_a(2+\beta)}\|\tilde{W}_a\|^{2+\beta} = \frac{\pi\,2^{\frac{\beta}{2}}}{\beta T_c}V_a^{1+\frac{\beta}{2}}$ (65)

From the definition $V_a = \frac{1}{2r_a}\|\tilde{W}_a\|^2$, we have:

$\|\tilde{W}_a\|^{2-\beta} = (2r_aV_a)^{1-\frac{\beta}{2}} = (2r_a)^{1-\frac{\beta}{2}}V_a^{1-\frac{\beta}{2}}$ (66)

Applying Lemma 6 with $x = \|\tilde{W}_a\|$:

$\|\tilde{W}_a\|^2 - \|\tilde{W}_a\|^{2-\beta} \ge -C_\beta$ (67)

Multiplying both sides by $\frac{\varsigma_a}{2r_a}$ and rearranging:

$-\frac{\varsigma_a}{2r_a}\|\tilde{W}_a\|^2 \le -\frac{\varsigma_a}{2r_a}\|\tilde{W}_a\|^{2-\beta} + \frac{\varsigma_aC_\beta}{2r_a}$ (68)

Substituting (66):

$-\frac{\varsigma_a}{2r_a}\|\tilde{W}_a\|^2 \le -\frac{\varsigma_a}{2r_a}(2r_a)^{1-\frac{\beta}{2}}V_a^{1-\frac{\beta}{2}} + \frac{\varsigma_aC_\beta}{2r_a} = -\varsigma_a(2r_a)^{-\frac{\beta}{2}}V_a^{1-\frac{\beta}{2}} + \sigma_{a1}$ (69)

where $\sigma_{a1} = \frac{\varsigma_aC_\beta}{2r_a}$ is a bounded positive constant.

Therefore:

$-\frac{\varsigma_a}{2r_a}\|\tilde{W}_a\|^2 \le -\varsigma_a(2r_a)^{-\frac{\beta}{2}}V_a^{1-\frac{\beta}{2}} + \sigma_{a1}$ (70)

To achieve the target form $-\frac{\pi}{\beta T_c}V_a^{1-\frac{\beta}{2}}$, we require:

$\varsigma_a(2r_a)^{-\frac{\beta}{2}} = \frac{\pi}{\beta T_c}$ (71)

Solving for ςa:

$\varsigma_a = \frac{\pi(2r_a)^{\frac{\beta}{2}}}{\beta T_c} = \frac{\pi\cdot 2^{\frac{\beta}{2}}\cdot r_a^{\frac{\beta}{2}}}{\beta T_c}$ (72)

With this choice of ςa, we obtain:

$-\frac{\varsigma_a}{2r_a}\|\tilde{W}_a\|^2 \le -\frac{\pi}{\beta T_c}V_a^{1-\frac{\beta}{2}} + \sigma_{a1}$ (73)

Therefore, substituting (73) and (65) into (60):

$\dot{V}_a \le -z_2g_1\tilde{W}_a^TS_a\tanh(\cdot) - \frac{\pi}{\beta T_c}V_a^{1-\frac{\beta}{2}} - \frac{\pi\,2^{\frac{\beta}{2}}}{\beta T_c}V_a^{1+\frac{\beta}{2}} + \sigma_a$ (74)

where $\sigma_a = \sigma_{a0} + \sigma_{a1}$ is a bounded positive constant.

By the definitions of ςa and ca, and applying Lemma 6:

$\dot{V}_a \le -z_2g_1\tilde{W}_a^TS_a\tanh(\cdot) - \frac{\pi}{\beta T_c}\left(\frac{\|\tilde{W}_a\|^2}{2r_a}\right)^{1-\frac{\beta}{2}} - \frac{\pi\,2^{\frac{\beta}{2}}}{\beta T_c}\left(\frac{\|\tilde{W}_a\|^2}{2r_a}\right)^{1+\frac{\beta}{2}} + \sigma_a$ (75)

From $\dot{V}_2$, the cross term involving the Actor network is $z_2g_1\tilde{W}_a^TS_a\tanh\left(\frac{z_2g_1S_a}{\rho}\right)$, which arises because the control law $u = \alpha_2/g_1$ yields $z_2g_1u = z_2\alpha_2$ and the neural network compensation term in $\alpha_2$ contains $\hat{W}_a^TS_a\tanh(\cdot)$. From $\dot{V}_a = -\frac{1}{r_a}\tilde{W}_a^T\dot{\hat{W}}_a$, after substituting the Actor update law (41), the first term is $-z_2g_1\tilde{W}_a^TS_a\tanh\left(\frac{z_2g_1S_a}{\rho}\right)$; these two terms cancel exactly for any $g_1 > 0$.

Similarly, for the Critic network:

$\dot{V}_c \le -\frac{\pi}{\beta T_c}\left(\frac{\|\tilde{W}_c\|^2}{2r_c}\right)^{1-\frac{\beta}{2}} - \frac{\pi\,2^{\frac{\beta}{2}}}{\beta T_c}\left(\frac{\|\tilde{W}_c\|^2}{2r_c}\right)^{1+\frac{\beta}{2}} + \sigma_c'$ (76)

where $\sigma_c' = \sigma_{c0} + \sigma_{c1}$ is a bounded positive constant (not to be confused with the Critic learning rate $\sigma_c$).

Combining all terms:

$\dot{V} = \dot{V}_1 + \dot{V}_2 + \dot{V}_a + \dot{V}_c$ (77)

Note that the cross terms cancel:

  • $z_1z_2$ from $\dot{V}_1$ cancels with $-z_1z_2$ from $\dot{V}_2$.

  • $z_2g_1\tilde{W}_a^TS_a\tanh(\cdot)$ from $\dot{V}_2$ cancels exactly with $-z_2g_1\tilde{W}_a^TS_a\tanh(\cdot)$ from $\dot{V}_a$, since the Actor weight update law (41) explicitly includes the factor $g_1$ in the gradient term, and the control law $u = \alpha_2/g_1$ ensures that $z_2g_1u = z_2\alpha_2$. This exact cancellation holds for any $g_1 = 1/I > 0$ without requiring any approximation.

Therefore:

$\dot{V} < -\frac{\pi}{\beta T_c}\left[\left(\frac{z_1^2}{2}\right)^{1-\frac{\beta}{2}} + \left(\frac{z_2^2}{2}\right)^{1-\frac{\beta}{2}} + \left(\frac{\|\tilde{W}_a\|^2}{2r_a}\right)^{1-\frac{\beta}{2}} + \left(\frac{\|\tilde{W}_c\|^2}{2r_c}\right)^{1-\frac{\beta}{2}}\right] - \frac{\pi\,2^{\frac{\beta}{2}}}{\beta T_c}\left[\left(\frac{z_1^2}{2}\right)^{1+\frac{\beta}{2}} + \left(\frac{z_2^2}{2}\right)^{1+\frac{\beta}{2}} + \left(\frac{\|\tilde{W}_a\|^2}{2r_a}\right)^{1+\frac{\beta}{2}} + \left(\frac{\|\tilde{W}_c\|^2}{2r_c}\right)^{1+\frac{\beta}{2}}\right] + \sigma$ (78)

where $\sigma = \varepsilon_1 + \sigma_2 + \sigma_a + \sigma_c'$ is a positive constant.

Applying Lemma 2, for $0 < 1 - \frac{\beta}{2} < 1$:

$\sum_i\left(\frac{x_i^2}{2}\right)^{1-\frac{\beta}{2}} \ge \left(\sum_i\frac{x_i^2}{2}\right)^{1-\frac{\beta}{2}} = V^{1-\frac{\beta}{2}}$ (79)

and for $1 + \frac{\beta}{2} > 1$:

$\sum_i\left(\frac{x_i^2}{2}\right)^{1+\frac{\beta}{2}} \ge 4^{-\frac{\beta}{2}}\left(\sum_i\frac{x_i^2}{2}\right)^{1+\frac{\beta}{2}} = 4^{-\frac{\beta}{2}}V^{1+\frac{\beta}{2}}$ (80)

Therefore:

$\dot{V} \le -\frac{\pi}{\beta T_c}\left(V^{1-\frac{\beta}{2}} + V^{1+\frac{\beta}{2}}\right) + \sigma$ (81)

By Lemma 4, the system is practically predefined-time stable with settling time $T_P < T_{\max} = \sqrt{2}T_c$.

From the predefined-time stability, z1, z2, W˜a, W˜c are all bounded.

This completes the proof. □

Remark 7.

By adjusting the predefined time parameter $T_c$, the upper bound of the settling time can be explicitly preset: $T_P < T_{\max} = \sqrt{2}T_c$. A smaller $T_c$ leads to faster convergence but may require larger control efforts.

Remark 8.

The predefined-time parameters for both Actor and Critic networks are derived from the requirement that the weight estimation error dynamics satisfy the predefined-time stability condition in Lemma 4. The key insight is:

  • The damping term $-\varsigma\hat{W}$ generates the $V^{1-\frac{\beta}{2}}$ component through Lemma 6, which dominates when $V$ is large.

  • The predefined-time term $-c\|\hat{W}\|^{\beta}\hat{W}$ directly generates the $V^{1+\frac{\beta}{2}}$ component through algebraic substitution, which dominates when $V$ is small.

  • The combination of both terms ensures predefined-time convergence for all values of V>0.

5. Simulation Results

In this section, numerical simulations are conducted to verify the effectiveness of the proposed Actor-Critic predefined-time control scheme. The simulations are performed on a single-link manipulator system using MATLAB R2025a with Runge–Kutta 4th order integration.

5.1. Simulation Setup

The initial conditions are set as $x_1(0) = 1.0$ rad and $x_2(0) = -1.0$ rad/s. The simulation runs for 20 s with a step size of $\Delta t = 0.001$ s. For the Actor network with 100 nodes ($l_a = 100$) processing the 5-dimensional input $[x_1, x_2, y_d, \dot{y}_d, \ddot{y}_d]$, the basis centers are uniformly sampled from the hypercube $[-3, 3]^5$ with width parameter $\eta_a = 1.2$. All weights are initialized to zero, $\hat{W}_a(0) = 0 \in \mathbb{R}^{100}$, and bounded by $|\hat{W}_{a,i}| \le 200$ via saturation clipping. The Critic network uses 64 nodes ($l_c = 64$) with the 2-dimensional input $[z_1, z_2]$. Centers are placed on an $8 \times 8$ uniform grid over $[-2, 2]^2$ with width $\eta_c = 1.0$. Weights are similarly initialized as $\hat{W}_c(0) = 0 \in \mathbb{R}^{64}$ and bounded by $|\hat{W}_{c,i}| \le 100$. Regarding the discount factor, we implement $\psi = 10$ rather than the theoretical limit $\psi \to \infty$. This is a standard simplification in the ADP literature [18], where using a sufficiently large $\psi$ makes the term $S_c/\psi$ negligible compared to $\frac{\partial S_c}{\partial Z_c}\dot{Z}_c$, effectively approximating the infinite-horizon case while maintaining numerical stability. The system and control parameters are given in Table 1.
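Putting the pieces together, the closed loop can be sketched end-to-end. This is deliberately simplified relative to the setup above (forward Euler instead of RK4, a 5-node RBF layer on $x_1$ only instead of the 100-node Actor, and a crude gradient-plus-damping Actor update in place of the full law (41)), so it illustrates only the signal flow, not the reported accuracy:

```python
import numpy as np

m, l, c0, grav = 1.0, 0.5, 1.0, 9.8           # Table 1 system parameters
I = 4.0 / 3.0 * m * l**2
g1, beta, Tc, K2, rho, eps = 1.0 / I, 0.6, 2.0, 100.0, 0.05, 1e-4
dt, T = 1e-3, 5.0                             # shortened horizon for the sketch
cen = np.linspace(-2.0, 2.0, 5)               # toy RBF centers
Wa = np.zeros(5)                              # Actor weights
sig = lambda a, p: abs(a)**p * np.sign(a)     # sig^p(a) = |a|^p sgn(a)
k_pt = np.pi / (beta * Tc)
x1, x2 = 1.0, -1.0                            # initial condition
for k in range(int(T / dt)):
    t = k * dt
    yd, yd_dot = 0.5 * np.sin(t), 0.5 * np.cos(t)
    z1 = x1 - yd
    ac1 = k_pt * (2**beta * 0.5**(1 + beta / 2) * sig(z1, 1 + beta)
                  + 0.5**(1 - beta / 2) * sig(z1, 1 - beta))          # (27)
    a1 = -z1 * ac1**2 / np.sqrt(z1**2 * ac1**2 + eps**2) + yd_dot     # (26)
    z2 = x2 - a1
    Sa = np.exp(-(cen - x1)**2 / 1.2**2)                              # (5)
    ac2 = (k_pt * (2**beta * 0.5**(1 + beta / 2) * sig(z2, 1 + beta)
                   + 0.5**(1 - beta / 2) * sig(z2, 1 - beta))
           + Wa @ (Sa * np.tanh(z2 * g1 * Sa / rho)) + K2 * z2 + z1)  # (44)-(45)
    u = -z2 * ac2**2 / np.sqrt(z2**2 * ac2**2 + eps**2) / g1          # (42)-(43)
    Wa += dt * (10.0 * z2 * g1 * Sa * np.tanh(z2 * g1 * Sa / rho) - Wa)
    d = 0.5 * np.sin(t) + 0.3 * np.cos(2 * t)                         # disturbance
    x2 += dt * (-c0 * x2 - m * grav * l * np.cos(x1) + d + u) / I     # (1)
    x1 += dt * x2
```

Even with these simplifications the loop reproduces the qualitative behavior: $z_1$ contracts quickly under the fractional-power terms, with a small residual set by the disturbance and the reduced network.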

Table 1.

System and Control Parameters.

Parameter Description Value Unit
System Parameters
m Link mass 1.0 kg
l Link length 0.5 m
c0 Friction coefficient 1.0 N·m·s/rad
g Gravitational acceleration 9.8 m/s2
d(t) External disturbance 0.5sin(t)+0.3cos(2t) N·m
yd(t) Reference trajectory 0.5sin(t) rad
Predefined-Time Parameters
Tc Predefined time parameter 2.0 s
Tmax Maximum settling time √2·Tc = 2.83 s
β Convergence parameter 0.6 -
Controller Parameters
K2 Feedback gain 100 -
ε1, ε2 Small constants 10^−4 -
ρ Smoothing parameter 0.05 -
Neural Network Parameters
la Actor network nodes 100 -
lc Critic network nodes 64 -
ra, σc Learning rates 100, 50 -
KI Critic feedback gain 2.0 -
ηa, ηc RBF widths 1.2, 1.0 -
ψ Discount factor 10 -
Wamax, Wcmax Weight bounds 200, 100 -
PID Controller
Kp,Kd,Ki PID gains 25, 12, 5 -

To verify that the performance is not an artifact of a specific initial condition, we additionally conducted 20 independent Monte Carlo simulations with randomized initial conditions uniformly drawn from $x_1(0) \in [0.5, 1.5]$ rad and $x_2(0) \in [-1.5, -0.5]$ rad/s. The statistical results are reported in Table 2.

Table 2.

Statistical Performance over 20 Monte Carlo Runs (Mean ± Std).

Performance Metric AC-PT PID Improvement
Total RMSE (rad) 0.0417±0.0129 0.1266±0.0275 67.0%
SS RMSE (rad) 0.0014±0.0004 0.0454±0.0042 96.9%
Max SS Error (rad) 0.0030±0.0009 0.1229±0.0108 97.5%
Settling Time (s) 0.2031±0.0068 13.824±0.063 98.5%
TP<Tmax Satisfied 20/20 (100%) N/A

N/A: the PID controller provides no settling-time guarantee, and its slow response does not satisfy the premise for evaluating this time constraint, so the metric is not applicable.
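The settling-time and RMSE metrics reported above can be computed from an error trace as below. The precise definitions (settling time as the last exit from the ±0.01 rad band, RMSE over the sampled trace) are our assumptions; the paper does not spell them out.

```python
import numpy as np

def settling_time(t, e, tol=0.01):
    """First time after which |e| stays inside the +/-tol band
    (assumed metric definition)."""
    outside = np.abs(e) > tol
    if not outside.any():
        return t[0]
    last_out = np.flatnonzero(outside)[-1]
    return t[last_out + 1] if last_out + 1 < len(t) else np.inf

def rmse(e):
    """Root-mean-square error over the sampled trace."""
    return float(np.sqrt(np.mean(np.asarray(e) ** 2)))

# synthetic exponentially decaying error trace as a sanity check
t = np.arange(0.0, 20.0, 0.001)
e = np.exp(-20.0 * t)
ts = settling_time(t, e)
print(f"settling time = {ts:.3f} s, total RMSE = {rmse(e):.4f}")
```

Taking the last exit from the band, rather than the first entry, prevents a trajectory that briefly dips into the tolerance band and leaves again from being counted as settled.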

5.2. Tracking Performance Analysis

Figure 2 compares the tracking performance of the proposed AC-PT controller and the conventional PID controller. As shown in Figure 2a, both controllers track the reference trajectory yd=0.5sin(t), but the AC-PT controller achieves stabilization within approximately 0.23 s, well within the theoretical upper bound Tmax=2.83 s. In contrast, the PID controller requires approximately 13.84 s to reach the ±0.01 rad tolerance band (Figure 2b). The zoomed steady-state view in Figure 2c confirms that the AC-PT controller maintains the tracking error consistently within the specified tolerance, whereas the PID controller exhibits noticeable residual oscillations. The logarithmic-scale convergence plot in Figure 2d further illustrates the characteristic rapid error decay before Tmax, corroborating the predefined-time stability guarantee of Theorem 1. The quantitative comparison is summarized in Table 3: the AC-PT controller achieves 96.9% reduction in steady-state RMSE and 98.3% reduction in settling time compared to PID control.

Figure 2.


Tracking performance comparison: (a) Position tracking showing both controllers following the reference trajectory; (b) Tracking error z1 with ±0.01 rad tolerance band; (c) Steady-state error detail (zoomed view after t>6 s); (d) Error convergence in logarithmic scale showing the convergence rate.

Table 3.

Performance Comparison: AC-PT vs PID Control (Tc=2.0 s). Single-run results with baseline initial condition x1(0)=1.0 rad, x2(0)=−1.0 rad/s.

Performance Metric AC-PT PID Improvement
Total RMSE (rad) 0.0467 0.1333 65.0%
Steady-State RMSE (rad) 0.0014 0.0465 96.9%
Max Steady-State Error (rad) 0.0037 0.1259 97.1%
Settling Time to ±0.01 rad (s) 0.229 13.841 98.3%
Time Within ±0.01 rad (%) 100.0 12.2 -

5.3. Neural Network Learning Process

The online learning behavior of the Actor-Critic neural networks is shown in Figure 3. Both the Actor and Critic weight norms (Figure 3a,b) converge to bounded values and remain stable throughout the simulation, confirming that the predefined-time weight update laws, which incorporate the ‖Ŵ‖^β·Ŵ damping terms, successfully prevent weight divergence. The adaptive parameter θ̂ (Figure 3c) increases during the transient phase to compensate for system uncertainties and then stabilizes as the tracking error diminishes. Figure 3d shows that both the cost-to-go estimate Î and the instantaneous cost ϕ decrease rapidly during the initial phase, indicating that the Actor-Critic framework effectively optimizes the control policy while compensating for the unknown system dynamics.
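The boundedness mechanism can be illustrated with a minimal sketch. This is not the paper's exact update law; it is an Euler-discretized gradient-style update in which a fractional-power damping term ‖W‖^β·W (β = 0.6 as in Table 1) plus saturation clipping keeps the weights bounded under persistent excitation.

```python
import numpy as np

rng = np.random.default_rng(1)

def update_weights(W, phi, err, sigma=100.0, beta=0.6, k=1.0,
                   dt=1e-3, W_max=200.0):
    """One Euler step of a gradient-style update with a fractional-power
    damping term ||W||^beta * W; an illustrative sketch, not the paper's
    exact predefined-time update law."""
    dW = -sigma * (phi * err + k * np.linalg.norm(W) ** beta * W)
    return np.clip(W + dt * dW, -W_max, W_max)   # element-wise saturation

# drive the update with random regressors and approximation errors:
# the damping term plus saturation keeps every weight bounded
W = np.zeros(100)
for _ in range(5000):
    phi = rng.uniform(0.0, 1.0, size=100)
    W = update_weights(W, phi, rng.normal())
print(f"||W|| = {np.linalg.norm(W):.3f}")
```

Because the damping grows super-linearly in ‖W‖ (exponent 1 + β), large weights are pulled back faster than the excitation can push them out, which is the intuition behind the predefined-time convergence of the weight errors.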

Figure 3.


Neural network learning process: (a) Actor network weight norm W^a; (b) Critic network weight norm W^c; (c) Adaptive parameter θ^; (d) Cost-to-go estimation I^ and instantaneous cost ϕ.

5.4. Effect of Predefined Time Parameter

The influence of the predefined time parameter Tc on control performance is investigated through simulations with Tc{1.5,2.0,3.0} s, as shown in Figure 4. Smaller Tc values lead to faster error convergence (Figure 4a), with the system converging before Tmax=2.12 s for Tc=1.5 s. However, this faster convergence comes at the cost of larger initial control effort (Figure 4b), presenting a trade-off between convergence speed and actuator requirements. Figure 4c demonstrates that all tested Tc values achieve comparable steady-state accuracy, indicating that Tc primarily governs the transient response rather than the ultimate tracking precision. The Lyapunov function evolution in Figure 4d confirms that V(t) decreases below its corresponding Tmax bound in all cases, thereby validating the predefined-time stability theory of Theorem 1 across different parameter settings.
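The settling-time bounds quoted in this and the previous sections follow a single relation, Tmax = √2·Tc, which reproduces both reported values (2.12 s for Tc = 1.5 s and 2.83 s for Tc = 2.0 s):

```python
import math

def t_max(Tc):
    """Settling-time upper bound Tmax = sqrt(2) * Tc, matching the values
    reported in the simulations (2.12 s and 2.83 s)."""
    return math.sqrt(2.0) * Tc

for Tc in (1.5, 2.0, 3.0):
    print(f"Tc = {Tc:.1f} s  ->  Tmax = {t_max(Tc):.2f} s")
```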

Figure 4.


Comparison of different predefined time parameters: (a) Tracking error for different Tc; (b) Control input comparison; (c) Steady-state error comparison; (d) Lyapunov function evolution.

5.5. Comparison with State-of-the-Art Methods

To further substantiate the contributions, the proposed AC-PT controller is compared with two representative methods from the literature: a disturbance-observer-based fixed-time sliding mode controller (FxT-SMC) based on [26], and a predefined-time robust controller without neural networks (PT-Robust) based on [24]. The tracking error comparison in Figure 5a shows that all three advanced controllers significantly outperform PID, with AC-PT and PT-Robust achieving comparable transient performance. The steady-state error detail in Figure 5c reveals that AC-PT achieves the smallest residual error among all methods. The quantitative results demonstrate that the proposed AC-PT method provides competitive convergence speed while offering two key advantages: online learning capability for unknown dynamics compensation (absent in PT-Robust) and an explicit, user-tunable settling time bound TP<√2·Tc (which FxT-SMC cannot directly prescribe).

Figure 5.


Comparison of multiple methods: (a) Tracking error comparison; (b) Control input comparison; (c) Steady-state error comparison; (d) SS RMSE Comparison.

5.6. Robustness Evaluation

To evaluate the robustness of the proposed AC-PT controller under model uncertainties, we conduct simulations under three categories of perturbations: (i) mass uncertainty (m varied by ±30%), (ii) friction coefficient uncertainty (c0 varied by ±50%), and (iii) increased external disturbance (amplitude scaled to 2×, 3×, and 5× nominal). All tests use the baseline initial condition x1(0)=1.0 rad, x2(0)=−1.0 rad/s with all controller parameters unchanged from Table 1.
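A perturbation sweep of this kind can be reproduced in outline with the single-link model implied by Table 1, m·l²·q̈ = u − c0·q̇ − m·g·l·sin(q) + d(t). The controller below is a simple computed-torque stand-in using nominal parameters, not the paper's AC-PT law; it only illustrates the sweep structure and that tracking remains bounded under the listed perturbations.

```python
import numpy as np

def simulate(m=1.0, c0=1.0, dist_scale=1.0, T=20.0, dt=1e-3):
    """Single-link arm  m*l^2*q'' = u - c0*q' - m*g*l*sin(q) + d(t),
    tracking yd = 0.5*sin(t) with a computed-torque stand-in controller
    built from NOMINAL parameters (m0 = 1.0, c0_0 = 1.0)."""
    l, g = 0.5, 9.8
    q, dq = 1.0, -1.0                         # baseline initial condition
    for k in range(int(T / dt)):
        t = k * dt
        yd, dyd, ddyd = 0.5*np.sin(t), 0.5*np.cos(t), -0.5*np.sin(t)
        e, de = q - yd, dq - dyd
        # stand-in control: nominal inertia/friction/gravity compensation
        u = 1.0*l**2*(ddyd - 100.0*e - 20.0*de) + 1.0*dq + 1.0*g*l*np.sin(q)
        d = dist_scale * (0.5*np.sin(t) + 0.3*np.cos(2.0*t))
        ddq = (u - c0*dq - m*g*l*np.sin(q) + d) / (m * l**2)
        q, dq = q + dt*dq, dq + dt*ddq        # explicit Euler step
    return abs(q - 0.5*np.sin(T))             # final tracking error

# sweep the mass over the +/-30% range used in the robustness study
for m_pert in (0.7, 1.0, 1.3):
    print(f"m = {m_pert}: final |e| = {simulate(m=m_pert):.4f}")
```

Since the stand-in controller compensates with nominal parameters only, its residual error grows with the mismatch, whereas the AC-PT results in Table 4 stay nearly invariant because the Actor network and θ̂ adapt online.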

The results are summarized in Table 4. The AC-PT controller satisfies the predefined-time guarantee TP<Tmax=2.83 s in all tested scenarios without any parameter re-tuning. The settling time remains within the narrow range [0.207,0.212] s, and the steady-state RMSE is maintained at approximately 0.0015 rad across all cases.

Table 4.

Robustness Evaluation under Parameter Perturbations (Tc=2.0 s, Tmax=2.83 s).

Scenario Total RMSE SS RMSE Settling Time TP<Tmax
(rad) (rad) (s) Satisfied
Nominal (m=1.0, c0=1.0) 0.0450 0.0015 0.209 Yes
Mass Uncertainty
   m=0.7 kg (−30%) 0.0429 0.0015 0.207 Yes
   m=1.3 kg (+30%) 0.0469 0.0016 0.212 Yes
Friction Uncertainty
   c0=0.5 (−50%) 0.0448 0.0015 0.208 Yes
   c0=1.5 (+50%) 0.0453 0.0015 0.210 Yes
Increased Disturbance
   2× disturbance 0.0450 0.0015 0.209 Yes
   3× disturbance 0.0450 0.0015 0.209 Yes
   5× disturbance 0.0450 0.0015 0.209 Yes

This strong invariance is theoretically grounded: the predefined-time convergence rate in Lemma 4 depends on the control gains K1,pt, K2,pt and the design parameter Tc, which are independent of the physical parameters. The adaptive parameter θ^ and the Actor neural network compensate for the parametric variations and disturbance changes online, as predicted by Theorem 1. The representative tracking error trajectories in Figure 6 confirm that the convergence behavior is qualitatively preserved under all perturbation conditions.

Figure 6.


Robustness evaluation: tracking error z1 under parameter perturbations. (a) Nominal parameters; (b) Mass increased by 30% (m=1.3 kg); (c) Disturbance amplitude tripled (3×d(t)); (d) Disturbance amplitude quintupled (5×d(t)). The dashed vertical line indicates Tmax=2.83 s. The green band denotes the ±0.01 rad tolerance. All scenarios satisfy TP<Tmax.

6. Conclusions

This paper has presented a predefined-time adaptive neural tracking control framework for uncertain single-link manipulator systems, integrating predefined-time stability theory with an Actor-Critic reinforcement learning architecture. The main contribution lies in the synergistic design where the predefined-time convergence mechanism is incorporated into both the control law and the neural network weight update laws, enabling a single parameter Tc to explicitly prescribe the settling time upper bound as TP<√2·Tc, independent of initial conditions and system parameters.

The current work has several limitations that motivate future research. First, the single-link manipulator setting does not capture the coupling effects present in multi-DOF systems; extending the framework to multi-link and redundant manipulators with inter-joint coupling is a natural next step. Second, the current validation is simulation-based; experimental validation on physical robot platforms is essential to assess real-world applicability. Additional future directions include incorporating input saturation constraints and actuator dynamics, and developing event-triggered implementations to reduce computational and communication overhead.

Author Contributions

Conceptualization, Y.Q. and Y.S.; methodology, Y.S.; software, Y.S.; validation, Y.Q., Y.S. and Y.L.; formal analysis, Y.Q. and Y.S.; investigation, J.H.; resources, Y.S.; data curation, Y.L.; writing—original draft preparation, Y.Q. and Y.S.; writing—review and editing, J.H. and Y.S.; visualization, J.H.; supervision, Y.S. and Y.L.; project administration, Y.S.; funding acquisition, Y.Q. and Y.S. All authors have read and agreed to the published version of the manuscript.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

Funding Statement

This work was supported by the Natural Science Foundation of Jiangsu Province, China, No. BK20240771 and the Key Laboratory of AI and Information Processing, Education Department of Guangxi Zhuang Autonomous Region (Hechi University), No. 2024GXZDSY008.

Footnotes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

References

  • 1.Gao H., Yang Y., Liu J., Sun C. Reinforcement Learning-Based Admittance Control for Physical Human–Robot Interaction with Output Constraints. IEEE Trans. Autom. Sci. Eng. 2025;22:16334–16345. doi: 10.1109/TASE.2025.3576586. [DOI] [Google Scholar]
  • 2.Vyas Y.J., van der Wijk V., Cocuzza S. A Review of Mechanical Design Approaches for Balanced Robotic Manipulation. Robotics. 2025;14:151. doi: 10.3390/robotics14110151. [DOI] [Google Scholar]
  • 3.Zhang D., Hu J., Cheng J., Wu Z.G., Yan H. A Novel Disturbance Observer Based Fixed-Time Sliding Mode Control for Robotic Manipulators with Global Fast Convergence. IEEE/CAA J. Autom. Sin. 2024;11:661–672. doi: 10.1109/JAS.2023.123948. [DOI] [Google Scholar]
  • 4.Sun Y., Yan B., Shi P., Lim C.C. Consensus for Multiagent Systems Under Output Constraints and Unknown Control Directions. IEEE Syst. J. 2024;17:1035–1044. doi: 10.1109/JSYST.2022.3192573. [DOI] [Google Scholar]
  • 5.Liu J., Wang Q.G., Yu J. Event-Triggered Adaptive Neural Network Tracking Control for Uncertain Systems with Unknown Input Saturation Based on Command Filters. IEEE Trans. Neural Netw. Learn. Syst. 2024;35:8702–8707. doi: 10.1109/TNNLS.2022.3224065. [DOI] [PubMed] [Google Scholar]
  • 6.Li W., Zhang Z., Ge S.S. Dynamic Gain Reduced-Order Observer-Based Global Adaptive Neural-Network Tracking Control for Nonlinear Time-Delay Systems. IEEE Trans. Cybern. 2023;53:7105–7114. doi: 10.1109/TCYB.2022.3178385. [DOI] [PubMed] [Google Scholar]
  • 7.Xie X., Chen W., Xia C., Xing J., Chang L. An RBFNN-Based Prescribed Performance Controller for Spacecraft Proximity Operations with Collision Avoidance. Sensors. 2026;26:108. doi: 10.3390/s26010108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Zhang X., Li H., Zhu G., Zhang Y., Wang C., Wang Y., Su C.Y. Finite-Time Adaptive Quantized Control for Quadrotor Aerial Vehicle with Full States Constraints and Validation on QDrone Experimental Platform. Drones. 2024;8:264. doi: 10.3390/drones8060264. [DOI] [Google Scholar]
  • 9.Zhang S., Yang P., Kong L., Li G., He W. A Single Parameter-Based Adaptive Approach to Robotic Manipulators with Finite Time Convergence and Actuator Fault. IEEE Access. 2020;8:15123–15131. doi: 10.1109/ACCESS.2020.2966639. [DOI] [Google Scholar]
  • 10.Li G., Chen X., Yu J., Liu J. Adaptive Neural Network-Based Finite-Time Impedance Control of Constrained Robotic Manipulators with Disturbance Observer. IEEE Trans. Circuits Syst. II Express Briefs. 2022;69:1412–1416. doi: 10.1109/TCSII.2021.3109257. [DOI] [Google Scholar]
  • 11.Jiménez-Rodríguez E., Muñoz-Vázquez A.J., Sánchez-Torres J.D., Defoort M., Loukianov A.G. A Lyapunov-Like Characterization of Predefined-Time Stability. IEEE Trans. Autom. Control. 2020;65:4922–4927. doi: 10.1109/TAC.2020.2967555. [DOI] [Google Scholar]
  • 12.Zhang T., Bai R., Li Y. Practically Predefined-Time Adaptive Fuzzy Quantized Control for Nonlinear Stochastic Systems with Actuator Dead Zone. IEEE Trans. Fuzzy Syst. 2023;31:1240–1253. doi: 10.1109/TFUZZ.2022.3197970. [DOI] [Google Scholar]
  • 13.Liu B., Wang W., Li Y., Yi Y., Xie G. Adaptive Quantized Predefined-Time Backstepping Control for Nonlinear Strict-Feedback Systems. IEEE Trans. Circuits Syst. II Express Briefs. 2022;69:3859–3863. doi: 10.1109/TCSII.2022.3175739. [DOI] [Google Scholar]
  • 14.Xie S., Chen Q. Adaptive Nonsingular Predefined-Time Control for Attitude Stabilization of Rigid Spacecrafts. IEEE Trans. Circuits Syst. II Express Briefs. 2022;69:189–193. doi: 10.1109/TCSII.2021.3078708. [DOI] [Google Scholar]
  • 15.Fan Y., Yang C., Zhan H., Li Y. Neuro-Adaptive-Based Predefined-Time Smooth Control for Manipulators with Disturbance. IEEE Trans. Syst. Man Cybern. Syst. 2024;54:4605–4616. doi: 10.1109/TSMC.2024.3382748. [DOI] [Google Scholar]
  • 16.Lewis F.L., Vrabie D., Vamvoudakis K.G. Reinforcement Learning and Feedback Control: Using Natural Decision Methods to Design Optimal Adaptive Controllers. IEEE Control Syst. Mag. 2012;32:76–105. [Google Scholar]
  • 17.Ouyang Y., He W., Li X. Reinforcement learning control of a single-link flexible robotic manipulator. IET Control Theory Appl. 2017;11:1426–1433. doi: 10.1049/iet-cta.2016.1540. [DOI] [Google Scholar]
  • 18.Vamvoudakis K.G., Lewis F.L. Online actor–critic algorithm to solve the continuous-time infinite horizon optimal control problem. Automatica. 2010;46:878–888. doi: 10.1016/j.automatica.2010.02.018. [DOI] [Google Scholar]
  • 19.Guan X., Li Y.X., Hou Z., Ahn C.K. Reinforcement Learning-Based Event-Triggered Adaptive Fixed-Time Optimal Formation Control of Multiple QAAVs. IEEE Trans. Aerosp. Electron. Syst. 2025;61:11849–11864. doi: 10.1109/TAES.2025.3569643. [DOI] [Google Scholar]
  • 20.Liu Y.J., Li S., Tong S., Chen C.L.P. Adaptive Reinforcement Learning Control Based on Neural Approximation for Nonlinear Discrete-Time Systems with Unknown Nonaffine Dead-Zone Input. IEEE Trans. Neural Netw. Learn. Syst. 2019;30:295–305. doi: 10.1109/TNNLS.2018.2844165. [DOI] [PubMed] [Google Scholar]
  • 21.Zhang Y., Liang X., Li D., Ge S.S., Gao B., Chen H., Lee T.H. Reinforcement Learning-Based Time-Synchronized Optimized Control for Affine Systems. IEEE Trans. Artif. Intell. 2024;5:5216–5231. doi: 10.1109/TAI.2024.3420261. [DOI] [Google Scholar]
  • 22.Sun Y., Shi P., Lim C.C. Event-triggered adaptive leaderless consensus control for nonlinear multi-agent systems with unknown backlash-like hysteresis. Int. J. Robust Nonlinear Control. 2021;31:7409–7424. doi: 10.1002/rnc.5692. [DOI] [Google Scholar]
  • 23.Hu G., Xu D., Hua W., Jiang B., Shi P., Rudas I.J. Fixed-Time Cooperative Sliding Mode Control for Synchronization of Multilinear Motor Systems. IEEE/ASME Trans. Mechatronics. 2025;31:173–184. doi: 10.1109/TMECH.2025.3585574. [DOI] [Google Scholar]
  • 24.Muñoz-Vázquez A.J., Sánchez-Torres J.D., Jiménez-Rodríguez E., Loukianov A.G. Predefined-time robust stabilization of robotic manipulators. IEEE/ASME Trans. Mechatronics. 2019;24:1033–1040. doi: 10.1109/TMECH.2019.2906289. [DOI] [Google Scholar]
  • 25.Sun Y., Shi P., Lim C.C. Adaptive consensus control for output-constrained nonlinear multi-agent systems with actuator faults. J. Frankl. Inst. 2022;359:4216–4232. doi: 10.1016/j.jfranklin.2022.03.025. [DOI] [Google Scholar]
  • 26.Zhang L., Su Y., Wang Z., Wang H. Fixed-time terminal sliding mode control for uncertain robot manipulators. ISA Trans. 2024;144:364–373. doi: 10.1016/j.isatra.2023.10.011. [DOI] [PubMed] [Google Scholar]
