Data-Driven Self-Triggered Control for Networked Motor Control Systems Using RNNs and Pre-Training: A Hierarchical Reinforcement Learning Framework

2024 Mar 20;24(6):1986. doi: 10.3390/s24061986

Algorithm 1 Model-free self-triggered control based on HRL.

Initialize $W, θ, β$
for $ep = 0, 1, \dots, N$ do
$t \leftarrow 0, x_{0} \leftarrow \mathrm{random}(x)$
for $t \leq t_{\mathrm{final}}$ do
Choose ${\bar{u}}_{l}$ and $τ_{t}$ using the actor network with a noisy signal
Apply ${\bar{u}}_{l}$ during $τ_{t}$ and read $x_{t}$, $r_{t}$ at each sampling time unit $dt$
Add $(x_{t}, {\bar{u}}_{l}, n \cdot dt, r_{t})$ to the experience replay buffer $D$
if buffer size $\geq$ mini-batch size then
for data in buffer do
Compute $L(W) := \frac{1}{n} \sum_{i=1}^{n} \left( G(x_{t_{i}}, {\bar{u}}_{l_{i}}, τ_{t_{i}} \mid W) - \sum_{t = t_{l}}^{t_{l} + τ_{t_{i}}} \left( x_{t_{i}}^{T} Q x_{t_{i}} + {\bar{u}}_{l_{i}}^{T} R {\bar{u}}_{l_{i}} \right) \right)^{2}$
Update the critic network parameters $W$ using the Adam optimizer
Compute the performance evaluation metrics $J$ and $η$
Update the actor network parameters $θ$ and $μ$ using Equation (24)
end for
end if
end for
end for
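
The two numerical steps of Algorithm 1, holding a control input over a triggering interval and forming the critic loss $L(W)$, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `hold_and_accumulate`, `critic_loss`, and `toy_step`, and the toy linear plant, are hypothetical and introduced only for illustration. The sketch assumes the interval is $τ_{t} = n \cdot dt$, that the quadratic stage cost is summed over exactly $n$ sampling instants (whether the endpoint of the sum is included is an assumption), and that the critic outputs $G(\cdot \mid W)$ are already available as numbers.

```python
import numpy as np

def hold_and_accumulate(x0, u, n_steps, step_fn, Q, R):
    """Hold the control u constant for n_steps sampling periods.

    Returns the state reached at the end of the interval and the
    accumulated quadratic cost sum(x^T Q x + u^T R u).
    """
    x = np.asarray(x0, dtype=float)
    u = np.asarray(u, dtype=float)
    cost = 0.0
    for _ in range(n_steps):
        cost += float(x @ Q @ x + u @ R @ u)  # stage cost at this sample
        x = step_fn(x, u)                     # plant advances one period dt
    return x, cost

def critic_loss(predictions, targets):
    """Mean squared error between critic outputs and accumulated costs."""
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return float(np.mean((predictions - targets) ** 2))

# Hypothetical discrete-time plant x_{k+1} = A x_k + B u_k (illustration only).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])

def toy_step(x, u):
    return A @ x + B @ u

Q = np.eye(2)
R = np.eye(1)

# One self-triggered interval: zero input held for n = 3 sampling periods.
x_end, interval_cost = hold_and_accumulate([1.0, 1.0], [0.0], 3, toy_step, Q, R)
```

In a full training loop, `interval_cost` would serve as the regression target for the critic output at $(x_{t}, {\bar{u}}_{l}, τ_{t})$, and `critic_loss` would be evaluated over a mini-batch drawn from the replay buffer $D$ before the optimizer step.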