|
Algorithm 1 Model-free self-triggered control based on HRL. |
Initialize
for do
  for do
    Choose and using the actor network with an added noise signal
    Apply during and read , at each sampling time unit
    Add (,,,) to the experience replay buffer
    if then
      for data in buffer do
        Compute
        Update the critic network parameters W using the Adam optimizer
        Compute the performance evaluation metrics J and
        Update the actor network parameters and using Eq. (24)
      end for
    end if
  end for
end for
|
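The data-collection part of Algorithm 1 can be sketched in Python as follows. This is only an illustrative skeleton under several assumptions that are not taken from the paper: the scalar plant `plant_step`, the quadratic reward, the hand-written `noisy_actor` (a stand-in for the actor network), the buffer-size threshold used as the update condition, and the stored tuple layout (state, (input, hold length), accumulated reward, next state) are all hypothetical. The critic and actor updates themselves are not implemented; the optional `update` callback marks where they would go.

```python
import random
from collections import deque


def plant_step(x, u, a=1.05, b=0.5):
    """Hypothetical scalar plant x_{k+1} = a*x + b*u (not from the paper)."""
    return a * x + b * u


def noisy_actor(x, rng, gain=-0.8, max_hold=4, sigma=0.1):
    """Stand-in for the actor network: returns a control input and a hold length."""
    u = gain * x + rng.gauss(0.0, sigma)      # control input plus exploration noise
    hold = 1 if abs(x) > 1.0 else max_hold    # assumed rule: hold longer near the origin
    return u, hold


def collect(episodes=3, steps=10, min_buffer=8, seed=0, update=None):
    """Run the two nested loops and fill the experience replay buffer."""
    rng = random.Random(seed)
    buffer = deque(maxlen=1000)
    n_passes = 0
    for _ in range(episodes):
        x = rng.uniform(-2.0, 2.0)            # assumed random initial state
        for _ in range(steps):
            u, hold = noisy_actor(x, rng)
            x0, reward = x, 0.0
            for _ in range(hold):             # input is held; state is read every sampling unit
                x = plant_step(x, u)
                reward -= x * x + 0.1 * u * u
            buffer.append((x0, (u, hold), reward, x))
            if len(buffer) >= min_buffer:     # assumed update condition
                for data in list(buffer):
                    if update is not None:
                        update(data)          # critic and actor updates would go here
                n_passes += 1
    return buffer, n_passes
```

Calling `collect(update=my_update)` with a hypothetical function `my_update(data)` invokes it once per stored transition on every update pass, which mirrors the inner `for data in buffer` loop.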