Sensors. 2021 May 13;21(10):3409. doi: 10.3390/s21103409
Algorithm 1. Hybrid imitation learning framework.
Function HIL(D, ρ)
/* the demo dataset D = {(s_{k,1}^E, a_{k,1}^E), …, (s_{k,l}^E, a_{k,l}^E) | k = 1, …, n},
the state recovery probability ρ */
Initialize the policy network π_θ and the dynamics network f_ϕ
f_ϕ = Pretrain_Dynamics_Network(f_ϕ, D, E)
for e = 0, …, E epochs do
  for i = 0, …, |D| do
    Sample d = (s_1^E, a_1^E), (s_2^E, a_2^E), …, (s_l^E, a_l^E) from D
    L_BC, d_conv = Behavior_Cloning(π_θ, d)
    L_SC = State_Cloning(π_θ, f_ϕ, d, ρ)
    L_mix = Loss_Mixing(L_BC, L_SC, d_conv)
    Update the policy network parameters θ by gradient descent:
      θ ← θ − ∇_θ L_mix
  end for
end for
return πθ
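The control flow of Algorithm 1 can be sketched in Python. This is a toy illustration only: the linear policy, the mean-squared behavior-cloning loss, the learning rate, and all helper names are assumptions, not the paper's models. The State_Cloning and Loss_Mixing steps are indicated by comments, since their definitions are not given in this excerpt, so L_mix here reduces to L_BC.

```python
import numpy as np

# Hypothetical toy setting: a linear policy pi(s) = s @ theta stands in
# for the paper's policy network pi_theta.
STATE_DIM, ACTION_DIM = 4, 2

def behavior_cloning_loss(theta, states, actions):
    """L_BC: mean squared error between policy actions and expert actions
    (an assumed loss form; the paper's exact L_BC is not shown here)."""
    pred = states @ theta
    return np.mean((pred - actions) ** 2)

def bc_gradient(theta, states, actions):
    """Gradient of the MSE behavior-cloning loss w.r.t. theta."""
    pred = states @ theta
    return 2.0 * states.T @ (pred - actions) / len(states)

def hil(demos, rho, epochs=50, lr=0.1):
    """Sketch of Algorithm 1's double loop. `rho` mirrors the paper's
    state recovery probability but is unused here because the
    state-cloning branch is only stubbed."""
    theta = np.zeros((STATE_DIM, ACTION_DIM))
    for _ in range(epochs):                      # outer loop over E epochs
        for states, actions in demos:            # sample trajectory d from D
            grad = bc_gradient(theta, states, actions)   # from L_BC
            # State_Cloning (L_SC) would roll states through the pretrained
            # dynamics model f_phi with recovery probability rho, and
            # Loss_Mixing would then combine L_BC and L_SC via d_conv;
            # both are omitted in this self-contained toy sketch.
            theta = theta - lr * grad            # theta <- theta - grad L_mix
    return theta
```

Training on a few trajectories generated by a linear expert drives the behavior-cloning loss toward zero, which is enough to see the loop's shape without reproducing the paper's hybrid losses.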