2023 Jan 9;23(2):762. doi: 10.3390/s23020762
Algorithm 2 VIB-based meta-reinforcement learning testing algorithm.
  1: Input: $\{T_n\}_{n=1,\dots,N} \sim p(T)$: meta-testing task set; $\theta$: meta-training policy network; $\omega$: meta-training latent space encoder network.
  2: Initializing the trajectory: $e^{T} = \{\}$
  3: for $k = 1, \dots, N$ do
  4:     Latent space inference: $z \sim E_{\omega}(z \mid e^{T})$
  5:     Using current policy $\pi_{\theta}(a \mid s, z)$ to interact with each task and obtain $D_k$
  6:     Sample accumulation: $e^{T} = e^{T} \cup D_k$
  7:     Evaluating empirical discounted return for each task.
  8: end for
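
The control flow of steps 2 to 7 can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: `ToyTask`, `encoder`, `policy`, `discounted_return`, and `meta_test` are hypothetical names, the encoder and policy are hand-written stand-ins for the trained networks $E_{\omega}$ and $\pi_{\theta}$, and the 1-D task, the discount factor of 0.99, and the way the posterior narrows with more context are all assumptions made for the example.

```python
import numpy as np


class ToyTask:
    """Hypothetical 1-D task: reward grows as the action approaches a hidden goal."""

    def __init__(self, goal, horizon=5):
        self.goal = goal
        self.horizon = horizon

    def rollout(self, policy, z):
        """Run one episode; return a list of (state, action, reward) transitions."""
        state, transitions = 0.0, []
        for _ in range(self.horizon):
            action = policy(state, z)
            reward = -abs(action - self.goal)
            transitions.append((state, action, reward))
            state = action  # toy dynamics (assumption): next state is the action
        return transitions


def encoder(context, rng):
    """Stand-in for E_omega(z | e^T): sample z from a Gaussian over the context."""
    if not context:
        return rng.normal(0.0, 1.0)  # empty context: sample from the prior N(0, 1)
    # Crude posterior (assumption): centre on the best-rewarded action seen so far
    # and shrink the standard deviation as the context grows.
    best_action = max(context, key=lambda t: t[2])[1]
    return rng.normal(best_action, 1.0 / np.sqrt(len(context)))


def policy(state, z):
    """Stand-in for pi_theta(a | s, z): act directly on the latent task estimate."""
    return z


def discounted_return(transitions, gamma=0.99):
    """Empirical discounted return of one episode."""
    return sum(gamma ** t * r for t, (_, _, r) in enumerate(transitions))


def meta_test(tasks, num_episodes, seed=0):
    """Follow the loop structure of the testing algorithm on the toy stand-ins."""
    rng = np.random.default_rng(seed)
    contexts = [[] for _ in tasks]                 # step 2: e^T = {} for each task
    returns = np.zeros((len(tasks), num_episodes))
    for k in range(num_episodes):                  # step 3: for k = 1, ..., N
        for i, task in enumerate(tasks):
            z = encoder(contexts[i], rng)          # step 4: z ~ E_omega(z | e^T)
            d_k = task.rollout(policy, z)          # step 5: collect D_k
            contexts[i].extend(d_k)                # step 6: e^T = e^T ∪ D_k
            returns[i, k] = discounted_return(d_k) # step 7: evaluate the return
    return returns


if __name__ == "__main__":
    results = meta_test([ToyTask(1.0), ToyTask(-2.0)], num_episodes=4)
    print(results)  # one row per task, one column per adaptation episode
```

Each task keeps its own context, so the latent variable for a task is inferred only from transitions collected on that task. No parameters are updated in the loop; adaptation happens solely through the growing context passed to the encoder.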