2024 Jun 8;9(6):346. doi: 10.3390/biomimetics9060346
Algorithm 1 Modified HER Algorithm for Retaining Experience with Adjusted Rewards
Initialize environment and agent
state_history ← []
fallen_index ← −1
while task not finished do
    action ← agent selects action(current_state)
    next_state, reward, done, info ← environment executes action(action)
    Append current_state to state_history
    if info[fallen] = True then
        done ← True
        fallen_index ← length(state_history)
    end if
    current_state ← next_state
end while
if fallen_index ≥ 20 then
    successful_states ← state_history[0 : fallen_index − 20]
    last_actions ← state_history[fallen_index − 20 : fallen_index]
    Add a reward of 10 to the reward of successful_states
    Process successful_states for further learning as successful attempts
    Store last_actions in experience pool without additional reward
end if
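The relabeling step of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. The names `Transition`, `ToyEnv`, `run_episode`, and `relabel_episode` are hypothetical. The sketch assumes the environment returns the four-tuple `(next_state, reward, done, info)` used in the pseudocode, and it stores full transitions rather than bare states so that the reward bonus has something to attach to. The 20-step window and the bonus of 10 are taken from the algorithm.

```python
from dataclasses import dataclass, replace
from typing import Any, Callable, List, Tuple

TAIL_LENGTH = 20       # steps just before the fall, stored without bonus (Algorithm 1)
SUCCESS_BONUS = 10.0   # reward added to the earlier, "successful" steps (Algorithm 1)


@dataclass(frozen=True)
class Transition:
    """Hypothetical record for one step of experience."""
    state: Any
    action: Any
    reward: float
    next_state: Any
    done: bool


class ToyEnv:
    """Hypothetical stand-in environment: the robot 'falls' at step fall_step."""

    def __init__(self, fall_step: int) -> None:
        self.fall_step = fall_step
        self.t = 0

    def reset(self) -> int:
        self.t = 0
        return 0

    def step(self, action: Any) -> Tuple[int, float, bool, dict]:
        self.t += 1
        fallen = self.t >= self.fall_step
        return self.t, 1.0, fallen, {"fallen": fallen}


def run_episode(env: ToyEnv, policy: Callable[[Any], Any],
                max_steps: int = 1000) -> Tuple[List[Transition], int]:
    """Roll out one episode; return the history and the fall index (-1 if no fall)."""
    history: List[Transition] = []
    fallen_index = -1
    state = env.reset()
    for _ in range(max_steps):
        action = policy(state)
        next_state, reward, done, info = env.step(action)
        fallen = bool(info.get("fallen", False))
        done = done or fallen  # a fall always ends the episode
        history.append(Transition(state, action, reward, next_state, done))
        if fallen:
            fallen_index = len(history)
        if done:
            break
        state = next_state
    return history, fallen_index


def relabel_episode(history: List[Transition], fallen_index: int,
                    tail_length: int = TAIL_LENGTH,
                    bonus: float = SUCCESS_BONUS
                    ) -> Tuple[List[Transition], List[Transition]]:
    """Split a fallen episode into bonus-rewarded early steps and unmodified final steps."""
    if fallen_index < tail_length:
        return [], []  # no fall, or the fall came too early to relabel
    cut = fallen_index - tail_length
    successful = [replace(t, reward=t.reward + bonus) for t in history[:cut]]
    last_steps = list(history[cut:fallen_index])
    return successful, last_steps


if __name__ == "__main__":
    history, fallen_index = run_episode(ToyEnv(fall_step=50), policy=lambda s: 0)
    successful, last_steps = relabel_episode(history, fallen_index)
    print(len(successful), len(last_steps))
```

Returning new `Transition` objects instead of mutating the history keeps the original rewards available, which may be useful if the same episode is also stored unmodified. Both returned lists would then be pushed into the replay buffer of whichever off-policy learner is in use.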