|
Algorithm 1: MADDPG with the ε-greedy policy for AGVs. |
| for episode = 1 to max-episode do
|
| Initialization of the parameters |
| for t = 1 to M do
|
| for agent i = 1 to N do
|
| n = random number |
| if n < ε then
|
| execute a random action a |
| else
|
| execute the action that maximizes the Q-value
|
| end if
|
|
|
|
|
|
|
| end for
|
| for agent i = 1 to N do
|
|
|
|
|
|
|
| end for
|
| for agent i = 1 to N do
|
|
|
| end for
|
| end for
|
| end for
|
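
The ε-greedy branch in the inner agent loop of Algorithm 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a discrete action set, assumes each agent exposes one Q-value estimate per action, draws the random number n uniformly from [0, 1), and explores when n < ε. The names `epsilon_greedy_action` and `select_joint_action` are hypothetical.

```python
import random
from typing import List, Sequence


def epsilon_greedy_action(
    q_values: Sequence[float],
    epsilon: float,
    rng: random.Random,
) -> int:
    """Pick an action index for one agent.

    Assumption: actions are discrete and q_values[a] is the agent's
    current Q-value estimate for action a.
    """
    n = rng.random()  # "n = random number", assumed uniform in [0, 1)
    if n < epsilon:
        # Exploration: execute any (uniformly random) action.
        return rng.randrange(len(q_values))
    # Exploitation: execute the action with the largest Q-value.
    # Ties resolve to the lowest action index.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def select_joint_action(
    per_agent_q: Sequence[Sequence[float]],
    epsilon: float,
    rng: random.Random,
) -> List[int]:
    """Apply the epsilon-greedy rule independently to each of the N agents."""
    return [epsilon_greedy_action(q, epsilon, rng) for q in per_agent_q]


if __name__ == "__main__":
    rng = random.Random(0)
    # Two hypothetical agents (AGVs), three actions each.
    q_table = [[0.1, 0.9, 0.3], [0.5, 0.2, 0.7]]
    print(select_joint_action(q_table, epsilon=0.1, rng=rng))
```

With ε = 0 the rule is purely greedy, and with ε = 1 every action is drawn at random. Implementations often decay ε over episodes, but Algorithm 1 as shown here does not specify a schedule.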