Figure 3.
Experimental settings for our MoP-SAN in the learning and generalization phases (zero-shot collaboration). Agents A-E correspond to agents trained with different random seeds, named A through E. (A) In the learning phase, the ego agent and a specific partner agent collaborate on the task as a pair and are trained by iterative optimization. The ego agent and partner agent in a pair share the same name, yielding five agent pairs in the learning phase: (A, A), (B, B), (C, C), (D, D), and (E, E). (B) In the generalization phase, the ego agent must collaborate with all unseen partner agents in a zero-shot manner. For example, ego agent A cooperates with each unseen partner agent with a different name (B, C, D, or E) in the zero-shot collaboration test.
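The pairing scheme in the caption can be sketched concretely. A minimal Python illustration, assuming only the agent names A-E from the figure (the variable names here are hypothetical, not from the paper's code):

```python
# Enumerate the agent pairs for the two phases described in the caption.
from itertools import product

AGENTS = ["A", "B", "C", "D", "E"]

# Learning phase: each ego agent is paired with the partner of the same name.
train_pairs = [(a, a) for a in AGENTS]

# Generalization phase: each ego agent is evaluated zero-shot with every
# unseen partner, i.e. every partner with a different name.
test_pairs = [(ego, partner)
              for ego, partner in product(AGENTS, AGENTS)
              if ego != partner]

print(train_pairs)      # 5 same-name pairs
print(len(test_pairs))  # 20 zero-shot cross pairs
```

This makes the split explicit: five same-name pairs are seen during training, and the 20 remaining cross pairs appear only in the zero-shot evaluation.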
