2023 Aug 20;7:608–624. doi: 10.1162/opmi_a_00097

Figure 2. 

Model: We model agents with different levels of theory of mind (ToM). All buyers share one value function and all sellers share another; the agents differ only in how they use theory of mind for planning and/or inference. We begin with a simple reinforcement learning buyer (blue), ToM(−1), which makes its first and second choices independently, and a regular inverse reinforcement learning seller (red), ToM(0), which draws inferences about this buyer. The ToM(1) buyer plans through this seller’s inference process to optimize its overall value. In turn, the ToM(2) seller also performs inverse reinforcement learning, but at a higher level, taking the ToM(1) buyer’s planning into account. As we will see, this gives rise to a defensive inference process. The ToM(3) buyer, in turn, plans just like the ToM(1) buyer but does so with the ToM(2) seller’s ‘defensive’ inference in mind.
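
The recursion described in the caption can be sketched in a few lines of Python. This is a minimal toy illustration, not the model used in the paper: the two buyer types, the reward and value numbers, the softmax temperature `BETA`, the pricing rule in `prices`, and all function names are assumptions made up for this example. It shows only the structure: odd-level buyers plan through the seller one level below, and even-level sellers invert the policy of the buyer one level below.

```python
import math

# All constants below are hypothetical toy values, not taken from the paper.
TYPES = ("A", "B")             # which of two items the buyer prefers
PRIOR = {"A": 0.5, "B": 0.5}   # seller's prior over buyer types
BETA = 2.0                     # softmax inverse temperature


def first_step_reward(buyer_type, first_choice):
    # Toy assumption: signalling interest in the preferred item is mildly rewarding.
    return 1.0 if first_choice == buyer_type else 0.0


def item_value(buyer_type, item):
    # Toy assumption: one value function shared by all buyer levels.
    return 10.0 if item == buyer_type else 4.0


def prices(posterior):
    # Toy seller pricing: charge more for the item the buyer is believed to prefer.
    return {item: 2.0 + 6.0 * posterior[item] for item in TYPES}


def softmax(q):
    # Numerically stable softmax over a dict of action values.
    m = max(q.values())
    exps = {a: math.exp(BETA * (v - m)) for a, v in q.items()}
    z = sum(exps.values())
    return {a: e / z for a, e in exps.items()}


def buyer_policy(level, buyer_type):
    """P(first choice | type) for a buyer of level -1, 1, 3, ..."""
    q = {}
    for a in TYPES:
        q[a] = first_step_reward(buyer_type, a)
        if level >= 1:
            # Plan through the seller one level below: anticipate the prices
            # it would set after observing first choice `a`.
            p = prices(seller_posterior(level - 1, a))
            q[a] += max(item_value(buyer_type, i) - p[i] for i in TYPES)
    return softmax(q)


def seller_posterior(level, first_choice):
    """P(type | first choice) for a seller of level 0, 2, ..."""
    # Bayesian inversion of the policy of the buyer one level below.
    joint = {t: PRIOR[t] * buyer_policy(level - 1, t)[first_choice] for t in TYPES}
    z = sum(joint.values())
    return {t: j / z for t, j in joint.items()}


if __name__ == "__main__":
    for level in (-1, 1, 3):
        print("buyer ToM(%d), type A:" % level, buyer_policy(level, "A"))
    for level in (0, 2):
        print("seller ToM(%d), saw A:" % level, seller_posterior(level, "A"))
```

In this toy setting the ToM(−1) buyer ignores the second step entirely, while the ToM(1) buyer trades first-step reward against the prices it expects the ToM(0) seller to set. The ToM(2) seller then reaches a different posterior from the same observation, because it inverts the ToM(1) policy rather than the ToM(−1) policy. The specific probabilities depend entirely on the made-up numbers above.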