Belief updates and table configurations for the 1-step (top) and 3-step (bottom) bounded memory models at successive time-steps. (T = 1) After the first disagreement, and in the absence of any previous history, the belief remains uniform over α. The human (white dot) follows their modal policy from the previous time-step, so at T = 2 the belief shifts toward smaller values of α (lower adaptability) in both models. (T = 2) The robot (black dot) adapts to the human and executes the human modal policy. At the same time, the human switches to the robot mode, so at T = 3 the probability mass moves to the right. (T = 3) The human switches back to their initial mode. In the 3-step model, the resulting distribution at T = 4 is positively skewed: the robot estimates the human to be non-adaptable. In the 1-step model, the robot incorrectly infers that the human adapted to the robot mode of the previous time-step, and the distribution is negatively skewed. (T = 4, 5) The robot in the 3-step trial switches to the human modal policy, whereas in the 1-step trial it does not adapt to the human, who insists on their mode.
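
The sequence of belief updates described in this caption can be illustrated with a minimal sketch. It rests on assumptions that are not stated in the caption: α is discretized on a small hypothetical grid (`ALPHAS`), and each observation is scored with likelihood α when the model interprets the human as having adapted to the robot mode and 1 − α otherwise. The function names (`update_belief`, `run`, `mean`) are illustrative only. The two models differ solely in how they interpret the human's switch back at T = 3.

```python
# Hypothetical discretization of the adaptability parameter alpha.
ALPHAS = [0.0, 0.25, 0.5, 0.75, 1.0]


def update_belief(belief, alphas, adapted):
    """One Bayesian update of a discrete belief over alpha.

    Assumed likelihood: alpha if the human is interpreted as having
    adapted to the robot mode, (1 - alpha) otherwise.
    """
    likelihood = [a if adapted else 1.0 - a for a in alphas]
    unnormalized = [b * l for b, l in zip(belief, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def run(observations, alphas):
    """Start from a uniform belief and apply each observation in turn."""
    belief = [1.0 / len(alphas)] * len(alphas)
    for adapted in observations:
        belief = update_belief(belief, alphas, adapted)
    return belief


def mean(belief, alphas):
    """Expected value of alpha under the belief."""
    return sum(b * a for b, a in zip(belief, alphas))


# Observations at T = 1, 2, 3 as each model would interpret them:
# T = 1: human keeps their mode      -> not adapted (both models)
# T = 2: human switches to robot mode -> adapted (both models)
# T = 3: human switches back to their initial mode
#        3-step model: human insists on own mode -> not adapted
#        1-step model: robot executed that mode at T = 2, so the
#        switch is read as adapting to the robot  -> adapted
belief_3_step = run([False, True, False], ALPHAS)
belief_1_step = run([False, True, True], ALPHAS)

print("3-step belief:", belief_3_step, "mean:", mean(belief_3_step, ALPHAS))
print("1-step belief:", belief_1_step, "mean:", mean(belief_1_step, ALPHAS))
```

Under these assumptions, the 3-step belief concentrates on small α (positive skew) and the 1-step belief on large α (negative skew), matching the qualitative behavior described for T = 4.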