Fig. 4.
Per-episode, group-level metrics of agent populations (y axis) over the course of learning (x axis, in time steps). For each condition, we plot the mean trajectory with the standard error of the mean. (A) Number of times unmarked agents (agents that have not broken a taboo) are punished. (B) Number of times marked agents (agents that have broken a taboo) are punished. (C) Fraction of time spent marked after breaking a taboo. (D) Fraction of time agents spent poisoned. (E) Number of “taboo” berries eaten (poisonous and nonpoisonous combined, if available in the condition). (F) Total reward gained by the group (including the costs of punishing). Overall, we observe a benefit of the silly rule condition in the intermediate stages of learning, driven by an increased ability to avoid poisonous berries. We also see a temporal order in the learned behaviors: for example, social punishment first increases and then declines together with the number of taboo berries eaten.