(a) The traditional, unforgetful Chinese restaurant process (CRP) is a nonparametric Bayesian model in which the probability that a new observation belongs to an existing cluster or to a new one is determined by the cluster sizes and the strength parameter α. In the metaphor, a new customer (a new observation; see the terminology in Table 1; shown as black dots) either sits at one of the existing tables (clusters labeled by key press identity, e.g., ‘response to left side of the screen’; shown as colored circles), with probability proportional to the number of customers already sitting there, or opens a new table (shown as an open circle), with probability proportional to α. Here, the most likely next response is of the type pink. (b) The distance-dependent or ‘forgetful’ Chinese restaurant process (ddCRP) is governed by a distance metric, according to the ‘close together sit together’ principle. In our case, the customers are subject to exponential decay with rate λ, as shown in the inset (and illustrated by the grey shades of the customers). Even though the same number of customers sit at the tables as in (a), this time the predictive probability of a yellow response is highest because most of the recent responses were yellow. (c) In the distance-dependent hierarchical Chinese restaurant process (HCRP), restaurants are labeled by the context of some number of preceding events and are organized hierarchically such that restaurants with the longest contexts are on top. Thus, each restaurant models the key press of the participant at time point t, k_t, given a context of n events (e_{t−n}, …, e_{t−1}). A new customer first arrives at the topmost restaurant that corresponds to its context in the data (in the example, the customer is bound to visit the restaurant labeled by the context ‘yellow-blue’ when it arrives at level 2). If it opens a new table, it also backs off to the restaurant corresponding to the context one element shorter (in the example, to the restaurant labeled by the context ‘blue’).
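
The seating probabilities described in panels (a) and (b) can be illustrated with a minimal Python sketch. The function names, the example sequence, and the parameter values are hypothetical. The decay is assumed to take the form exp(−λ·d), where d is the number of trials since a customer arrived; the exact parameterization used in the model may differ. The back-off between restaurants in panel (c) is not covered here.

```python
import math
from collections import defaultdict

NEW_TABLE = "<new table>"  # hypothetical sentinel key for opening a new table


def crp_predictive(labels, alpha):
    """Unforgetful CRP: every past customer counts equally.

    P(existing table k) is proportional to the number of customers at k,
    P(new table) is proportional to the strength parameter alpha.
    """
    weights = defaultdict(float)
    for label in labels:
        weights[label] += 1.0
    total = sum(weights.values()) + alpha
    probs = {label: w / total for label, w in weights.items()}
    probs[NEW_TABLE] = alpha / total
    return probs


def ddcrp_predictive(labels, alpha, lam):
    """Forgetful (distance-dependent) CRP with exponential decay.

    Assumption: a customer seated d trials ago contributes exp(-lam * d)
    to its table instead of 1, so recent customers dominate.
    """
    t_now = len(labels)  # the new customer arrives after all observed ones
    weights = defaultdict(float)
    for t, label in enumerate(labels):
        weights[label] += math.exp(-lam * (t_now - t))
    total = sum(weights.values()) + alpha
    probs = {label: w / total for label, w in weights.items()}
    probs[NEW_TABLE] = alpha / total
    return probs


# Hypothetical response history: pink is more frequent, yellow is more recent.
history = ["pink", "pink", "pink", "pink", "yellow", "yellow", "yellow"]

crp = crp_predictive(history, alpha=0.5)
ddcrp = ddcrp_predictive(history, alpha=0.5, lam=0.5)

print("CRP favours:", max(crp, key=crp.get))
print("ddCRP favours:", max(ddcrp, key=ddcrp.get))
```

With this history, the unforgetful CRP favours pink because it has the most customers, whereas the decayed weights make yellow the most probable next response, mirroring the contrast between panels (a) and (b).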