(A) Timeline of a trial in Experiments 1 and 2. On each trial, the subject chose between two objects (colored shapes) and received reward feedback (reward or no reward) on the chosen object. The insets show the set of all features (C, color; S, shape) and objects used in Experiments 1 and 2. (B) Example sets of reward probabilities (indicated on each object) assigned to the four objects in Experiments 1 and 2. The reward probabilities assigned to the four objects changed every 48 trials without any cue to the subject. In the generalizable environment (Experiment 1), the reward probabilities assigned to objects were predicted by their feature values. In the non-generalizable environment (Experiment 2), there was no generalizable relationship between the reward values of individual objects and their features. (C) Reward probabilities were assigned to nine or sixteen possible objects in Experiments 3 and 4, respectively. Objects were defined by combinations of two features (S, shape; P, pattern), each of which could take one of three values in Experiment 3 or one of four values in Experiment 4. The inset shows the set of shapes (top row) and patterns (bottom row) used to construct the objects. Reward probabilities were assigned such that no generalizable rule predicted the reward probabilities of all objects from their feature values. In addition to choice trials similar to those in Experiments 1 and 2, Experiments 3 and 4 included estimation trials in which the subject estimated the probability of reward for an individual object by pressing one of ten keys on the keyboard.