Abstract
State representation is fundamental to behavior. However, identifying the true state of the world is challenging when explicit cues are ambiguous. Here, Bradfield and colleagues show that the medial OFC is critical for using associative information to discriminate ambiguous states.
Some decisions are easy: you go at a green light, you stop at red. Those two states of the world are clearly different, signaling different appropriate behaviors. However, sometimes you stop even at a green light: for instance, when you are turning left and first need to give right-of-way to oncoming traffic. Here, the state of “green light and I intend to go left” is different from “green light and I intend to go straight,” despite the two states being perceptually identical. Appropriate state representation is fundamental to behavioral flexibility: by abstracting away superfluous information (whether the oncoming car is black or gray, whether pedestrians are crossing the street on your right) and adding in important unobservable information (your intention, your past actions, your knowledge of traffic rules), the brain can craft a “task state” that is ideal for rapid, correct, and generalizable decision making.
However, identifying the “true” state can be particularly challenging when explicit cues are ambiguous and the candidate states imply contradictory rules. For example, a soldier returning from deployment must be able to categorize the wartime setting differently from similar civilian contexts to avoid inappropriate behavioral responses. While there are often explicit or observable cues that distinguish these states, this is not always the case; given enough abstraction, the distinction between Baghdad and Baltimore may become difficult, and largely a matter of internal belief. It is critical that the neural representation of this belief be able to bridge the gaps between observable distinguishing events, as dysfunction in this process, even if very brief, could contribute to phenomena such as post-traumatic stress disorder (PTSD).
Given the importance of task state information for decision making and learning, and for disturbances thereof, it is of interest to identify the neural substrates mediating the ability to recognize, maintain, and deploy state representations. We have recently proposed that the orbitofrontal cortex (OFC) might be one key area involved in this process (Takahashi et al., 2011; Wilson et al., 2014). In particular, we suggested that the OFC is critical for representing and using states that include components that are not externally observable. In the current issue of Neuron, Bradfield et al. (2015) elegantly explore this possibility, focusing specifically on the role of the medial OFC in supporting goal-directed behavior that depends on a “forward search” over potential upcoming (and currently unobservable) task states. In a series of experiments, they test whether the medial OFC contributes to the ability to predict that a particular state is forthcoming when the critical defining information—a predicted food outcome—is not available in the physical environment but can only be inferred from learned associations.
For example, in one experiment two cues predicted delivery of two different foods to rats. The rats also learned that two distinct actions (pressing a left or a right lever) could, with some probability, produce these foods. Once each of these associations had been trained independently, rats were given a choice test in which both levers were available. When one of the cues was turned on, the rats increased their press rate on the lever that had produced the food predicted by that cue, as if anticipation of the food increased the value of the action associated with that outcome. This occurred even though food was not actually present at any point in the test. This bias in action selection is consistent with an association between the actions and specific future outcome states and with the operation of two internal states or beliefs that each food is more likely available under certain circumstances: the presence of the cue that previously signaled its delivery. Rats with lesions of the medial OFC, in contrast, failed to show this selective bias, increasing instead their press rates on both levers, regardless of which cue was present. In itself, this result is amenable to a number of interpretations: perhaps the medial OFC is necessary for rats to learn to associate each lever with a specific food outcome (Klein-Flügge et al., 2013), or for associations between cues and outcomes to affect instrumental actions (Colwill and Rescorla, 1990), or for rats to be able to internally simulate the currently unobservable forward-consequences of their actions (Wilson et al., 2014; Doll et al., 2015). Using a series of follow-up experiments, including manipulations that disable the medial OFC only temporarily at test (after allowing for normal learning), Bradfield et al. (2015) mount a convincing case for the third interpretation: if outcomes are available in the environment, animals can plan actions appropriately even without the medial OFC, but when outcomes are absent and appropriate behavior requires internal state information, the medial OFC is key.
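The logic of the choice test described above can be made concrete with a toy sketch. This is an illustration only, not the authors' analysis or model: the food labels, the baseline and boost values, and the function `press_rates` are all hypothetical, and the lesion is caricatured as a cue-driven increase in responding that is no longer directed by the inferred outcome.

```python
# Toy illustration of the cue-guided choice test. All names and numbers are
# hypothetical; none are taken from Bradfield et al. (2015).

CUE_TO_FOOD = {"S1": "food_A", "S2": "food_B"}        # cue -> food training
LEVER_TO_FOOD = {"left": "food_A", "right": "food_B"}  # lever -> food training

BASELINE_RATE = 5.0  # presses per minute with no cue present (assumed)
CUE_BOOST = 4.0      # extra presses per minute while a cue is on (assumed)


def press_rates(cue, can_infer_outcome):
    """Return presses per minute on each lever while `cue` is on.

    can_infer_outcome=True:  the cue boosts only the lever whose learned
                             outcome matches the food the cue predicts.
    can_infer_outcome=False: the cue still raises responding, but the boost
                             is applied to both levers alike.
    """
    predicted_food = CUE_TO_FOOD[cue]
    rates = {}
    for lever, food in LEVER_TO_FOOD.items():
        if can_infer_outcome:
            boost = CUE_BOOST if food == predicted_food else 0.0
        else:
            boost = CUE_BOOST
        rates[lever] = BASELINE_RATE + boost
    return rates


intact = press_rates("S1", can_infer_outcome=True)
lesioned = press_rates("S1", can_infer_outcome=False)
print("intact:  ", intact)
print("lesioned:", lesioned)
```

In this caricature the intact pattern is a selective increase on the lever that shares an outcome with the cue, whereas the lesioned pattern is an equal increase on both levers. The sketch does not distinguish among the three interpretations listed above; that required the follow-up experiments.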
In a final experiment consisting of creative animal-theory acrobatics, Bradfield et al. (2015) set out to show that the medial OFC has a specific role in contributing to the use of unobservable information to generate task states. In this experiment, animals were trained that each of two cues predicts a specific food reward, and that each of two actions predicts one of the cues. As in typical conditioned reinforcement experiments, when actions were performed, the cues appeared but no food reward was given. One consequence of this setup is that animals can learn that action A1 is inhibitory: it predicts the absence of an otherwise expected outcome. In a later stage, the action A1 led to the same cue S1, but this time with reward available. If the medial OFC is necessary only for inferring states on the basis of unobservable information, rats with medial OFC lesions should still be able to learn the associations inherent in the rewarded state but show a selective deficit in forming inhibitory associations in the unrewarded state. To test this prediction, in a test phase, the authors gave rats a choice between the two actions A1 and A2, this time with both actions leading to cue S1 (and no reward). The absence of reward was intended to invoke the inhibitory association: in this context, the A1-S1 combination should be inhibitory and thus undesirable. Rats should therefore prefer to perform A2, as it leads to S1, which may still conceivably lead to reward. However, if medial OFC lesions render rats unable to learn the inhibitory relationship, lesioned rats should prefer the A1-S1 combination that had been previously rewarded to the unknown A2-S1 combination. This was indeed what the behavioral results showed: intact rats responded by choosing the novel combination, essentially inferring the unavailability of reward for the previously trained action in this state, while rats with medial OFC lesions showed the opposite bias.
In other words, rats with medial OFC lesions actively pursued the action that they should have known was counterproductive to acquiring food, had they been able to use the unobservable outcome to infer the unrewarded state. The authors argue that this experiment clearly showed that rats without medial OFC function are not necessarily impaired at forming states per se; rather, they were selectively unable to form a novel state on the basis of unobservable information.
The intricacy of the studies conducted by Bradfield et al. (2015) presents us with formidable evidence that the medial OFC contributes to the ability to use inferred but unobservable outcomes to generate internal states. But a state space theoretically includes all aspects of the current environment, not just information about outcomes or consequences. The importance of this broader conceptualization is evident in past work from this lab (Bradfield et al., 2013). In that study, they reported that silencing of cholinergic interneurons in the dorsomedial striatum (DMS) caused a selective deficit in the ability of rats to appropriately segregate new learning to a new state of the task. Importantly, although both the current and earlier experiments speak to the importance of state representations to flexible behavior, the impairment found in the earlier study was qualitatively different from the impairment observed in the current experiments. Unlike rats lacking medial OFC, rats lacking cholinergic function in the DMS were not impaired in using unobservable outcomes to influence action selection—in fact they were perfectly able to choose an action on the basis of an inferred outcome. However, these rats were unable to maintain appropriate responding when the contingencies were changed and a new state had to be inferred. Instead, the rats appeared to combine the learning from the two episodes as if they lacked the ability to create different conceptual states entirely (Schoenbaum et al., 2013). While this deficit differs from the effects of medial OFC damage reported here, it is remarkably similar to the effects of damage to the lateral OFC, which are often not evident until contingencies change (Ostlund and Balleine, 2007; Riceberg and Shapiro, 2012; Schoenbaum et al., 2002). Future work may find that all the factors that contribute to a conceptualization of state are not localized within one region of the OFC. 
Rather, the OFC may function as a wider network to promote a high-dimensional state representation capable of tagging distinct associations developed in downstream structures.
What begins to emerge from these recent studies is a complex system capable of placing distinct memories within a broader representation or cognitive map of the world (Tolman, 1948). The importance of these broader maps is a challenge to existing, more simplistic frameworks that are often applied to understand how the brain regulates complex associative behavior. For example, we began by noting the potential importance of state to understanding phenomena like PTSD. Existing therapeutic approaches emphasize extinction of the fear-producing memory. Yet an appreciation of the contribution of prediction errors to segregating learning between different inferred states (Gershman et al., 2010) indicates that for extinction to be effective it must be conducted under circumstances (a state) similar to the original learning; otherwise the relevant associative rules will not be accessed and modified by the extinction training (Gershman et al., 2013). Beyond this, it is worth considering that PTSD may be the result of a very transient failure in the ability to maintain the appropriate state representation. Such a brief failure could lead to the emotional responses that characterize PTSD even in the absence of any underlying abnormalities in the learning and extinction processes. In the end, it may be simpler to strengthen state representations than to extinguish or “erase” the underlying memories. Importantly, the clinical implications of state are not limited to PTSD. Rather, this framework may help to explain the persistent ability of environmental stimuli to control adverse behaviors in many disorders, such as drug addiction or other anxiety disorders. Studies such as Bradfield et al. (2015) demonstrate the importance of using sophisticated learning paradigms to understand these disorders and the neural circuitry that promotes them.
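The argument that extinction must occur in a state similar to the original learning can be illustrated with a minimal sketch. It assumes a simple delta-rule learner whose associations are stored separately per state, and the state on each trial is assigned by hand. This is a deliberate simplification and not the latent-cause model of Gershman et al. (2010), in which state assignment is itself inferred. The learning rate, trial counts, state labels, and function names are hypothetical.

```python
# Toy sketch: delta-rule learning gated by state. Only the association stored
# under the currently active state is updated. All values are hypothetical.

ALPHA = 0.3  # learning rate (assumed)


def train(weights, state, outcome, n_trials):
    """Apply the delta rule to the association stored under `state` only."""
    for _ in range(n_trials):
        prediction_error = outcome - weights[state]
        weights[state] += ALPHA * prediction_error
    return weights


def acquire_then_extinguish(extinction_state):
    """Acquire cue -> aversive outcome in 'original', then extinguish."""
    weights = {"original": 0.0, "new": 0.0}
    train(weights, "original", outcome=1.0, n_trials=20)        # acquisition
    train(weights, extinction_state, outcome=0.0, n_trials=20)  # extinction
    return weights


same_state = acquire_then_extinguish("original")
different_state = acquire_then_extinguish("new")
print("extinction in original state:", same_state)
print("extinction in a new state:   ", different_state)
```

When extinction is assigned to the original state, the learned association decays toward zero. When it is assigned to a new state, the original association is left essentially intact, because the extinction trials never access it.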
References
- Bradfield LA, Bertran-Gonzalez J, Chieng B, Balleine BW. Neuron. 2013;79:153–166. doi: 10.1016/j.neuron.2013.04.039.
- Bradfield LA, Dezfouli A, van Holstein M, Chieng B, Balleine BW. Neuron. 2015;88:1268–1280. doi: 10.1016/j.neuron.2015.10.044. this issue.
- Colwill RM, Rescorla RA. J Exp Psychol Anim Behav Process. 1990;16:40–47.
- Doll BD, Duncan KD, Simon DA, Shohamy D, Daw ND. Nat Neurosci. 2015;18:767–772. doi: 10.1038/nn.3981.
- Gershman SJ, Blei DM, Niv Y. Psychol Rev. 2010;117:197–209. doi: 10.1037/a0017808.
- Gershman SJ, Jones CE, Norman KA, Monfils MH, Niv Y. Front Behav Neurosci. 2013;7:164–170. doi: 10.3389/fnbeh.2013.00164.
- Klein-Flügge MC, Barron HC, Brodersen KH, Dolan RJ, Behrens TEJ. J Neurosci. 2013;33:3202–3211. doi: 10.1523/JNEUROSCI.2532-12.2013.
- Ostlund SB, Balleine BW. J Neurosci. 2007;27:4819–4825. doi: 10.1523/JNEUROSCI.5443-06.2007.
- Riceberg JS, Shapiro ML. J Neurosci. 2012;32:16402–16409. doi: 10.1523/JNEUROSCI.0776-12.2012.
- Schoenbaum G, Nugent SL, Saddoris MP, Setlow B. Neuroreport. 2002;13:885–890. doi: 10.1097/00001756-200205070-00030.
- Schoenbaum G, Stalnaker TA, Niv Y. Neuron. 2013;79:3–6. doi: 10.1016/j.neuron.2013.06.033.
- Takahashi YK, Roesch MR, Wilson RC, Toreson K, O’Donnell P, Niv Y, Schoenbaum G. Nat Neurosci. 2011;14:1590–1597. doi: 10.1038/nn.2957.
- Tolman EC. Psychol Rev. 1948;55:189–208. doi: 10.1037/h0061626.
- Wilson RC, Takahashi YK, Schoenbaum G, Niv Y. Neuron. 2014;81:267–279. doi: 10.1016/j.neuron.2013.11.005.
