Author manuscript; available in PMC: 2018 Jun 1.
Published in final edited form as: Behav Neurosci. 2017 Jun;131(3):201–212. doi: 10.1037/bne0000195

Ensembles in medial and lateral orbitofrontal cortex construct cognitive maps emphasizing different features of the behavioral landscape

Nina Lopatina 1,3,*, Brian F Sadacca 1,*, Michael A McDannald 2, Clay V Styer 1, Jacob F Peterson 1, Joseph F Cheer 4, Geoffrey Schoenbaum 1,4,5
PMCID: PMC5445939  NIHMSID: NIHMS856623  PMID: 28541078

Abstract

The orbitofrontal cortex (OFC) has long been implicated in the ability to use the current value of expected outcomes to guide behavior. More recently this specific role has been conceptualized as a special case of a more general function that OFC plays in constructing a “cognitive map” of the behavioral task space by labeling the current task state and learning relationships among task states. Here, we have used single unit recording data from two prior studies to examine whether and how information relating different states within and across trials is represented in medial versus lateral OFC in rats. Using a hierarchical clustering analysis, we examined how neurons from each area represented information about differently valued trial types, defined by the cue-outcome pairings, versus how those same neurons represented information about similar epochs between these different trial types, such as the stimulus sample, delay, and reward consumption epochs. This analysis revealed that ensembles in lOFC group states according to trial epoch, whereas those in mOFC organize the same states by trial type. These results suggest that lOFC and mOFC construct cognitive maps that emphasize different features of the behavioral landscape, with lOFC tracking events based on local similarities, irrespective of their values, and mOFC tracking more distal or higher order relationships relevant to value.

Keywords: Orbitofrontal, electrophysiology, unblocking, dendrogram, hierarchical clustering


The orbitofrontal cortex (OFC) has long been implicated in the ability to use the current value of expected outcomes to guide behavior (Gallagher, McMahan, & Schoenbaum, 1999; Izquierdo & Murray, 2000; Jones et al., 2012; Pickens et al., 2003; Rudebeck, Saunders, Prescott, Chau, & Murray, 2013; West, DesJardin, Gale, & Malkova, 2011). Recently, it has been suggested that this is part of a more general function in which the OFC constructs a cognitive map (Tolman, 1948) of the behavioral task space by labeling the current task state and learning relationships among task states (Wilson, Takahashi, Schoenbaum, & Niv, 2014).

We have recently recorded single unit activity in the lateral and medial subregions of the OFC (lOFC and mOFC, respectively) during Pavlovian unblocking in order to isolate signaling of information about reward value from other reward features. In one study (Lopatina et al., 2015), we compared firing in lOFC neurons to cues that signaled an increase, a decrease, or no change in reward. Despite the linear change in value signaled by the different cues, a change reflected in the rats’ behavior, we failed to find neural correlates that reflected reward value across cues. Instead, we found dissociable populations of lOFC neurons that developed firing to each of the three cues, including the cue that predicted no change in reward. In a second study (Lopatina et al., 2016), we repeated this experiment while recording in the mOFC. Again, the responses we recorded did not correlate with abstract value across cues. Instead, we found that cells developed responses to cues predicting a change, particularly a decrease, in reward value.

Here we return to these two datasets to investigate how mOFC and lOFC distinguish and relate different task states within and across differently valued trial types. We used an unsupervised machine learning algorithm, hierarchical clustering (Farovik et al., 2015; McKenzie et al., 2014), to reveal the structure of task representation in our recorded population responses. This analysis built a hierarchy of clusters from individually defined task states according to the Euclidean distance between these states’ population firing rates in a dimensionally reduced plane. We used this approach to distinguish the relative sensitivity of our recorded populations to our task parameters: the states we had defined by epoch and type. We summarized our results in a dendrogram, a tree diagram showing the Euclidean distances between objects and clusters. Dendrograms of both the pseudo-ensemble population and simultaneously recorded ensembles in lOFC predominantly grouped task states according to their epoch within a trial, even though the states in a given epoch differed in value, while those in mOFC predominantly grouped task states by trial type, an organization which reflected value in our task. Since differing trial types are associated with differently valued outcomes, the similarity in responses within a trial epoch, e.g., between an upshift and a downshift cue, indicates enhanced representation of local events. This local representation is independent of context: thus, the downshift cue signaling a small reward and the upshift cue signaling a large reward are similarly represented. The higher order relationship between these cues and the subsequent outcomes is not reflected in this population. On the other hand, this higher order relationship is represented in mOFC, where population activity is more similar for states that are similar in their associated outcome. This representation links states to past and future events and their relative values.
These results suggest that ensembles in lOFC and mOFC construct a cognitive map of the task space. Further, they indicate that the map in each area is subtly different, with lOFC tracking similarities between local events, even when the value of those events differs, and mOFC tracking more distal or higher order relationships, which are of more relevance to distinguishing value in our task.

Methods

Subjects

Eighteen and 13 male Long-Evans rats were obtained at 200–250 g from Charles River Labs (Wilmington, MA) for the lOFC and mOFC experiments, respectively. Rats were tested at the NIDA-IRP in accordance with NIH guidelines. Procedures were approved in accordance with the NIDA IACUC (ACUC protocol 15-CNRB-108, assurance number A4149-01).

Surgery and Histology

Using aseptic, stereotaxic surgical methods, a drivable bundle of sixteen 25 μm diameter FeNiCr wires (Stablohm 675, California Fine Wire, Grover Beach, CA) was chronically implanted in the left hemisphere in lateral OFC at 3.0 mm anterior to bregma, 3.2 mm lateral, and 3.9 mm ventral to each rat’s brain surface, or in medial OFC at a 13° angle, 4.4 mm anterior to bregma, 1.58 mm lateral, and 2.78 mm ventral to each rat’s brain surface. We implanted these microelectrodes in the medial orbital cortex (MO) since this area has recently been reported to be homologous to medial orbital areas in primates based on a comparison of connectivity with striatum, hippocampus, and amygdala (Heilbronner, Rodriguez-Romaguera, Quirk, Groenewegen, & Haber, in press). These wires were cut at an angle with surgical scissors immediately prior to implantation, to extend ~1.8–2.5 mm beyond the cannula, with a range of ~0.3 mm between wires. Current was passed through each electrode immediately prior to implantation to lower the impedance to ~300–400 kOhms. Rats were anesthetized with isoflurane. Subcutaneous injections of 0.1 mL lidocaine and 0.1 mL carprofen diluted in saline were used for analgesia. At the study’s conclusion, a 15-μA current was passed through each electrode to mark the final position. Following perfusion of the rats, their brains were extracted and processed for histology using standard techniques. While we otherwise used identical procedures in the lOFC and mOFC recording experiments to facilitate comparison between them, there are known experimental differences resulting from recording in two different regions. The implant placements differed in two ways. First, the cannula and wire bundle trajectory through the OFC was vertical in lOFC and diagonal (13°) in mOFC. Second, the connector placement was more anterior in mOFC than in lOFC.

Blocking Task

Recording was conducted in grounded aluminum chambers approximately 18″ on each side with sloping walls narrowing to an area of 12″ x 12″ at the bottom. An odor port was located centrally above a fluid well on a panel in the right wall of each chamber. Above the panel were two lights. To allow rapid delivery of olfactory cues to the odor port, it was connected to an airflow dilution olfactometer. Odors were chosen from compounds obtained from International Flavors and Fragrances (New York, NY). The fluid well was connected to lines controlling the independent delivery of liquid rewards. A computer running a behavioral program written in C++ implemented control of the task. Following implantation with microelectrodes, rats were water deprived by restricting access to 10 minutes daily. Following two days of water deprivation, rats were shaped, in stages, to hold in the odor port for 1s in order to receive a water reward at the well. Each trial started with house light illumination, following which rats had 3 s to enter the odor port. A failure to enter the odor port caused restart of the trial. Rats were required to hold for 1 s in the odor port, and upon exit had 3 s to enter the reward well. Again, failure to hold for 1 s or to make reward well entry within 3 s resulted in restart of the trial. Following shaping, rats were trained until they proficiently responded for the initial odor to receive a medium-sized bolus of diluted chocolate milk solution; this comprised up to 15 sessions, with a maximum of 170 trials per session. Completion of ~150 trials per session was characterized as proficient responding.

Once rats were deemed proficient at initial training and single units were isolated, the unblocking procedure began. On each of the two learning days, rats received four randomly intermixed trial types. The first trial type was a reminder of initial training. The remaining trial types comprised a 200 ms presentation of the initial odor followed by one of three 800 ms, novel, differentiable odors: one signaling the same medium-sized bolus of chocolate milk used in prior training, a second signaling a larger bolus, and a third signaling a smaller bolus. The behavioral requirements for each trial type were exactly as in initial training. Rats completed 20–40 trials with each novel odor per session during unblocking. Then, on the probe test day, rats received 10 reminder trials of each type, followed by up to 10 trials of each novel odor alone without reward, interleaved with rewarded presentations of the initial odor to maintain responding. During the unrewarded, novel-odor extinction trials, the requirements to sample the odor for 1 s and to respond at the reward well were both lifted. This unblocking procedure was repeated two to three times per rat, using a new set of blocked, upshift, and downshift odors each time.

Single-Unit Recording

Neural activity was recorded using four identical Plexon Multichannel Acquisition Processor systems (Dallas, TX), interfaced with the odor discrimination training chambers described above. Following recovery from surgery and proficiency in shaping, electrodes were advanced daily until activity was obtained. Rats received reminder training using the pre-trained initial odor, as described above, during this process. Once rats showed proficient responding and single units were isolated, the rat began unblocking. During this three-day procedure, the electrode was moved ~167 μm between the first and second learning days in approximately ¾ of the lOFC recording group and all of the mOFC recording group. The electrode was advanced again following each three-day unblocking procedure in all rats. This was done between unblocking days and prior to repetition of this process with new odor cues in order to acquire neurons in a new location in OFC.

Statistical Data Analysis

Units were sorted using Offline Sorter software from Plexon Inc. (Dallas, TX) using a k-means algorithm. Sorted files were next processed in Neuroexplorer to extract relevant event markers and unit timestamps. These data were then analyzed in Matlab (Natick, MA). To analyze activity in response to the novel odors, we examined activity between 300–1300 ms subsequent to initial odor onset, which corresponded approximately with the novel odor delivery to the odor port. To analyze activity during the period while rats waited in the well, we examined activity between 0–1000 ms subsequent to reward well entry, which corresponded with the time immediately prior to reward delivery. To analyze activity during the reward period, we examined activity between 1000–2000 ms subsequent to reward well entry, which corresponded with reward delivery and the first second of reward consumption. Analyses excluded the first seven trials, approximating the average of the 10- and 5-trial cut-offs for putative sensory firing on unblocking days 1 and 2, respectively. Firing rates were not normalized prior to z-score normalization within the task states. ANOVA on individual cells was restricted to cells in which there were 20 trials in each condition and included only those trials. ANOVA on Euclidean distance between task states with factors of region, day, and between/within trial epoch was performed on the combined within-epoch distances (4 dendrograms x 3 epochs x 6 pairwise measures) and between-epoch distances (4 dendrograms x 4 trial types x 3 pairwise measures).
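The three analysis windows can be written as spike counts in fixed intervals relative to two event markers. The following is a minimal Python sketch of that bookkeeping, not the Matlab code used in the study; the names `EPOCH_WINDOWS`, `firing_rate`, and `epoch_rates`, the event-marker keys, and the toy spike train are all hypothetical.

```python
from bisect import bisect_left

# Analysis windows from the text, in ms relative to an event marker.
# The dictionary layout and marker names are illustrative assumptions.
EPOCH_WINDOWS = {
    "odor":   ("odor_onset", 300, 1300),   # 300-1300 ms after initial odor onset
    "wait":   ("well_entry", 0, 1000),     # 0-1000 ms after reward well entry
    "reward": ("well_entry", 1000, 2000),  # 1000-2000 ms after reward well entry
}

def firing_rate(spike_times_ms, start_ms, stop_ms):
    """Mean rate (Hz) of sorted spike times in the half-open window [start, stop)."""
    n = bisect_left(spike_times_ms, stop_ms) - bisect_left(spike_times_ms, start_ms)
    return n / ((stop_ms - start_ms) / 1000)

def epoch_rates(spike_times_ms, events_ms):
    """Firing rate per epoch for one trial; events_ms maps marker name to time (ms)."""
    rates = {}
    for epoch, (marker, lo, hi) in EPOCH_WINDOWS.items():
        t0 = events_ms[marker]
        rates[epoch] = firing_rate(spike_times_ms, t0 + lo, t0 + hi)
    return rates

# Toy trial: odor onset at 10 000 ms, well entry at 12 000 ms.
spikes = [10400, 10900, 11200, 12100, 12500, 13200, 13300, 13900]
rates = epoch_rates(spikes, {"odor_onset": 10000, "well_entry": 12000})
print(rates)
```

Averaging these per-trial rates over the retained trials of each trial type would give one mean rate per neuron and task state.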

Hierarchical clustering analysis

Analysis was restricted to cells with a baseline firing rate under 10 Hz. We used a 10 Hz upper cut-off to exclude fast spiking cells, as we had in the original studies. We did so to focus our analysis on cells whose firing rate is consistent with the projection neurons of the OFC, glutamatergic pyramidal cells. This screen excluded 44 cells in lOFC and 4 cells in mOFC. For both the full regional pseudo-ensembles and more restricted ensembles of simultaneously recorded neurons, a matrix of neural responding was then created. Each row of the matrix was one of the twelve trial type/epoch pairs, and each column was the mean responding of one neuron during that period. Cells whose standard deviation in firing between states was in the bottom 5% for that population were removed in order to minimize amplification of noise within the data set. Rows were then normalized by converting each neuron’s firing to a standardized z-score for that row. Hierarchical agglomerative clustering was then performed on the Euclidean distance between rows, each of which represents a unique task state. Distances between each trial type or phase were the Euclidean distance between normalized firing for pairs of trial types or phases as determined by the linkage clustering analysis. Euclidean distances of standardized points (task states) were plotted for each day/brain region. Each group of nodes whose linkage was less than 50% of the maximal distance between groups is assigned a unique color for visualization purposes. A detailed description of this analysis can be found in Farovik et al. (2015) and McKenzie et al. (2014).
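The pipeline described above (state-by-neuron matrix, z-scoring, agglomerative clustering on Euclidean distance, 50% coloring threshold) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the authors' Matlab implementation: the text does not name the linkage rule, so average linkage is assumed; the z-score is applied within each row, following the text literally (z-scoring each neuron across states is the other possible reading); and the function names and toy firing rates are hypothetical.

```python
from itertools import combinations
from math import dist
from statistics import mean, pstdev

def zscore_rows(matrix):
    """Standardize each row (one task state) across neurons."""
    return [[(x - mean(row)) / pstdev(row) for x in row] for row in matrix]

def average_linkage(points, clusters, a, b):
    """Mean Euclidean distance between all members of clusters a and b."""
    return mean(dist(points[i], points[j])
                for i in clusters[a] for j in clusters[b])

def agglomerate(points, labels):
    """Greedy bottom-up clustering; returns merges as (left, right, distance)."""
    clusters = {i: [i] for i in range(len(points))}
    merges = []
    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest linkage distance.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_linkage(points, clusters, *pair))
        d = average_linkage(points, clusters, a, b)
        merges.append(([labels[i] for i in clusters[a]],
                       [labels[i] for i in clusters[b]], d))
        clusters[a] += clusters.pop(b)
    return merges

# Toy matrix: 4 task states (rows) x 3 neurons (columns), made-up rates in Hz.
labels = ["odor/up", "odor/down", "reward/up", "reward/down"]
rates = [[5, 1, 2], [6, 1, 1], [1, 5, 2], [2, 6, 1]]

merges = agglomerate(zscore_rows(rates), labels)
# Groups linked below 50% of the maximal linkage distance share a color.
cut = 0.5 * max(d for _, _, d in merges)
tight = [(left, right) for left, right, d in merges if d < cut]
print(tight)
```

In this toy example the two odor states and the two reward states each merge below the threshold, which is the epoch-dominated pattern the paper reports for lOFC.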

Results

We recorded single-unit activity in lOFC and mOFC in 18 and 13 rats, respectively, during an odor-based unblocking task (Figure 1a). After implantation of microelectrodes in OFC, rats were trained to sample an odor in a central port following house light illumination and then respond to a reward well below for a single medium-sized drop of chocolate milk. This training was extensive, lasting for at least four days, and was meant to establish the initial odor as a reliable predictor of this specific outcome. Each rat then underwent 1–6 rounds of unblocking.

Figure 1. Experimental outline, behavior summary.


(a) Thirsty rats were initially trained to enter an odor port after a house light lit up, then to go to the reward well below to receive a drop of chocolate milk. Four trial types were randomly intermixed in the unblocking session. The first was a reminder of initial training. On the other three trial types, the originally trained odor was briefly presented, followed by one of three novel odors. The reward following the novel odors was either unchanged (black; blocked trials), larger in size (blue; upshift trials), or smaller in size (green; downshift trials). In the probe test stage, we assessed learning by presenting the novel odors without a subsequent reward. (b) Time in the reward well on the probe test trials in rats with recording sites in lOFC. ANOVA for time spent in the reward well with odor (blocked, upshift, downshift) and trial (1–10) as factors found a significant effect of odor (F(2,118) = 24.25, p = 1.51 × 10^−9) and trial (F(9,531) = 19.89, p < 1 × 10^−13). Planned comparisons confirmed that in the first two two-trial blocks, rats spent significantly more time in the reward well following the upshift odor (p = 0.0086, p = 0.016, respectively) relative to the blocked odor. Rats also spent less time in the reward well following the downshift odor relative to the blocked odor (p = 0.0033, p = 0.0036, p = 0.012, p = 0.0018, p = 0.0072, respectively). * p<0.05, x p<0.01, + p<0.001. Error bars indicate standard error of the mean. (c) lOFC single unit activity was recorded from the lateral orbital and agranular insular cortices. Locations are shown at 3.24 and 3.72 mm anterior to bregma. AIV, AID = agranular insular area, LO = lateral orbital cortex. (d) Time in the reward well on the probe test trials in rats with recording sites in mOFC. ANOVA for time spent in the reward well with odor (blocked, upshift, downshift) and trial (1–10) as factors found a significant effect of odor (F(2,66) = 18.88, p = 3.27 × 10^−7) and trial (F(9,297) = 9.94, p = 2.43 × 10^−13). Planned comparisons confirmed that in the first two-trial block, rats spent significantly more time in the reward well following the upshift odor (p = 0.0078) relative to the blocked odor. Rats also spent less time in the reward well following the downshift odor on trials 3–8 relative to the blocked odor (p = 0.0306, p = 0.0009, p = 0.0003, respectively). * p<0.05, x p<0.01, + p<0.001. Error bars indicate standard error of the mean. (e) mOFC single unit activity was recorded from the medial orbital cortex with some recordings in or overlapping ventral orbital cortex. Locations are shown at 4.68 and 5.16 mm anterior to bregma. MO = medial orbital cortex, VO = ventral orbital cortex. (f) Task structure as organized into 12 states by trial epoch (Odor, Wait, Reward) and trial type (Downshift, Blocked, Initial, Upshift).

Each round of unblocking began with two days of training and consisted of four trial types (Figure 1a, compound training). One type was a reminder: the initially trained odor was followed by the expected outcome. On the other three trial types (upshift, downshift, blocked), rats were presented with the initially trained odor, followed immediately by one of three novel odors. On blocked trials, the novel odor was followed by the expected medium-sized drop of milk, whereas on upshift and downshift trials, the novel odor was followed by a noticeably larger or smaller drop of milk, respectively.

Rats learned to differentiate between the novel odor cues

In the unblocking sessions, rats were sensitive to presentation of the novel odors, exhibiting longer latencies to respond at the reward well following odor sampling on these three trial types. Longer latencies to the novel odors were most apparent on the very first trial of each session in both groups, particularly on day 1 (data not shown). ANOVA in the lOFC group revealed a main effect of trial (F(19,1083) = 12.072, p < 1 × 10^−4) and a trial x day interaction (F(19,1083) = 8.395, p < 1 × 10^−4). ANOVA in the mOFC group revealed main effects of trial (F(19,608) = 7.6, p < 1 × 10^−4) and cue (F(2,64) = 9.27, p = 2.91 × 10^−4), and a trial x day interaction (F(19,608) = 2.17, p = 0.0028). In addition to this effect, the rats also learned that two of these odors predicted meaningful changes in the outcome. This was evident in the extinction probe test, in which they initially spent more time in the fluid well following sampling of the upshift odor and less time following sampling of the downshift odor, versus the blocked odor, as if expecting more and less reward respectively on up- and downshift trials in both the lOFC (Figure 1b) and mOFC groups (Figure 1d).

Individual cells differentially respond to trial epoch and type in lOFC and mOFC

We recorded 334 single units during the first day of unblocking and 346 units on the second unblocking day in 60 rounds of training across all 18 rats in lOFC (Figure 1c). We recorded 188 single units during the first day of unblocking and 212 units on the second unblocking day in 34 rounds of training across all 13 rats in mOFC (Figure 1e). On day one, the proportion of odor responsive cells in mOFC was approximately 2/3 that in lOFC: 29/188 (15.4%) compared to 86/334 (25.7%). On day two, the proportion of odor responsive cells in mOFC was less than half that in lOFC: 25/212 (11.8%) compared to 88/346 (25.4%). These proportions are significantly different by two-sample t-test on both days (p = 0.006 and p = 1 × 10^−4, respectively). Odor responses were characterized in depth in previous reports (Lopatina et al., 2015; Lopatina et al., 2016), and included cells responsive both to trial epoch and trial type in both regions. Here, we extend our analyses to the wait and reward epochs, including all recorded cells firing under 10 Hz, independent of their response characteristics. Individual units in lOFC (Figure 2a–e) and mOFC (Figure 2f–j) displayed a range of responses during the wait and reward epochs. Some cells differentiated only between trial epoch and not type (Figure 2a–b, f–g). Some cells differentiated between trial type only in one epoch (Figure 2c–d, h–i), and others in both epochs (Figure 2e, j).
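As a quick arithmetic check, the percentages and the mOFC:lOFC ratios quoted above follow directly from the reported cell counts. The short Python sketch below only recomputes those figures; the variable names are hypothetical and no statistical test is performed.

```python
# (odor-responsive cells, recorded cells) per region and day, from the text.
counts = {
    ("lOFC", 1): (86, 334),
    ("mOFC", 1): (29, 188),
    ("lOFC", 2): (88, 346),
    ("mOFC", 2): (25, 212),
}

# Percentage of odor-responsive cells, rounded to one decimal as in the text.
percent = {key: round(100 * k / n, 1) for key, (k, n) in counts.items()}

# Ratio of the mOFC proportion to the lOFC proportion on each day.
ratio = {
    day: (counts[("mOFC", day)][0] / counts[("mOFC", day)][1])
         / (counts[("lOFC", day)][0] / counts[("lOFC", day)][1])
    for day in (1, 2)
}

print(percent)
print({day: round(r, 2) for day, r in ratio.items()})
```

The day 1 ratio comes out near 0.60 (roughly two thirds) and the day 2 ratio near 0.46 (less than half), matching the description in the text.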

Figure 2. Single unit firing of wait and reward responsive neurons.


Raster plots for firing of single units on initial (red), blocked (black), upshift (blue), and downshift (green) trials in units that differentiate between trial type or trial epoch during well entry or reward delivery. The first vertical line indicates well entry. The second vertical line indicates reward delivery 1 second following well entry. Each tick represents a spike. Average activity across all trials is plotted by odor (bottom). (a–e) Raster plots for lOFC show a) Response that does not differentiate between trial type that is highest during the wait epoch, b) Response that does not differentiate between trial type that is highest during the reward epoch, c) Response that differentiates between trial type during the wait epoch, d) Response that differentiates between trial type during the reward epoch, e) Response that differentiates between trial type during both the wait and reward epochs. (f–j) Raster plots for mOFC show f) Response that does not differentiate between trial type that is highest during the wait epoch, g) Response that does not differentiate between trial type that is highest during the reward epoch, h) Response that differentiates between trial type during the wait epoch, i) Response that differentiates between trial type during the reward epoch, j) Response that differentiates between trial type during both the wait and reward epochs.

To determine how these single unit responses varied by trial epoch and trial type, we next examined individual units’ firing during three 1 s periods corresponding to odor delivery, responding at the well, and reward delivery. With a repeated measures 2-factor ANOVA with factors of trial type (blocked, upshift, downshift, initial) and epoch (odor, wait, and reward), we identified cells that showed an effect of trial type or epoch but not an interaction that was more significant than the main effect (p<0.05, Table 1). On both days, the two regions differed in the proportions of cells exhibiting a main effect of epoch or type. A larger proportion of lOFC cells had a significant effect of trial epoch, whereas a larger proportion of cells in mOFC had a significant effect of trial type. A chi-squared test on the number of units with an effect of epoch and type in each region found that the difference in the proportion of epoch and type responsive cells was significant (chi-square = 16.05, p < 10^−4). The F-statistics varied across cells and are indicated in Table 2, as are degrees of freedom, interaction effects, and ANOVA results including cells exhibiting an interaction effect.
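The inclusion rule for single cells can be stated as a small decision function. The sketch below is one reading of that rule, not the authors' code: a main effect counts only if it is significant at p < 0.05 and is not overshadowed by a significant interaction with a smaller p-value. The function name and the `fold` parameter are hypothetical; `fold=1000` is an assumed way to express the stricter criterion in the Table 1 note (interactions up to 1000-fold less significant than the main effect).

```python
def classify_cell(p_epoch, p_type, p_interaction, alpha=0.05, fold=1.0):
    """Return the main effects ('epoch', 'type') credited to one cell.

    A main effect is dropped when the interaction is significant and its
    p-value is below fold * (main-effect p-value).
    """
    effects = set()
    for name, p_main in (("epoch", p_epoch), ("type", p_type)):
        significant = p_main < alpha
        overshadowed = p_interaction < alpha and p_interaction < fold * p_main
        if significant and not overshadowed:
            effects.add(name)
    return effects

# Epoch effect only, no interaction.
print(classify_cell(1e-6, 0.20, 0.50))
# Both main effects weaker than the interaction: cell is not counted.
print(classify_cell(0.01, 0.03, 0.001))
# Strong epoch effect survives; the weaker type effect is overshadowed.
print(classify_cell(1e-8, 0.04, 0.01))
```

Counting the cells returned with `"epoch"` or `"type"` in each region would give the kind of tallies compared by the chi-squared test.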

Table 1.

Percentage of individual cells responsive during trial epoch and type by day

| | lOFC Day 1 | lOFC Day 2 | lOFC Average | mOFC Day 1 | mOFC Day 2 | mOFC Average |
|---|---|---|---|---|---|---|
| Trial epoch | 70.9 | 70.7 | 70.8 | 53.9 | 43.9 | 48.9 |
| Trial type | 18.3 | 20.5 | 19.4 | 28.4 | 24.4 | 26.4 |

Numerical values indicate the percentage of cells that showed a significant effect in two factor ANOVA within either epoch or type categories by day. Cells in which a significant interaction was more significant than the main effect were excluded, as were those whose interaction was up to 1000-fold less significant than the main effect.

Table 2.

Percentage of individual cells responsive during trial epoch and type by day

| | lOFC Day 1 | lOFC Day 2 | lOFC avg | mOFC Day 1 | mOFC Day 2 | mOFC avg |
|---|---|---|---|---|---|---|
| Trial epoch, all significant cells | 81 | 80.1 | 80.55 | 66.7 | 52.8 | 59.75 |
| Trial epoch, all significant cells’ F-stats (min/mean) | F(2,228) = 3.0/97.4 | F(2,228) = 3.0/91.9 | | F(2,228) = 3.1/69.9 | F(2,228) = 3.0/58.1 | |
| Trial epoch, excluding interaction | 70.9 | 70.7 | 70.8 | 53.9 | 43.9 | 48.9 |
| Trial epoch, only significant cells’ F-stats (min/mean) | F(2,228) = 3.1/110.1 | F(2,228) = 3.1/103.8 | | F(2,228) = 3.1/84.6 | F(2,228) = 3.0/68.1 | |
| Trial type, all significant cells | 39.8 | 38.5 | 39.15 | 48.2 | 34.4 | 41.3 |
| Trial type, all significant cells’ F-stats (min/mean) | F(3,228) = 2.6/7.8 | F(3,228) = 2.6/8.1 | | F(3,228) = 2.7/9.6 | F(3,228) = 2.7/5.7 | |
| Trial type, excluding interaction | 18.3 | 20.5 | 19.4 | 28.4 | 24.4 | 26.4 |
| Trial type, only significant cells’ F-stats (min/mean) | F(3,228) = 2.6/9.5 | F(3,228) = 2.7/9.5 | | F(3,228) = 2.7/9.4 | F(3,228) = 2.7/5.7 | |
| Interaction, all significant cells | 38.4 | 35 | 36.7 | 38.3 | 22.2 | 30.25 |
| Interaction, all significant cells’ F-stats (min/mean) | F(6,228) = 2.1/4.3 | F(6,228) = 2.2/5.0 | | F(6,228) = 2.2/6.8 | F(6,228) = 2.1/4.3 | |

Numerical values correspond to percentage of cells that showed a significant effect in two factor ANOVA within either epoch or type categories by day.

Primary hierarchical clustering of trial epoch in lOFC and trial type in mOFC in population pseudo-ensembles

We next performed a hierarchical clustering analysis on the same time windows and trial types as in our single unit analysis to see if this subregional difference in responses by trial epoch and trial type was reflected in clustering of the task space in the recorded populations (see Figure 1f for a schematic of the task space). This analysis revealed that activity in lOFC primarily grouped states by trial epoch, so that the odor states were grouped together, as were the wait and reward states (Figure 3a,e). By contrast, activity in mOFC showed a much greater influence of trial type on how the states were grouped (Figure 3i,m), particularly during un-cued epochs (wait and reward) on day 2, which were largely grouped into pairs. ANOVA of Euclidean distance between task states with factors of region, day, and between/within trial epoch found main effects of region (F(1,112) = 18.89, p = 3.1 × 10^−5) and between/within trial epoch (F(1,112) = 61.21, p = 1.0 × 10^−43), an interaction between day and between/within trial epoch (F(1,112) = 4.25, p = 0.04), an interaction between region and between/within trial epoch (F(1,112) = 116.8, p = 4.4 × 10^−19), and an interaction between region, between/within trial epoch, and day (F(1,112) = 10.1, p = 0.0019). Within and between task epoch distances are summarized in Figure 3q–t. These results suggest that lOFC activity is most similar within a trial epoch regardless of the trial type, whereas mOFC activity is most similar within trial type (Figure 3m).
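The distances entering this ANOVA can be enumerated explicitly. With 12 task states, each dendrogram contributes 3 epochs x 6 type pairs = 18 within-epoch distances and 4 trial types x 3 epoch pairs = 12 between-epoch distances; over 4 dendrograms (2 regions x 2 days) that is 120 observations, and a full 2 x 2 x 2 design leaves 120 − 8 = 112 error degrees of freedom, matching the reported F(1,112). The Python sketch below illustrates the pair selection on toy vectors; the function name and the synthetic, epoch-dominated population are assumptions, not recorded data.

```python
from itertools import combinations
from math import dist
from statistics import mean

EPOCHS = ("odor", "wait", "reward")
TYPES = ("downshift", "blocked", "initial", "upshift")

def split_distances(state_vectors):
    """Split pairwise Euclidean distances between task states into
    within-epoch (same epoch, different trial type) and
    between-epoch (same trial type, different epoch) groups."""
    within, between = [], []
    for (s1, v1), (s2, v2) in combinations(state_vectors.items(), 2):
        (epoch1, type1), (epoch2, type2) = s1, s2
        if epoch1 == epoch2:
            within.append(dist(v1, v2))
        elif type1 == type2:
            between.append(dist(v1, v2))
    return within, between

# Toy "lOFC-like" population: epoch moves the vector far, trial type only a little.
states = {
    (e, t): [10.0 * ei, 1.0 * ti]
    for ei, e in enumerate(EPOCHS)
    for ti, t in enumerate(TYPES)
}

within, between = split_distances(states)
n_obs = 4 * (len(within) + len(between))   # 4 dendrograms: 2 regions x 2 days
error_df = n_obs - 2 * 2 * 2               # observations minus design cells
print(len(within), len(between), n_obs, error_df)
print(round(mean(within), 2), round(mean(between), 2))
```

For this epoch-dominated toy population the mean within-epoch distance is much smaller than the mean between-epoch distance, the pattern described for lOFC; a type-dominated population would show the reverse.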

Figure 3. Clustering of lOFC and mOFC population pseudo-ensembles by trial epoch and type.


(a–p) Dendrograms of hierarchical clustering results show clustering of events in three trial epochs (Odor, Well, Reward) and four trial types (Blocked, Upshift, Downshift, and Initial). Euclidean distance between task states indicates distance between events. Each group of nodes within the dendrogram whose linkage is less than 50% of the maximal distance between groups is assigned a unique color for visualization purposes. Curly brackets indicate groups which are meaningful within the task. One group per grouping type is labeled with that grouping’s main feature in adjacent text, with additional groupings indicated in curly brackets of the same color. Insets show the projection of the task states in the space of the first two principal components. lOFC subdivided into groups on the basis of epoch within the trial, while mOFC groupings were largely organized by an interaction of epoch and value. Dendrograms of hierarchical clustering results from cells with a baseline firing rate below 10 Hz recorded in (a–d) lOFC on day 1, (e–h) lOFC on day 2, (i–l) mOFC on day 1, (m–p) mOFC on day 2. The first column’s results are over trials 8–30, which are considered to reflect associative neural activity (a,e,i,m). The second column’s results include the first half of the 30 trials examined (b,f,j,n). The third column’s results include the second half of the 30 trials examined (c,g,k,o). The fourth column’s results include the second principal component in the second half of the 30 trials examined (d,h,l,p). (q–t) Mean within and between epoch Euclidean distances as depicted in the above dendrograms. Distances are shown as a percentage of maximal distance between task states.

mOFC population activity changes over time

While there are no noticeable changes in hierarchical clustering in lOFC (early trials in Figure 3b,f and late trials in Figure 3c,g), there is a change in task structure in mOFC over the course of learning (early trials in Figure 3j,n and late trials in Figure 3k,o). On day 1, the mOFC pseudo-ensemble largely resembles the epoch-based task representation in lOFC (Figure 3i), particularly in the first half of trials (Figure 3j). However, a structure based on the representation of trial type emerges over learning (2nd day, Figure 3m–o). The difference in clustering within a trial type is most prominent in the second principal component (Figures 3a–p, inset). While variance along the first principal component is explained by whether or not a state is an odor, the second principal component maps on to trial epoch in lOFC (Figure 3d,h) and trial type in mOFC (Figure 3l,p). Only in mOFC does this structure require learning the associations of the task, and it develops over the course of learning. Further, in mOFC, uncued epochs of trial type 2 (blocked) and 3 (initial) clustered closest between trial types on both days (Figure 3k–p). While the odor cues themselves have different associated values, the impending or received outcomes during the wait and reward delivery periods are identical: rats receive a medium-sized bolus of chocolate milk on both trial types. This proximity also developed only after the first half of trials on day 1 in mOFC, and was most pronounced on the second day of unblocking training.
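The projection of task states onto principal components, as used for the dendrogram insets, can be sketched with a singular value decomposition. The example below assumes NumPy and uses a synthetic state-by-neuron matrix deliberately built so that odor versus non-odor dominates the variance; it illustrates the technique only and does not reproduce the recorded data. The function name `project_states` and the toy matrix are hypothetical.

```python
import numpy as np

EPOCHS = ("odor", "wait", "reward")
TYPES = ("downshift", "blocked", "initial", "upshift")

def project_states(Z, n_components=2):
    """Project rows of Z (states x neurons) onto the leading principal components.

    Returns the component scores and the fraction of variance each explains.
    """
    centered = Z - Z.mean(axis=0)
    U, S, _ = np.linalg.svd(centered, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, explained[:n_components]

# Synthetic matrix, 12 states x 4 neurons: two odor-driven neurons,
# one wait-vs-reward neuron, one neuron weakly graded by trial type.
wait_vs_reward = {"odor": 0.0, "wait": 1.0, "reward": -1.0}
Z = np.array([
    [4.0 * (e == "odor"), 4.0 * (e == "odor"), wait_vs_reward[e], 0.1 * t]
    for e in EPOCHS for t in range(len(TYPES))
])

scores, explained = project_states(Z)
print(np.round(scores[:, 0], 2))   # PC1: odor states vs. all other states
print(np.round(explained, 3))
```

In this toy case the first component separates the four odor states from the remaining eight and the second separates wait from reward, the epoch-based arrangement described for lOFC; replacing the third neuron with a type-graded one would mimic the mOFC arrangement.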

Primary hierarchical clustering of trial epoch in lOFC and trial type in mOFC in simultaneously recorded ensembles

We repeated our hierarchical clustering analysis in simultaneously recorded ensembles to see if individual ensembles displayed the same hierarchical clustering of the task space as the population pseudo-ensembles. We identified 45 ensembles with 8 or more units (30 in lOFC and 15 in mOFC, with an average size of 12 units) and repeated the hierarchical clustering analysis on each individual ensemble (ensembles by region and day in Figure 4a,d,g,j; all ensembles summarized in Figure 4m–n). Many of the ensembles’ dendrograms exhibited clustering similar to that of the dendrogram of their corresponding population. The ratios of between:within trial epoch distances for all ensembles are consistent with the results in pseudo-ensembles (Figure 4c,f,i,l; bars colored cyan indicate ensembles in which the distances within and between trial epochs were significantly different by two-sample t-test, p<0.01). The clustering within the second principal component (Figure 4b,e,h,k) also resembles that of the population pseudo-ensembles. Table 3 summarizes ensemble numbers and sizes. Overall, we found lower within:between ratios in lOFC and higher ratios in mOFC, consistent with the population dendrograms.

Figure 4. Clustering of simultaneously recorded ensembles by trial epoch and type.

(a,b,d,e,g,h,j,k) Dendrograms as in Figure 3 for example individual simultaneously recorded ensembles. Each group of nodes within the dendrogram whose linkage is less than 50% of the maximal distance between groups is assigned a unique color for visualization purposes. Curly brackets indicate groups which are meaningful within the task. One group per grouping type is labeled with that grouping’s main feature in adjacent text, with additional groupings indicated in curly brackets of the same color. Insets show the projection of the task states in the space of the first two principal components. Dendrograms show hierarchical clustering results from cells with a baseline firing rate below 10 Hz in an ensemble recorded in (a–b) lOFC on day 1, 11 cells; (d–e) lOFC on day 2, 12 cells; (g–h) mOFC on day 1, 11 cells; (j–k) mOFC on day 2, 13 cells. The first column’s results include trials starting with trial 15 (a,d,g,j). The second column’s results include the second principal component in the same set of trials as the first column (b,e,h,k). (c,f,i,l) The between-epoch:within-epoch distances for all ensembles recorded on the corresponding session, with ensembles in which the two sets of distances differed significantly (t-test, p<0.01) indicated in cyan. (m–n) Mean within- and between-epoch Euclidean distances as depicted in the above dendrograms. Distances are shown as a percentage of maximal distance between task states.
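The coloring rule described in this legend can be sketched as follows. This is an illustration only: the linkage method is not stated in this section, so average linkage is an assumption, as is reading "maximal distance between groups" as the height of the final merge. Function names and the toy points are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(points):
    """Agglomerative clustering; returns merges as (members_a, members_b, height).

    Average linkage is assumed here; it needs at least two points.
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                d = sum(euclidean(points[a], points[b]) for a, b in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


def color_groups(points, fraction=0.5):
    """Label each point; points share a label if joined below the threshold.

    The threshold is `fraction` times the largest merge height.
    """
    merges = average_linkage(points)
    threshold = fraction * max(height for _, _, height in merges)
    group = list(range(len(points)))
    for members_a, members_b, height in merges:
        if height < threshold:
            label = group[members_a[0]]
            for idx in members_a + members_b:
                group[idx] = label
    return group


# Hypothetical toy states: two tight pairs that are far from each other
labels = color_groups([[0.0], [1.0], [10.0], [11.0]])
```

With these toy points the two tight pairs merge at a height of 1 and the final merge occurs at a height of 10, so the 50% threshold is 5 and the points fall into two color groups.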

Table 3.

Ensemble number, cell count, and percent significantly differing in within/between epoch distance

Measure | lOFC, day 1 | lOFC, day 2 | mOFC, day 1 | mOFC, day 2
Number of ensembles | 13 | 17 | 5 | 10
Average cells per ensemble | 13.07 | 11.64 | 12.20 | 11.50

Discussion

We used our previously reported Pavlovian unblocking task data to investigate whether and how different subregions of the OFC construct a cognitive map of behavioral task space (Tolman, 1948; Wilson et al., 2014). We broke our Pavlovian unblocking task down into 12 states: three trial epochs (odor cue, waiting in the well, and reward delivery), each occurring in four trial types of different value. Using a hierarchical clustering analysis (Farovik et al., 2015), we examined how neural ensembles in medial and lateral OFC represented these task states. We found that ensembles in lOFC preferentially clustered states by trial epoch, while those in mOFC preferentially clustered states by trial type. This differential clustering was observed whether ensembles were composed of simultaneously recorded cells or consisted of pseudo-ensembles of cells recorded across sessions. The bias was also present in the single unit correlates, where more cells exhibited a main effect of trial epoch in lOFC and trial type in mOFC. We also examined clustering between equally valuable states: waiting in the well and reward delivery on initial and blocked trial types, which have identical outcomes. In both simultaneously recorded ensembles and pseudo-ensembles, we found closer clustering of equally valuable states in mOFC than lOFC, consistent with a heavier emphasis on clustering based on relative state values in mOFC.

Our results show that activity in the OFC provides what can be construed as a cognitive map of the task space. Ensemble and single unit activity clearly distinguished different epochs within each trial and also different trials. While in some ways this is not news, since many studies have shown that OFC neurons fire to all the events that comprise a trial (Kennerley, Dahmubed, Lara, & Wallis, 2009; Luk & Wallis, 2013; Padoa-Schioppa & Assad, 2006; Schoenbaum & Eichenbaum, 1995; Thorpe, Rolls, & Maddison, 1983), this is only the second report to our knowledge that has analyzed this activity specifically from the perspective of how the states are represented relative to one another. In the first (Farovik et al., 2015), representations of task states in a contextual, spatial digging task were compared to the organization of states in the same task in hippocampus (McKenzie et al., 2014), an area more traditionally associated with cognitive mapping (O’Keefe & Nadel, 1978). Those reports found very similar local representations of the individual task states in OFC and hippocampus, but a very different global picture, with OFC representations being organized globally based on the likelihood of reward. The similarities in OFC and hippocampal representations of task space are intriguing in light of speculation that the two regions may be engaged in parallel processing of such relationships (Wikenheiser & Schoenbaum, 2016), speculation supported by recent experimental work showing grid-like BOLD correlates in both areas (Constantinescu, O’Reilly, & Behrens, 2016). The unique effect of reward on the map of states in OFC highlights the importance of reward, or of the goal of the behavior, to the organization of state space in the OFC. Our result is in accord with this emphasis inasmuch as we found a logical organization of the states that reflects the demands and organization of the task. This is consistent with the idea that the prefrontal cortex generally, and the orbitofrontal cortex specifically, is involved in constructing cognitive maps in order to promote, as is classically claimed, the flexible organization of behavior (Constantinescu et al., 2016; Fuster, 1997; Wikenheiser & Schoenbaum, 2016).

However, our data also show that subregions in the OFC create cognitive maps that emphasize different features of the behavioral landscape relevant to expected outcomes and their respective values. mOFC predominantly grouped task states by trial type, a higher-order organization requiring episodic information, which reflected value in our task. This emphasis on value is reminiscent of the prior study (Farovik et al., 2015); recording in a relatively medial part of OFC, they found that whether or not a state was rewarded was the dominant organizing feature of the task space in their data. Our data suggest value is a less important organizing principle as one moves into lOFC, which represented contextual information about task states according to their epoch within a trial. Thus, the odors were grouped together, as were the states reflecting the response in the well, and even reward. This was true even though states in a category differ dramatically in value. The representation of information orthogonal to value is notable given recent ideas regarding prefrontal function (Koechlin & Summerfield, 2007), which highlight the importance of contextual and episodic information necessary for behavioral control. It is also important to keep in mind that while representations may differ across regions, the mOFC and lOFC are strongly interconnected both with each other and with other regions; thus, one may support the other. The lateral network receives sensory inputs from several modalities: olfaction, taste, visceral afferents, somatic sensation and vision (Illig, 2005). The medial network is characterized as the largest cortical output to visceromotor regions in the hypothalamus and brainstem (Hoover & Vertes, 2011). Based on the connectivity differences between these two OFC subregions, it appears that there is a gradual rotation from input space to output space between the lOFC and mOFC.

These results extend our previous analyses of these data, which focused on the cue epoch and reported signaling of outcome features in lOFC and adjusted value in medial OFC. The present results examine the cue epoch signaling within the context of the other task epochs. The sensitivity of mOFC to trial type is consistent with signaling related to outcome value in the cue epoch, particularly in grouping of trial types relating to the same outcome together. It is also consistent with the results in the only other study of which we are aware that has looked at value-related signaling in rat mOFC, which reported that cells in the mOFC responded more strongly to cues predictive of low value than high value rewards (Burton, Kashtelyan, Bryden, & Roesch, 2013). In each case, value appears to be a critical organizing principle in mOFC. By contrast, the sensitivity of lOFC to trial epoch is somewhat orthogonal to our previous findings of predicted outcome feature signaling in this region. However, if one views each trial epoch as a sort of outcome of prior epochs in the trial, then the grouping of different epochs in lOFC would be in accord with encoding of the features of the outcomes in the cue epoch.

These differences in task space representations between the two regions are interesting in light of the debate about coding of value in mOFC versus lOFC (Hare, Camerer, Knoepfle, & Rangel, 2010; Hare, O’Doherty, Camerer, Schultz, & Rangel, 2008; Levy & Glimcher, 2011; McNamee, Rangel, & O’Doherty, 2013; Padoa-Schioppa, 2009, 2013; Padoa-Schioppa & Assad, 2006, 2008; Plassmann, O’Doherty, & Rangel, 2010; Plassmann, O’Doherty, & Rangel, 2007; Strait, Blanchard, & Hayden, 2014; Xie & Padoa-Schioppa, 2016) and the emergence of theoretical accounts that attempt to dissociate the functions of these two subregions (Noonan, Kolling, Walton, & Rushworth, 2012; Noonan et al., 2010; Rudebeck & Murray, 2011a, 2011b; Walton, Behrens, Buckley, Rudebeck, & Rushworth, 2010). Examining the organization of task states, and in particular whether the heightened emphasis on value in mOFC and on trial detail in lOFC will replicate in other task settings, particularly ones that dissociate value from the level of organization of the trials, could be a novel way to shed light on the differences in functions of these subregions.

Acknowledgments

The authors declare no competing financial interests. This work was supported by the Intramural Research Program at the National Institute on Drug Abuse. The opinions expressed in this article are the authors’ own and do not reflect the view of the NIH/DHHS.

References

  1. Burton AC, Kashtelyan V, Bryden DW, Roesch MR. Increased firing to cues that predict low-value reward in the medial orbitofrontal cortex. Cerebral Cortex. 2013 doi: 10.1093/cercor/bht189. epub ahead of print.
  2. Constantinescu AO, O’Reilly JX, Behrens TE. Organizing conceptual knowledge in humans with a gridlike code. Science. 2016;352:1464–1468. doi: 10.1126/science.aaf0941.
  3. Farovik A, Place RJ, McKenzie S, Porter B, Munro CE, Eichenbaum H. Orbitofrontal cortex encodes memories within value-based schemas and represents contexts that guide memory retrieval. Journal of Neuroscience. 2015;35:8333–8344. doi: 10.1523/JNEUROSCI.0134-15.2015.
  4. Fuster JM. The Prefrontal Cortex. 3rd ed. New York: Lippincott-Raven; 1997.
  5. Gallagher M, McMahan RW, Schoenbaum G. Orbitofrontal cortex and representation of incentive value in associative learning. Journal of Neuroscience. 1999;19:6610–6614. doi: 10.1523/JNEUROSCI.19-15-06610.1999.
  6. Hare TA, Camerer CF, Knoepfle DT, Rangel A. Value computations in ventral medial prefrontal cortex during charitable decision making incorporate input from regions involved in social cognition. Journal of Neuroscience. 2010;30:583–590. doi: 10.1523/JNEUROSCI.4089-09.2010.
  7. Hare TA, O’Doherty J, Camerer CF, Schultz W, Rangel A. Dissociating the role of the orbitofrontal cortex and the striatum in the computation of goal values and prediction errors. Journal of Neuroscience. 2008;28:5623–5630. doi: 10.1523/JNEUROSCI.1309-08.2008.
  8. Heilbronner SR, Rodriguez-Romaguera J, Quirk GJ, Groenewegen HJ, Haber SN. Circuit based cortico-striatal homologies between rat and primate. Biological Psychiatry. doi: 10.1016/j.biopsych.2016.05.012. (in press)
  9. Hoover WB, Vertes RP. Projections of the medial orbital and ventral orbital cortex in the rat. Journal of Comparative Neurology. 2011;519:3766–3801. doi: 10.1002/cne.22733.
  10. Illig KR. Projections from orbitofrontal cortex to anterior piriform cortex in the rat suggest a role in olfactory information processing. Journal of Comparative Neurology. 2005;488:224–231. doi: 10.1002/cne.20595.
  11. Izquierdo AD, Murray EA. Bilateral orbital prefrontal cortex lesions disrupt reinforcer devaluation effects in rhesus monkeys. Society for Neuroscience Abstracts. 2000;26:978.
  12. Jones JL, Esber GR, McDannald MA, Gruber AJ, Hernandez G, Mirenzi A, Schoenbaum G. Orbitofrontal cortex supports behavior and learning using inferred but not cached values. Science. 2012;338:953–956. doi: 10.1126/science.1227489.
  13. Kennerley SW, Dahmubed AF, Lara AH, Wallis JD. Neurons in the frontal lobe encode the value of multiple decision variables. Journal of Cognitive Neuroscience. 2009;21:1162–1178. doi: 10.1162/jocn.2009.21100.
  14. Koechlin E, Summerfield C. An information theoretical approach to prefrontal executive function. Trends in Cognitive Sciences. 2007;11:229–235. doi: 10.1016/j.tics.2007.04.005.
  15. Levy DJ, Glimcher PW. Comparing apples and oranges: Using reward-specific and reward-general subjective value representation in the brain. Journal of Neuroscience. 2011;31:14693–14707. doi: 10.1523/JNEUROSCI.2218-11.2011.
  16. Lopatina N, McDannald MA, Styer CV, Sadacca BF, Cheer JF, Schoenbaum G. Lateral orbitofrontal neurons acquire responses to upshifted, downshifted, or blocked cues during unblocking. eLife. 2015 doi: 10.7554/eLife.11299. epub.
  17. Lopatina N, McDannald MA, Styer CV, Peterson JF, Sadacca BF, Cheer JF, Schoenbaum G. Medial orbitofrontal neurons preferentially signal cues predicting changes in reward during unblocking. Journal of Neuroscience. 2016;36:8416–8424. doi: 10.1523/jneurosci.1101-16.2016.
  18. Luk C-H, Wallis JD. Choice coding in frontal cortex during stimulus-guided or action-guided decision-making. Journal of Neuroscience. 2013;33:1864–1871. doi: 10.1523/JNEUROSCI.4920-12.2013.
  19. McKenzie S, Frank AJ, Kinsky NR, Porter B, Riviere PD, Eichenbaum H. Hippocampal representation of related and opposing memories develop with distinct, hierarchically organized neural schemas. Neuron. 2014;83:202–215. doi: 10.1016/j.neuron.2014.05.019.
  20. McNamee D, Rangel A, O’Doherty JP. Category-dependent and category-independent goal-value codes in human ventromedial prefrontal cortex. Nature Neuroscience. 2013;16:479–485. doi: 10.1038/nn.3337.
  21. Noonan MP, Kolling N, Walton ME, Rushworth MF. Re-evaluating the role of the orbitofrontal cortex in reward and reinforcement. European Journal of Neuroscience. 2012;35:997–1010. doi: 10.1111/j.1460-9568.2012.08023.x.
  22. Noonan MP, Walton ME, Behrens TE, Sallet J, Buckley MJ, Rushworth MF. Separate value comparison and learning mechanisms in macaque medial and lateral orbitofrontal cortex. Proceedings of the National Academy of Sciences. 2010;107:20547–20552. doi: 10.1073/pnas.1012246107.
  23. O’Keefe J, Nadel L. The Hippocampus as a Cognitive Map. Oxford: Clarendon Press; 1978.
  24. Padoa-Schioppa C. Range-adapting representation of economic value in the orbitofrontal cortex. Journal of Neuroscience. 2009;29:14004–14014. doi: 10.1523/JNEUROSCI.3751-09.2009.
  25. Padoa-Schioppa C. Neuronal origins of choice variability in economic decisions. Neuron. 2013;80:1322–1336. doi: 10.1016/j.neuron.2013.09.013.
  26. Padoa-Schioppa C, Assad JA. Neurons in orbitofrontal cortex encode economic value. Nature. 2006;441:223–226. doi: 10.1038/nature04676.
  27. Padoa-Schioppa C, Assad JA. The representation of economic value in the orbitofrontal cortex is invariant for changes in menu. Nature Neuroscience. 2008;11:95–102. doi: 10.1038/nn2020.
  28. Pickens CL, Setlow B, Saddoris MP, Gallagher M, Holland PC, Schoenbaum G. Different roles for orbitofrontal cortex and basolateral amygdala in a reinforcer devaluation task. Journal of Neuroscience. 2003;23:11078–11084. doi: 10.1523/JNEUROSCI.23-35-11078.2003.
  29. Plassmann H, O’Doherty JP, Rangel A. Appetitive and aversive goal values are encoded in the medial orbitofrontal cortex at the time of decision making. Journal of Neuroscience. 2010;30:10799–10808. doi: 10.1523/JNEUROSCI.0788-10.2010.
  30. Plassmann H, O’Doherty J, Rangel A. Orbitofrontal cortex encodes willingness to pay in everyday economic transactions. Journal of Neuroscience. 2007;27:9984–9988. doi: 10.1523/JNEUROSCI.2131-07.2007.
  31. Rudebeck PH, Murray EA. Balkanizing the primate orbitofrontal cortex: distinct subregions for comparing and contrasting values. Annals of the New York Academy of Sciences. 2011a;1239:1–13. doi: 10.1111/j.1749-6632.2011.06267.x.
  32. Rudebeck PH, Murray EA. Dissociable effects of subtotal lesions within the macaque orbital prefrontal cortex on reward-guided behavior. Journal of Neuroscience. 2011b;31:10569–10578. doi: 10.1523/JNEUROSCI.0091-11.2011.
  33. Rudebeck PH, Saunders RC, Prescott AT, Chau LS, Murray EA. Prefrontal mechanisms of behavioral flexibility, emotion regulation and value updating. Nature Neuroscience. 2013;16:1140–1145. doi: 10.1038/nn.3440.
  34. Schoenbaum G, Eichenbaum H. Information coding in the rodent prefrontal cortex. I. Single-neuron activity in orbitofrontal cortex compared with that in pyriform cortex. Journal of Neurophysiology. 1995;74:733–750. doi: 10.1152/jn.1995.74.2.733.
  35. Strait CE, Blanchard TC, Hayden BY. Reward value comparison via mutual inhibition in ventromedial prefrontal cortex. Neuron. 2014;82:1357–1366. doi: 10.1016/j.neuron.2014.04.032.
  36. Thorpe SJ, Rolls ET, Maddison S. The orbitofrontal cortex: neuronal activity in the behaving monkey. Experimental Brain Research. 1983;49:93–115. doi: 10.1007/BF00235545.
  37. Tolman EC. Cognitive maps in rats and men. Psychological Review. 1948;55:189–208. doi: 10.1037/h0061626.
  38. Walton ME, Behrens TEJ, Buckley MJ, Rudebeck PH, Rushworth MFS. Separable learning systems in the macaque brain and the role of the orbitofrontal cortex in contingent learning. Neuron. 2010;65:927–939. doi: 10.1016/j.neuron.2010.02.027.
  39. West EA, DesJardin JT, Gale K, Malkova L. Transient inactivation of orbitofrontal cortex blocks reinforcer devaluation in macaques. Journal of Neuroscience. 2011;31:15128–15135. doi: 10.1523/JNEUROSCI.3295-11.2011.
  40. Wikenheiser AM, Schoenbaum G. Over the river, through the woods: cognitive maps in the hippocampus and orbitofrontal cortex. Nature Reviews Neuroscience. 2016 doi: 10.1038/nrn.2016.56. epub ahead of print.
  41. Wilson RC, Takahashi YK, Schoenbaum G, Niv Y. Orbitofrontal cortex as a cognitive map of task space. Neuron. 2014;81:267–279. doi: 10.1016/j.neuron.2013.11.005.
  42. Xie J, Padoa-Schioppa C. Neuronal remapping and circuit persistence in economic decisions. Nature Neuroscience. 2016;19:855–861. doi: 10.1038/nn.4300.
