Author manuscript; available in PMC: 2015 Jun 4.
Published in final edited form as: Neuron. 2014 Jun 4;82(5):1171–1182. doi: 10.1016/j.neuron.2014.04.028

Brain networks for exploration decisions utilizing distinct modeled information types during contextual learning

Jane X Wang 1,*, Joel L Voss 1
PMCID: PMC4081497  NIHMSID: NIHMS590699  PMID: 24908493

Summary

Exploration permits acquisition of the most relevant information during learning. However, the specific information needed, the influences of this information on decision-making, and the relevant neural mechanisms remain poorly understood. We modeled distinct information types available during contextual association learning and used model-based fMRI in conjunction with manipulation of exploratory decision-making to identify neural activity associated with information-based decisions. We identified hippocampal-prefrontal contributions to advantageous decisions based on immediately available novel information, distinct from striatal contributions to advantageous decisions based on the sum total available (accumulated) information. Furthermore, network-level interactions among these regions during exploratory decision-making were related to learning success. These findings link strategic exploration decisions during learning to quantifiable information and advance understanding of adaptive behavior by identifying the distinct and interactive nature of brain-network contributions to decisions based on distinct information types.


Exploration behaviors during learning critically determine the information that is available and can be used to strategically acquire specific information needed to fill gaps in our memory/knowledge (Metcalfe and Jacobs, 2010). Exploration can thus determine what is learned, and learned information can in turn determine what will be explored. However crucial these mutual exploration-learning interactions are for memory success, little is known regarding their dynamics or neural mechanisms in humans.

Nonhuman animals can explore adaptively to improve learning. For instance, rodents sporadically exhibit iterative viewing of options at decision points during maze learning. This exploration pattern predicts learning success and effective generalization when the maze is subsequently altered (Tolman, 1948) and has been associated with hippocampal function (Buckner, 2010; Johnson and Redish, 2007). We have identified hippocampal-centered brain networks in humans associated with exploration behaviors that enhance learning, relative to receipt of the same stimuli but without active exploration (Voss et al., 2011a; 2011b). Interestingly, a specific exploration pattern that enhanced learning and hippocampal-prefrontal engagement was the revisitation of recently seen objects (Voss et al., 2011b), similar to the strategic exploration pattern observed in rodent maze learning. These findings implicate hippocampus and prefrontal cortex in online control of exploration (Buckner, 2010; Eichenbaum and Fortin, 2009; Wang et al., in press), which could extend current functional accounts of these structures in advantageous decisions based on long-term memory (Buckner and Carroll, 2007; Schacter et al., 2012). In parallel research, dopamine-modulated pathways centered on the basal ganglia have been associated with strategic exploration during reinforcement learning and reward seeking (Hills, 2006; Pennartz et al., 2009), which could interact with hippocampus to support joint memory-reward influences on exploration (Shohamy and Adcock, 2010). However, further specification of the unique and interactive roles of hippocampus, prefrontal cortex, and basal ganglia in exploration will require measurement of the information that must be learned, such that the exploration decisions made to acquire this information can be isolated.

Indeed, it is an exceptional challenge to quantify the information on which individuals base exploration decisions during learning. Although it is possible to measure visual information for many stimuli (Beard and Ahumada, 1998), including entropy information relevant to novelty (Strange et al., 2005), this information does not necessarily drive exploration decisions. For instance, episodic learning is critically dependent on conceptual, gist, contextual, and other information types that are difficult to quantify. Moreover, current decision-making models, such as those for reinforcement learning, capitalize on the strong influence of reward on behavior to estimate internal decision variables (Frank and Claus, 2006), and in doing so conflate information available in the environment, information that is actually learned, and putative decision-making processes. Because available information cannot be isolated by these models (and likewise for many models of perceptual decisions), they do not permit isolation of the exploration decisions used to selectively acquire this information. Furthermore, existing decision-making models generally account for learning of single parameters such as reward likelihood or perceptual identity (Ding and Gold, 2013). In contrast, episodic learning can require the integration of multiple information types over time (i.e., objects sampled within scenes, associations among sequentially presented items, etc.), thereby increasing the uncertainty of directly modeling decision-related variables.

To overcome these challenges, we adopted a blended modeling and experimental approach, whereby we modeled the information available during episodic learning and manipulated the ability to control exploration in order to isolate decisions based on modeled information. A contextual-association learning task required exploration of different contexts to identify contextual rules for item-item associations (similar to Badre et al., 2009). This allowed us to quantify contextual association information relevant for learning, based on extensions of optimal foraging theory that consider information as a finite resource that requires sampling (Hills, 2006; Pirolli and Card, 1999). Using a simple model with minimal assumptions, we quantified two aspects of information conceptualized as having distinct influences on learning and exploration (Frank et al., 2001; Johnson et al., 2012): (1) newly available information (NAI), which is the increase in available information provided when an event provides new information regarding contextual associations, and (2) accumulated available information (AAI), which is the total information previously encountered during exploration measured at any moment. To isolate exploration decisions, we manipulated the ability to actively explore using a condition in which subjects could control exploration (Active Learning) versus a condition in which the same information was passively studied (Passive Learning, as in Voss et al., 2011a; 2011b). This allowed us to isolate behavioral and neural correlates of exploration decisions based on modeled NAI and AAI using model-based fMRI in conjunction with comparisons between Active and Passive conditions.

We reasoned that neural activity associated with Active decisions based on NAI (relative to Passive exposure to NAI) would implicate regions in exploration decision-making based on information that is immediately novel. Although prevailing accounts of hippocampal and prefrontal contributions to adaptive behavior emphasize long-term memory (Buckner and Carroll, 2007; Schacter et al., 2012), we found hippocampal and prefrontal involvement in NAI-based decisions, reflecting their role in the immediate use of novel information to support exploration decisions. In contrast, we identified regions of dorsal striatum associated with Active decisions based on AAI. This implicates dorsal striatum in exploration decisions based on accumulated information, substantiating theorized roles in strategic behavioral planning (Alexander et al., 1986; Martin, 1996) beyond involvement in slow learning of predictable stimulus-response associations (Packard and Knowlton, 2002). Finally, measures of background connectivity (Norman-Haignere et al., 2012) were analyzed to test putative network-level interactivity among these AAI-related and NAI-related regions in relation to advantageous exploration decisions. We found that greater interactivity predicted superior learning, indicating an important role for interplay of AAI- and NAI-related processing for advantageous exploration decisions.

Results

Relationships among NAI and AAI, exploration strategies, and learning

On each trial, an object and two faces were presented in one of four screen quadrants (Figure 1). The object had two features (shape and texture), and the quadrant determined the feature that was relevant for the object-face association. Subjects learned the correct object-face pairings and thus the relevant feature for each quadrant based on feedback. We used the pattern of quadrant visits and object-face pairings to calculate NAI and AAI (Figure 2, Experimental Procedures). We first sought to identify effects of NAI and AAI on exploration choices and on learning success in the Active condition, using the full sample (N=42) such that we could identify exploration strategies used by high-performing subjects (performing above chance; see Experimental Procedures) in contrast to the lack of effective strategies in low-performing subjects (performing at or below chance).

Figure 1. Contextual object-face association task.

(A) Contextual associations were based on either shape or texture features of objects that served as cues. In shape quadrants, only shape (e.g. star-shaped versus pentagon-shaped) determined the correct object-face associations. In texture quadrants, only texture (e.g. white circles versus black dots) determined the correct object-face associations. (B) Example configuration of quadrants, which varied for different blocks of the experiment, with two shape and two texture quadrants in each block. Subjects were not instructed regarding the salient feature in each quadrant, but were required to learn contextual associations via feedback. (C) Each trial involved highlight of the selected quadrant followed by presentation of the object cue and two faces, during which subjects attempted to select the target face. Trials concluded with feedback. (D) Example quadrant sequence, with the quadrant selected for each trial highlighted in blue.

Figure 2. Quantifying contextual association information.

(A) Example quantification of contextual information for a texture quadrant. Eight trials are shown, and faces are perfectly correlated with texture, but not well correlated with shape, as depicted by contingency tables of counts for co-occurrences of each feature with each face for consecutive 3-trial intervals. High-information trials are those with nonzero difference in covariation for texture versus shape and produce NAI, which is large initially but diminishes with subsequent high-information trials. AAI is the integration of NAI over time for the current quadrant. (B) Example measurement of information for idealized (left) and random (right) sequences of quadrant visits. Colored lines corresponding to quadrant colors represent quadrant-specific available information, while light green and dark green lines represent NAI and AAI, respectively. Idealized (i.e., consecutive) quadrant visits produce rapid increase of AAI for the current quadrant. In contrast, random quadrant visits produce minimal NAI and AAI increases for each quadrant. Note that AAI can appear to decrease when switching quadrants because it is a measure specific to the current quadrant (as rules in all quadrants are independent and therefore must be determined separately). AAI for each quadrant persists across exploration of other quadrants.
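
The relationship between NAI and AAI described in this caption can be illustrated with a minimal sketch. It is not the authors' implementation: the per-trial NAI values below are hypothetical placeholders (the actual NAI formula is given in the Experimental Procedures), and the function name `accumulate_information` is introduced here for illustration. The sketch assumes that AAI on a trial sums NAI from earlier visits to the same quadrant only, which is what allows a trial to carry NAI without AAI.

```python
from collections import defaultdict

def accumulate_information(quadrants, nai):
    """Return per-trial AAI given the visited quadrant and NAI on each trial.

    Assumption: AAI on a trial is the sum of NAI from *earlier* visits to the
    same quadrant, so the current trial's NAI is not yet counted.
    """
    if len(quadrants) != len(nai):
        raise ValueError("quadrants and nai must have the same length")
    running = defaultdict(float)  # quadrant -> information accumulated so far
    aai = []
    for quadrant, new_info in zip(quadrants, nai):
        aai.append(running[quadrant])  # information available before this trial
        running[quadrant] += new_info  # persists while other quadrants are explored
    return aai

# Hypothetical sequence: three visits to quadrant 1, two to quadrant 2, then back to 1.
quadrants = [1, 1, 1, 2, 2, 1]
nai = [0.5, 0.25, 0.0, 0.5, 0.25, 0.125]
aai = accumulate_information(quadrants, nai)
print(aai)  # [0.0, 0.5, 0.75, 0.0, 0.5, 0.75]
```

In this toy sequence AAI drops to zero on the first visit to quadrant 2 and returns to its earlier value when quadrant 1 is revisited, mirroring the apparent decreases and the persistence noted in the caption.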

Quadrant visits during the first half of learning (“early learning”) were characterized by persistence (i.e., consecutive selections of the same quadrant). Notably, this strategy was more robust for high-accuracy subjects (n = 26; see Experimental Procedures). The probability of shifting quadrants was significantly lower for high-accuracy subjects than low-accuracy subjects [t(40) = 2.21, p = 0.032] (Figure 3A). Indeed, there was a significant interaction of accuracy (low versus high) with the shift probability in early versus later learning [F(1,78) = 6.027, p = 0.016], because high-accuracy subjects shifted less in early than in later learning [t(25) = 6.29, p < 0.0001], whereas low-accuracy subjects did not [t(15) = 0.09, p = 0.930]. Thus, only high-accuracy subjects made quadrant visits suggesting sensitivity to modeled information; i.e., their persistence strategy maximized NAI and AAI early in learning.
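
For concreteness, the shift probability compared across early and later learning could be computed along the following lines. This is a sketch under stated assumptions: the helper names and the example visit sequence are hypothetical, and splitting a block at its midpoint is one plausible reading of "first half of learning".

```python
def shift_probability(quadrants):
    """Fraction of trial-to-trial transitions that change quadrant."""
    transitions = list(zip(quadrants, quadrants[1:]))
    if not transitions:
        raise ValueError("need at least two trials")
    return sum(prev != curr for prev, curr in transitions) / len(transitions)

def early_late_shift_probability(quadrants):
    """Shift probability for the first and second half of a learning block.

    The transition that straddles the midpoint is not counted in either half.
    """
    half = len(quadrants) // 2
    return shift_probability(quadrants[:half]), shift_probability(quadrants[half:])

# Hypothetical block: persistence early, frequent shifting later.
visits = [1, 1, 1, 1, 2, 2, 3, 1, 4, 2, 3, 1]
early, late = early_late_shift_probability(visits)
print(early, late)  # 0.2 1.0
```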

Figure 3. Patterns of exploration decisions and concomitant contextual information.

(A) Patterns of quadrant visits for all subjects, ordered by test accuracy. The dotted line represents cutoff for high versus low accuracy (80%; see Experimental Procedures). Quadrants 1, 2, 3, and 4 indicate the 1st, 2nd, 3rd, and 4th quadrants visited by each subject, not an indication of the absolute quadrants visited. The bar graph (bottom) shows mean probability of shifting quadrants for both accuracy groups (Low and High) in early and late Learning. (B) Sequence of the four information trial types ordered as in A. The line graph (bottom) shows the histogram of the four trial types over time, averaged over all runs. Error bars represent standard error of the mean (s.e.m.). Line thickness represents point-by-point s.e.m. * p < 0.05, ***p < 0.001.

We categorized each trial according to NAI versus AAI (Figure 2B, see Experimental Procedures), and hypothesized that both factors could contribute to stay/shift exploration decisions. Trials were categorized into four types based on the presence/absence of NAI and AAI: 1) no NAI and no AAI (“NAI−/AAI−”), 2) NAI but no AAI (“NAI+/AAI−”), 3) both NAI and AAI (“NAI+/AAI+”), and 4) no NAI but AAI (“NAI−/AAI+”). These trial types correspond to the presence or absence of information available, but not necessarily the information that subjects learned or retrieved from memory. For high-accuracy subjects, performance varied based on information category. Accuracy increased from lowest to highest across the NAI−/AAI−, NAI+/AAI−, NAI+/AAI+, and NAI−/AAI+ trial types [mean = 62.6, 76.3, 83.2, 85.8, respectively; main effect of information type, F(3,96) = 5.75, p = 0.001], indicating that they were acquiring most of the available information. In contrast, performance of low-accuracy subjects did not indicate acquisition of information according to the four categories [mean = 54.7, 56.5, 61.9, 65.7; F(3,56) = 0.911, p = 0.441]. Furthermore, the average relative prevalence of the four information categories changed throughout learning, paralleling changes in the persistence behavior of high-accuracy subjects (Figure 3B versus 3A). There were more NAI+ trials earlier in Learning when persistence was also high, whereas there were more AAI+ trials later in Learning (Figure 3B). Exploration decisions in high-accuracy subjects thus paralleled changes in information, whereas the decisions of low-accuracy subjects were less sensitive to the availability of information.
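
The four-way trial categorization can be sketched as follows. This is illustrative only: the `threshold` parameter, the function name, and the example (NAI, AAI) values are assumptions, since the criterion for the "presence" of information is specified in the Experimental Procedures rather than in this section. ASCII `-` is used in the labels in place of the typographic minus sign.

```python
from collections import Counter

def classify_trial(nai, aai, threshold=0.0):
    """Label a trial by the presence (+) or absence (-) of NAI and AAI.

    Assumption: "presence" means a value above `threshold`; the cutoff used
    in the study is given in its Experimental Procedures, not here.
    """
    nai_label = "NAI+" if nai > threshold else "NAI-"
    aai_label = "AAI+" if aai > threshold else "AAI-"
    return f"{nai_label}/{aai_label}"

# Hypothetical (NAI, AAI) pairs for six consecutive trials in one quadrant.
trials = [(0.0, 0.0), (0.5, 0.0), (0.25, 0.5), (0.0, 0.75), (0.0, 0.75), (0.1, 0.75)]
labels = [classify_trial(nai, aai) for nai, aai in trials]
counts = Counter(labels)  # prevalence of each trial type in this toy sequence
print(labels)
```

Tallying the labels over time, as `counts` does for the whole sequence, gives the kind of prevalence profile summarized in Figure 3B.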

To test for relationships between information and individual decisions to stay versus shift, we examined the probability of shifting quadrants given trial information category. High-accuracy subjects were increasingly more likely to shift across the NAI−/AAI−, NAI+/AAI−, NAI+/AAI+, and NAI−/AAI+ categories [Figure 4; F(3,96) = 4.875, p < 0.005; calculated relative to simulated chance likelihood, see Experimental Procedures]. High-accuracy subjects were thus least likely to shift with no information (NAI−/AAI− trials) and increasingly likely to shift given increments in information. In contrast, stay/shift decisions in low-accuracy subjects did not vary significantly by information category [F(3,56) = 0.373, p = 0.773].
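
One way to express a shift rate relative to simulated chance is to Z-score it against a null distribution of random quadrant choices. The sketch below assumes a null in which each of the four quadrants is chosen uniformly at random on every trial; the simulation actually used is described in the Experimental Procedures and may differ. The function name, the number of simulations, and the example counts are hypothetical.

```python
import random
import statistics

def shift_z_score(observed_shifts, n_trials, n_quadrants=4, n_sim=2000, seed=0):
    """Z-score an observed shift rate against simulated random quadrant choice.

    Assumption: the null model picks each quadrant uniformly at random, so
    the expected shift rate is (n_quadrants - 1) / n_quadrants.
    """
    rng = random.Random(seed)
    observed = observed_shifts / n_trials
    null_rates = []
    for _ in range(n_sim):
        current = rng.randrange(n_quadrants)
        shifts = 0
        for _ in range(n_trials):
            following = rng.randrange(n_quadrants)
            shifts += following != current  # True counts as 1
            current = following
        null_rates.append(shifts / n_trials)
    return (observed - statistics.mean(null_rates)) / statistics.stdev(null_rates)

# Hypothetical counts: 10 shifts after 40 trials of one information category.
z_persistent = shift_z_score(observed_shifts=10, n_trials=40)
# A shift rate equal to the chance rate of 0.75 should score near zero.
z_chance = shift_z_score(observed_shifts=30, n_trials=40)
```

Negative values indicate fewer shifts than expected by chance (persistence), and positive values indicate more.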

Figure 4. Information-based shift/stay decisions.

(A) Probability of shifting quadrants during Active Learning (Z-score, calculated relative to simulated random chance, see Experimental Procedures) for each information trial type, for low-accuracy subjects and high-accuracy subjects (see also Figure S1). (B) Probability of shifting quadrants plotted versus test accuracy for each information trial type. Background shading on linear fits represents 95% confidence intervals. Error bars represent s.e.m. * p < 0.05, **p < 0.01, ***p < 0.001.

To test the strategic value of information-based decisions, we analyzed shift tendency given NAI and AAI in relation to performance during the subsequent memory test. The tendency to shift given no available information (NAI−/AAI−) was negatively associated with memory performance [Figure 4; r(40) = −0.486, p < 0.005]. In contrast, the tendency to shift after trials with any AAI was positively associated with memory performance [NAI+/AAI+: r(40) = 0.565, p < 0.0001; NAI−/AAI+: r(40) = 0.499, p < 0.001]. The tendency to shift following newly available information but without accumulated available information (NAI+/AAI−) was unrelated to performance. Information-based exploration decisions were thus strategic in correlating with superior learning. Collectively, information-based decisions (probability of shifting for all four trial types) accounted for a large proportion of the variance in test accuracy [multiple linear regression, adjusted R² = 0.462, F(4,37) = 9.795, p < 0.0001]; i.e., of the many factors that could have caused individual differences in memory performance, stay/shift decisions based on modeled NAI and AAI explained approximately 50% of the variability.
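
The reported regression statistics can be cross-checked with the standard identities relating the F statistic, R², and adjusted R² in ordinary least-squares regression. The short sketch below uses only the values reported above; the function names are introduced here for illustration.

```python
def r_squared_from_f(f_stat, df_model, df_resid):
    """Recover R^2 from F = (R^2 / df_model) / ((1 - R^2) / df_resid)."""
    return (f_stat * df_model) / (f_stat * df_model + df_resid)

def adjust_r_squared(r2, df_model, df_resid):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    n_minus_1 = df_model + df_resid  # n - 1 = p + (n - p - 1)
    return 1.0 - (1.0 - r2) * n_minus_1 / df_resid

r2 = r_squared_from_f(9.795, df_model=4, df_resid=37)
adj = adjust_r_squared(r2, df_model=4, df_resid=37)
print(round(r2, 3), round(adj, 3))  # 0.514 0.462
```

The unadjusted R² of about 0.51 corresponds to the "approximately 50%" of variability cited in the text, and the adjusted value matches the reported 0.462.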

Information-based decisions versus passive information exposure

NAI and AAI quantified information availability, not necessarily its successful acquisition. Because NAI and AAI were based solely on the sequence of quadrant visits and the stimuli observed, they were equivalent for Active and Passive conditions (Experimental Procedures). We thus compared these conditions to isolate effects of Active information-based decisions, versus nonspecific effects of Passive exposure to the same information (Figure 5A, Experimental Procedures). As in previous findings of improved learning for self-directed exploration (Voss et al., 2011a; 2011b), memory performance was superior following Active versus Passive Learning [Figure 5B; t(41) = 2.350, p = 0.024], despite the same overall information availability. Thus, better learning occurred when subjects controlled stay/shift decisions based on their own assessments of information.

Figure 5. Benefits of Active versus Passive Learning.

(A) In the Active Learning condition, subjects selected quadrants on each trial. In the Passive condition, the quadrant order was predetermined (yoked to previous participants’ patterns of quadrant selections made during the Active condition, see Experimental Procedures). (B) Accuracy during memory testing was greater following Active versus Passive Learning. (C) Accuracy achieved for a sequence of quadrant visits in the Active Learning condition was not significantly correlated with the accuracy achieved when the same sequence was viewed in the Passive Learning condition (p=0.333), exemplifying the importance of information-based exploration decisions made by subjects. Marker size indicates average NAI calculated from the sequence of quadrant visits. The green solid line indicates linear fit, with gray dashed line indicating diagonal. Error bars represent s.e.m. *p < 0.05.

To test the extent to which performance depended on the “quality” of information provided in Learning, we compared memory performance following Active Learning in one subject to memory performance by the subject receiving the same information in the Passive condition. We reasoned that if a particular sequence of quadrant visits and stimuli were advantageous, then accuracy would be high both for the subject who generated it in the Active condition and for the subject who received it during the Passive condition. In general, if performance depended only on the available information, then accuracy should be positively related for the two conditions. However, we found a non-significant correlation [r(40) = 0.159, p = 0.334], which does not support the notion that the information provided during Passive learning significantly impacted learning success. For comparison, a robust correlation between accuracy in each subject’s Active condition and the same subject’s Passive condition [r(40) = 0.565, p < 0.0001] indicated reliable individual differences in learning capability relevant for both conditions. Findings thus collectively suggest that advantageous information-based decisions were made in the Active condition, and therefore within-subjects comparisons between the Active and Passive conditions could be used to identify brain activity relevant for information-based decisions.

Neural activity associated with NAI- versus AAI-based decisions

Model-based fMRI was used in conjunction with Active/Passive comparisons to identify neural activity associated with information-based decisions in the fMRI subsample. Similar relationships between information-based exploration decisions and performance were identified in the subsample as in the full sample (Figure S1). In order to isolate neural correlates of NAI- versus AAI-based decisions, we assessed neural activity in relation to trial-by-trial measures of NAI and AAI for the Active condition compared to the same measures for the Passive condition. Because only high-accuracy subjects reliably made NAI- and AAI-based decisions, analyses concerned only these subjects (n=15; Experimental Procedures).

Activity in distinct regions was uniquely associated with decisions based on NAI versus AAI. The Active versus Passive comparison for NAI identified activity of anterior hippocampus and anterior prefrontal cortex, including superior and inferior frontopolar cortex (FPC) regions that extended into, respectively, dorsolateral and ventral orbitofrontal cortex (BA 10, 46, and 47; Figure 6A–B, Table 1). These findings implicate hippocampus and FPC in the immediate use of novel information to make exploration decisions. This contribution was independent from general learning and/or long-term memory processing, as nonspecific processing was similarly present for Active and Passive conditions. In contrast, the Active versus Passive comparison for AAI identified activity of inferior parietal lobule (IPL, primarily angular gyrus) and dorsal striatum, including caudate nucleus and putamen (Figure 6A–B, Table 1, and Figure S2).

Figure 6. Brain activity and connectivity supporting exploration decisions based on NAI and AAI.

(A) Brain activity associated with exploration decisions based on NAI (left) and AAI (right). Mean NAI and AAI values are plotted across the learning session, averaged over all subjects and sessions. (B) Beta values for fit to NAI and AAI parametric regressors shown separately for Active and Passive conditions (see also Figure S2). (C) Background connectivity during Active Learning among the regions shown in A (Table 1). Width of lines represents connectivity and color indicates significance level for prediction of memory performance during subsequent test. No connectivity values in the Passive condition predicted memory performance. Significance was determined at an FDR-corrected threshold of p < 0.05. MRI coordinates are MNI. Shading of line plots and error bars indicate s.e.m. sFPC, superior frontopolar cortex; iFPC, inferior frontopolar cortex; Hipp., hippocampus; Caud., caudate; Put., putamen; IPL, inferior parietal lobule.

Table 1. Summary of fMRI findings.

Brodmann areas are listed such that the majority of the cluster is located within the first BA, while extending into the second BA.

| Region | MNI centroid X (mm) | MNI centroid Y (mm) | MNI centroid Z (mm) | Brodmann area | Volume (mm³) |
|---|---|---|---|---|---|
| Information intake, Active > Passive | | | | | |
| L. Superior frontopolar | −36 | 46 | 22 | 10/46 | 786 |
| L. Inferior frontopolar | −41 | 38 | −4 | 10/47 | 783 |
| R. Hippocampus (body)* | 30 | −21 | −24 | N/A | 138 |
| Accumulated information, Active > Passive | | | | | |
| R. Inferior parietal lobule | 42 | −64 | 41 | 39/40 | 493 |
| R. Caudate (head) | 18 | 17 | 4 | N/A | 449 |
| R. Putamen | 24 | −4 | 3 | N/A | 395** |
| Shift > Stay given information (Active) | | | | | |
| L. Hippocampus (head)* | −28 | −14 | −17 | N/A | 91 |

* Identified by the MTL-targeted analysis, see Experimental Procedures.

** p = 0.055, corrected.

We tested putative interactions among the brain regions associated with NAI- and AAI-based decisions using the “background connectivity” method (Norman-Haignere et al., 2012). This procedure identifies connectivity due to sustained interactions among regions that are independent from stimulus-evoked interregional similarities (i.e., correlation among residuals after estimation and removal of stimulus-evoked activity). We hypothesized that the correlated activity of hippocampus-FPC (NAI-based decisions) and caudate-putamen-IPL (AAI-based decisions) would provide evidence for the interaction of NAI-related and AAI-related processing supporting advantageous exploration decisions. We compared background connectivity for the Active versus the Passive conditions for the regions associated with NAI- and AAI-based decisions (Table 1) and related this connectivity to performance achieved during later memory testing. This allowed us to identify connectivity associated with advantageous exploration decisions. In the Active condition, connectivity among several regions was significantly related to accuracy (ranging from Spearman’s ρ = 0.57 to 0.86; p < 0.05, FDR-corrected for multiple comparisons). This included connectivity between caudate and putamen as well as connectivity of these regions with hippocampus and both superior and inferior FPC (Figure 6C). Notably, hippocampus and FPC regions did not demonstrate connectivity with one another predictive of performance (despite high overall connectivity), but rather did so individually with caudate and putamen. Although average connectivity in the Passive condition was not statistically different from that in the Active condition at the group level (all FDR-corrected p-values > 0.51), individual differences in connectivity were unrelated to accuracy (all FDR-corrected p-values > 0.68), unlike in the Active condition. 
These findings suggest that connectivity within hippocampal-striatal and cortico-striatal networks and coordination between these networks supported exploration decisions in the Active condition associated with better learning.
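
The logic of background connectivity, correlating what remains of two regional time series after stimulus-evoked activity has been estimated and removed, can be illustrated with a deliberately simplified sketch. It assumes a single evoked regressor fit by least squares, whereas the published method (Norman-Haignere et al., 2012) models the full evoked response. All function names and the simulated time series are hypothetical.

```python
import random
from math import sqrt

def residualize(signal, regressor):
    """Remove the least-squares fit of one regressor (plus intercept) from a signal."""
    n = len(signal)
    mean_x = sum(regressor) / n
    mean_y = sum(signal) / n
    sxx = sum((x - mean_x) ** 2 for x in regressor)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(regressor, signal))
    slope = sxy / sxx
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(regressor, signal)]

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))

def background_connectivity(region_a, region_b, evoked):
    """Correlation between two regions' residuals after removing evoked activity."""
    return pearson(residualize(region_a, evoked), residualize(region_b, evoked))

# Simulated data: all regions respond to the task; only A and B share
# sustained fluctuations that are unrelated to the task.
rng = random.Random(0)
evoked = [1.0 if (t // 10) % 2 else 0.0 for t in range(400)]
shared = [rng.gauss(0, 1) for _ in evoked]
region_a = [3 * e + s + 0.3 * rng.gauss(0, 1) for e, s in zip(evoked, shared)]
region_b = [3 * e + s + 0.3 * rng.gauss(0, 1) for e, s in zip(evoked, shared)]
region_c = [3 * e + rng.gauss(0, 1) for e in evoked]

raw_ac = pearson(region_a, region_c)  # inflated by the common task response
bg_ab = background_connectivity(region_a, region_b, evoked)
bg_ac = background_connectivity(region_a, region_c, evoked)
```

In this simulation the raw correlation between regions A and C is driven by their common task response and largely disappears after residualization, whereas A and B remain correlated because they share fluctuations independent of the task.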

Isolating behavioral and neural correlates of exploration decisions

In order to pinpoint specific relationships among information, behavior, and neural activity, we sought to identify behavioral expressions of information processing associated with exploration decisions and their neural correlates. We used eye-movement tracking to identify behaviors during the face-selection period indicative of the forthcoming decision to shift from or stay within the current quadrant made on the next trial. We reasoned that eye movements could provide a covert measure of information processing related to this decision, as eye movements can provide covert measures of processing in other learning and memory settings (Hannula et al., 2010). Indeed, preferential viewing of the target face (relative to the foil) increased as subjects learned context-dependent associations during Active and Passive Learning sessions (Figure S3), indicating that eye movements expressed knowledge of contextual associations that could be used to make exploration decisions.

Timecourse analysis indicated that preferential target viewing occurred during face-selection periods immediately prior to decisions to shift quadrants (versus stay) during Active Learning, and only when information (either NAI or AAI) was available (Figure 7A). When collapsed across all 20 time points, preferential target viewing increased before shift decisions in the Active condition when information was present [t(11) = 2.554, p = 0.027]. Analysis of individual time points indicated that the four time points immediately before face selection showed significant shift/stay pairwise differences (FDR-corrected p < 0.05), whereas no time points showed significant pairwise differences in the Active condition with no information. Likewise, preferential target viewing was not observed in the Passive condition regardless of information (no differences when collapsing across time points and no significant pairwise differences). This is consonant with our finding that performance was not sensitive to information in the Passive condition and our interpretation that performance is higher in the Active condition because high-accuracy subjects used assessments of information to guide exploration (as indicated here by eye-movement measures of memory associated with information-based decisions).

Figure 7. Viewing patterns and hippocampal activity associated with stay/shift decisions given information.

(A) Plots show the target/foil viewing bias obtained via eye-tracking during the face-selection period immediately preceding decisions to stay in the same quadrant versus decisions to shift to a new quadrant. Target viewing bias was only observed in the Active Learning condition and only for trials with any available information (NAI+ or AAI+). Orange arrows indicate onset of the object/face display, blue arrows indicate the selection response (Active condition), and red arrows indicate the end of the selection period (Passive condition; see Experimental Procedures and Figure S3). (B) Corresponding activity of anterior hippocampus for face-selection periods with subsequent shift (versus stay) decision in trials with any available information (NAI+ or AAI+). Coordinates are MNI. Shading of line plots indicates s.e.m.

* Pairwise stay versus shift bias difference p < 0.05, FDR corrected.

A corresponding analysis of neural activity during the face-selection period prior to exploration decisions (shift versus stay decisions for trials with either AAI or NAI present) identified activity of anterior hippocampus (Figure 7B). Paralleling the eye-movement effects, this activity was selective to trials with available information (NAI or AAI) and to the Active Learning condition. Hippocampal activity therefore directly corresponded with the eye-movement patterns, establishing a tight linkage between hippocampal activity and specific eye-movement correlates of information processing that support exploration decisions.

Discussion

By quantifying information available to individuals concerning contextual object-face associations and manipulating the opportunity for self-directed exploration, we identified neural activity associated with exploration decisions during learning based on information. Contextual association information was modeled as a finite and spatially localized resource (Hills, 2006; Pirolli and Card, 1999), providing a simple metric relevant to task performance. Further, we fractionated information into NAI and AAI in order to account for differences in decisions utilizing immediately available new information (NAI) versus persistently available accumulated information (AAI), as these distinct information types have been theorized to distinctly influence learning and exploration (Frank et al., 2001; Johnson et al., 2012). Information-based exploration decisions were strategic in that they maximized the rate of newly available information and improved learning, as demonstrated by better performance in subsequent memory tests. By comparing Active to Passive Learning conditions that were matched in information availability, we identified networks of brain regions involved in advantageous information-based decisions. Regions of caudate, putamen, and IPL were associated with AAI-based exploration, whereas superior and inferior FPC and hippocampal regions were associated with NAI-based exploration. Background connectivity among these regions predicted learning success, but only when subjects made exploration decisions in the Active condition. By quantifying the use of available information to guide exploration decisions that enhance learning and specifying relevant brain networks and brain-behavior relationships, these findings advance understanding of neural mechanisms for adaptive memory-based behavior.

Hippocampus and FPC were associated with exploration decisions based on NAI. A substantial literature has implicated prefrontal cortex in contextual regulation and decision-making (reviewed in Lee and Seo, 2007) with FPC regions involved in relatively high contextual complexity (Badre et al., 2009). FPC has been especially linked to task switching based on unexpected events (Koechlin et al., 2000). Strikingly, Boorman and colleagues (2009) identified regions proximal to iFPC and IPL regions described here in relation to switches to alternative choices. Our findings suggest that FPC sensitivity to novel information could saliently drive changes in behavior, especially during exploratory learning. Hippocampal memory and prediction functions are also relevant for detection of novel information and decision-making (Buckner, 2010; Gupta et al., 2009). However, there is little information regarding individual and joint contributions of prefrontal cortex and hippocampus to information-based exploration during learning. Our NAI findings advance an emerging literature on the role of prefrontal cortex and hippocampus in the immediate/short-term use of memory to guide exploration decisions (Fujisawa and Buzsaki, 2011; Guitart-Masip et al., 2013; Ross et al., 2011; Yee et al., 2013; reviewed in Wang et al., in press), as distinct from hypothesized roles of hippocampus in the putative use of long-term memory representations to make predictions and decisions (Buckner and Carroll, 2007; Schacter et al., 2012). Unlike in previous studies, we isolated the involvement of hippocampus and FPC in the use of immediately available novel information (NAI) to make exploration decisions. This is because we distinguished NAI-based decisions in the Active condition from simple learning and/or working memory maintenance of the same information that occurred in the Active and Passive conditions and from decisions based on AAI. 
Similarly, we identified anterior hippocampal activity specifically related to eye-movement correlates of contextual information processing that predicted immediately forthcoming stay/shift decisions. Our findings thus solidify and specify the role of hippocampus and prefrontal cortex (specifically FPC, although activations extended into dorsolateral prefrontal and ventral orbitofrontal cortex; Table 1) in the immediate use of newly available information for exploration decision-making.

In contrast, caudate, putamen, and IPL were involved in decisions based on AAI. Strategic exploration for rewards is associated with striatal networks, owing to the central role of these networks in statistical and reinforcement learning (Hills, 2006; Kim and Hikosaka, 2013; Pennartz et al., 2009). Although specific association with AAI-based decisions in our study is consistent with the role of these structures in learning by accumulation of information over time, our findings are distinct in that they demonstrate decision-making activity of dorsal striatum for exploration, and in the absence of overt reinforcement. Our findings thus support theories, derived from computational modeling, that striatum contributes to decision-making in addition to learning and irrespective of task demands or reward (Frank et al., 2001; Guthrie et al., 2013). Further, parallel corticostriatal loops are theorized to be involved in decision-making in different contexts (Alexander et al., 1986), and the notion that these basal ganglia regions could interact with IPL to support integrative functions associated with AAI-based decisions is supported by tight anatomical and functional connections between these regions (Martin, 1996).

Indeed, our findings of network-level interactivity of regions identified for NAI- and AAI-based decisions suggest interactivity of the distinct decision-making processes supported by these regions. Cortical-striatal recurrent networks encompass regions that we identified for information-based decisions, including both superior and inferior FPC regions (BA 10 extending into 46 and BA 10 extending into 47, respectively), IPL (BA 40 extending into 39), caudate, and putamen. Background connectivity of these regions was specifically related to learning success when subjects made exploration decisions in the Active condition, thus implicating this network in advantageous exploration decision-making. A similar relationship was identified for background connectivity of hippocampus with caudate and putamen, yet no direct connectivity of hippocampus with FPC or IPL regions was identified in relation to task performance. These results are consistent with the organization of anatomically and functionally distinct recurrent striatal networks involving cortex versus hippocampus (Alexander et al., 1986; Martin, 1996), thereby demonstrating and characterizing the distinct contributions of these networks to information-based decisions.

Although hippocampal interactivity is generally greater with ventral striatum than with dorsal striatum regions identified here (Alexander et al., 1986; Kahn and Shohamy, 2013), as emphasized in theories of memory-reward interactions for adaptive behavior (Adcock et al., 2006; Shohamy et al., 2004), some findings have implicated hippocampal interactivity with dorsal striatum in episodic encoding (Sadeh et al., 2011). The functional connectivity patterns we identified do not imply or require direct anatomical connectivity (Honey et al., 2009; O'Reilly et al., 2013), nor do we infer causality or unique functional connectivity among regions. Given the limitations of fMRI, we emphasize the need for validation of connectivity patterns by neurophysiological measures. Indeed, animal models might be necessary to resolve what are likely rapid and iterative interactions among hippocampus, prefrontal cortex, and striatum in support of information-based exploration (Wang et al., in press), owing to their recurrent organization (Alexander et al., 1986; Martin, 1996). Our exploration paradigm and information model could be readily adapted for animal studies of contextual learning (Buschman et al., 2012; Navawongse and Eichenbaum, 2013).

Our findings of hippocampal, prefrontal, and dorsal striatal activity and connectivity associated with information-based decisions enrich current theories of adaptive memory behavior. By isolating information-based decisions from general aspects of learning and memory that could be associated with reward-related processing (e.g., novelty, familiarity, and other factors potentially related to dopaminergic signaling; Hansen and Manahan-Vaughan, 2012), we show that involvement of these structures in adaptive behavior is not merely a ramification of the secondary reward provided by familiar or novel information. These results indicate that accumulated information regarding the current environment (i.e., total current knowledge) and novel information that serves to update accumulated knowledge provide distinct yet interactive information sources on which strategic exploration decisions can be based to support adaptive behavior. Such interactivity would allow organisms to judiciously explore for information that solidifies current knowledge (AAI) as well as for information that updates current knowledge (NAI). We thus highlight the role of hippocampal-prefrontal and striatal contributions to exploration decisions that capitalize on existing knowledge (AAI) and respond to new information that could signal changes in relevant stimulus-response relationships (NAI), thus building on previous neuroanatomical accounts of such exploration processes (Frank et al., 2001; Johnson et al., 2012).

These findings suggest that memory impairments caused by damage and/or dysfunction of hippocampal brain networks should also involve deficits in exploration decisions that normally support effective learning (see also Gupta et al., 2009; Voss et al., 2011a; 2011b; Yee et al., 2013). These deficits could exacerbate learning difficulties experienced by brain-damaged individuals. Furthermore, our findings suggest that individuals vary in terms of their ability to seek information relevant for learning, and our identification of information-based exploration decisions and their neural mechanisms is therefore relevant to understanding successful versus poor learning in a variety of circumstances.

Experimental Procedures

Subjects

All subjects (N=42; 22 females; ages 18–35) had normal or corrected-to-normal vision and did not report neurological or psychiatric disorders. All subjects gave written, informed consent and were remunerated for their participation. The Northwestern University Institutional Review Board approved the protocol.

Experiment Design

There were two Active Learning and two Passive Learning blocks. Each block comprised two phases: Learning (32 trials) and Test (20 trials). Subjects learned object-face associations presented in one of four contexts (quadrants) on the screen. Object-face associations varied contextually based on object features, with context governing the object feature (shape or texture) used to guide correct face selection.

For Active Learning blocks, subjects selected one quadrant in the Quadrant Selection period (3 s for behavioral subjects, 2 s for MRI subjects; Figure S4), after which the selection was confirmed by yellow highlighting for 2 s (jittered 1–3 s for MRI subjects). If subjects did not respond in time, a random quadrant was selected (mean = 1.3 trials per block). Next, an object and two faces were presented and subjects selected one face with a button press (Face Selection period). Target and foil faces were randomly assigned to the right or left side for each trial, encouraging stimulus-based rather than action- or location-based learning. Subjects were given 5 s for face selection, after which feedback (correct or incorrect) was provided for 2 s. After feedback, another trial began after a 4–10 s ISI. After each Learning phase, a Test followed (after an approx. 1-min delay). During Test, quadrants were predetermined, subjects had 5 s to select a face, and no feedback was given.

In the Passive condition, stimuli presented during Learning were taken from the previous subject’s choices in the Active condition. Subjects did not make quadrant or face selections, but viewed sequences of quadrant and face selections recorded from the previous participant’s Active blocks. Therefore, subject n’s Passive visual display was “yoked” to subject n-1’s self-selected Active visual display. As in Active blocks, a Test followed each Passive Learning session.

Each block included 2 unique faces, 2 unique texture categories, and 2 unique shape categories. Faces included professional-quality headshots of nonfamous individuals. Texture and shape categories each included 3 exemplars (Figure S5). Contextual information thus concerned texture and shape categories, not individual textures and shapes, in order to discourage responding based on simple stimulus-level associations and to encourage rule-based learning (Badre et al., 2009). The configuration of contextual rules across the four quadrants varied across blocks so that subjects could not use prior learning to succeed. Stimuli were counterbalanced across learning conditions and subjects, such that Active objects for one subject were the Passive objects for the next subject (i.e., the same information was given on average for the two conditions). The order of learning conditions was counterbalanced, with Active and Passive blocks alternating in each subject (either A-P-A-P or P-A-P-A). For the first subject, the Passive stimulus sequence was provided by an additional individual who otherwise was excluded (a “seed”). Each subject completed a practice session before the experiment.
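The yoking scheme described above can be sketched in a few lines. This is only an illustration: the function name `assign_yoked_sources` and the subject labels are hypothetical, and the sketch assumes subjects are listed in enrollment order with a single seed individual supplying the first Passive sequence.

```python
def assign_yoked_sources(subject_ids, seed_id="seed"):
    """Map each subject to the individual whose Active choices drive that
    subject's Passive display (subject n is yoked to subject n-1)."""
    sources = {}
    previous = seed_id  # the first subject is yoked to the otherwise excluded seed
    for subject in subject_ids:
        sources[subject] = previous
        previous = subject
    return sources


# Hypothetical subject labels, listed in enrollment order.
yoking = assign_yoked_sources(["s1", "s2", "s3"])
```

Under these assumptions, each subject's Active sequence is reused exactly once as the next subject's Passive sequence, which is what matches information availability across the two conditions on average.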

Because texture-face associations conflicted with shape-face associations in half of the trials, association knowledge without contextual knowledge would support an accurate guessing rate of 75%. Therefore, we classified high-performing subjects as those with slightly above-chance performance (>80%, to account for chance variability) during the Test phases (only 3 of 42 subjects scored between 75 and 80% accuracy, and their assignment to either the high- or low-performance groups did not significantly change group-level effects). We first sought to detect relationships between modeled information and behavior in the full sample, including low-accuracy subjects for comparison with high-accuracy subjects to identify information-based exploration decisions that contributed to learning success (which occurred in high-accuracy but not low-accuracy subjects). We then used fMRI to identify neural activity related to advantageous information-based decisions in a subsample of high-accuracy subjects, who reliably demonstrated these decisions (see below).
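The 75% figure can be checked by enumeration. The sketch below is illustrative only: the category labels are hypothetical, and it assumes that a learner who knows both feature-face associations but not the contextual rule chooses the agreed face when the two features agree and guesses at 50% when they conflict.

```python
from fractions import Fraction
from itertools import product

# Hypothetical labels: each shape and texture category is tied to one of two faces.
SHAPE_FACE = {"shape_A": 0, "shape_B": 1}
TEXTURE_FACE = {"texture_X": 0, "texture_Y": 1}


def context_blind_accuracy(relevant="shape"):
    """Expected accuracy without contextual knowledge, averaged over all
    equally likely shape/texture combinations."""
    objects = list(product(SHAPE_FACE, TEXTURE_FACE))
    total = Fraction(0)
    for shape, texture in objects:
        shape_face = SHAPE_FACE[shape]
        texture_face = TEXTURE_FACE[texture]
        correct = shape_face if relevant == "shape" else texture_face
        if shape_face == texture_face:
            # Both associations point to the same face, so the choice is certain.
            total += Fraction(int(shape_face == correct))
        else:
            # Conflict trial: without the contextual rule the learner guesses.
            total += Fraction(1, 2)
    return total / len(objects)
```

Half of the objects are conflict-free (accuracy 1) and half are conflict trials (accuracy 1/2), giving 0.5 × 1 + 0.5 × 0.5 = 0.75 regardless of which feature the quadrant makes relevant.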

Information metric

In any quadrant, either the object shape or texture determined the correct object-face association (i.e., only one feature was relevant). Subjects learned the relevant feature for each quadrant solely based on feedback. This required integrating knowledge over multiple trials within a quadrant (context), as any one trial was not diagnostic. Specifically, subjects had to learn that correct faces covaried with one feature but not the other.

We modeled available context information by considering sequences of quadrant visits and object-face pairings. Information was modeled as a finite resource existing within a quadrant that could be made available on any trial in which there was unequal evidence in favor of one feature over the other. This evidence was calculated by considering covariation between the face and both the shape and the texture, integrated over consecutive trials (Figure 2A), which is equivalent to the mutual information of faces with object features. We defined an information metric NAI (“newly available information”) that quantifies information available in the current trial regarding contextual associations given the evidence derived from a particular sequence of quadrants and objects/faces integrated over consecutive trials. Specifically, we calculated the covariance of faces with both shapes and textures (measured as the $\chi^2$ measure of association for binary discrete variables), and assessed if one was greater than the other. NAI at trial t is governed by the equation:

$$
\mathrm{NAI}(t) =
\begin{cases}
0.4 \cdot I_q(t), & \text{if } \chi^2_T \neq \chi^2_S \\
0, & \text{otherwise,}
\end{cases}
$$

where $I_q(t)$ is the amount of existing (finite) information in the current quadrant q and $\chi^2_{T/S}$ is the Pearson’s chi-squared statistic calculated given the sequence of textures/shapes and faces observed within the current quadrant q integrated over the last three trials (i.e. trial t, t-1, and t-2, where t is the current trial). Any trial on which NAI is nonzero is considered a high-information trial (Figure 2A). The presence or absence of newly available information in any given trial is denoted by “NAI+” or “NAI−”. Information available before the current trial (i.e. the integral of NAI) is denoted as “accumulated available information,” abbreviated as “AAI,” and its presence or absence in any given trial is denoted by “AAI+” or “AAI−”. Because contextual rules within different quadrants are independent, evidence must be accumulated for each quadrant separately. NAI and AAI therefore refer to information newly and previously available relevant to only the current quadrant (i.e. AAI can decrease when switching to a new quadrant depending on the history of quadrant visits, but is always monotonically increasing within the same quadrant, even when switching back). Both metrics are used as continuous amplitude parametric regressors for the fMRI analyses (described below).
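The comparison of evidence for the two features can be illustrated with a short sketch. It is not the authors' code: the binary coding of categories and faces, the function names, and the choice to treat an undefined statistic (an empty row or column in the 2×2 table) as zero evidence are all assumptions made for illustration.

```python
def chi2_2x2(xs, ys):
    """Pearson chi-squared statistic (no continuity correction) for two
    equal-length binary sequences coded as 0/1."""
    n = len(xs)
    a = sum(1 for x, y in zip(xs, ys) if x == 0 and y == 0)
    b = sum(1 for x, y in zip(xs, ys) if x == 0 and y == 1)
    c = sum(1 for x, y in zip(xs, ys) if x == 1 and y == 0)
    d = sum(1 for x, y in zip(xs, ys) if x == 1 and y == 1)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # assumption: an undefined statistic counts as no evidence
    return n * (a * d - b * c) ** 2 / denom


def nai_present(window, tol=1e-12):
    """window: (shape, texture, face) triples for trials t-2, t-1, t within
    one quadrant. Returns True when evidence for the two features is unequal."""
    shapes, textures, faces = zip(*window)
    chi2_texture = chi2_2x2(textures, faces)
    chi2_shape = chi2_2x2(shapes, faces)
    return abs(chi2_texture - chi2_shape) > tol


# Face follows shape perfectly but not texture: unequal evidence (NAI+).
informative = nai_present([(0, 0, 0), (1, 0, 1), (0, 1, 0)])
# Shape and texture are confounded across the window: equal evidence (NAI-).
uninformative = nai_present([(0, 0, 0), (1, 1, 1), (0, 0, 0)])
```

In the first window the face covaries perfectly with shape (statistic 3.0) but only weakly with texture (0.75), so the trial would count as NAI+ under these assumptions; in the second window the two features cannot be told apart.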

As in information-foraging models (Pirolli and Card, 1999), each quadrant was modeled as having finite contextual information $I_q$ that the subject could acquire. We set the learning rate to 40%, so that subjects collect 40% of the information left in the trial, leaving the amount of information available in quadrant q to be $I_q(t+1) = 0.6 \cdot I_q(t)$. Although the learning rate value was motivated by general intuition of how many trials were required to learn the contextual rules experimentally, the specific value had no significant effect on the fMRI results for a sizeable range and no effect whatsoever on the behavioral results (Figure S6).
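A minimal sketch of how the depletion rule produces trial-by-trial NAI and AAI values is given below. Several details are assumptions rather than reported facts: each quadrant starts with one unit of information, information is depleted only on trials flagged as informative, and the informative flags are supplied by the caller rather than computed from stimuli.

```python
def information_timecourse(quadrants, informative, n_quadrants=4,
                           rate=0.4, initial=1.0):
    """Return per-trial NAI and AAI lists.

    quadrants   : quadrant index visited on each trial
    informative : True on trials with unequal evidence for the two features
    rate        : fraction of remaining information collected (0.4 in the text)
    initial     : starting information per quadrant (assumed to be 1.0)
    """
    remaining = [initial] * n_quadrants    # I_q for each quadrant
    accumulated = [0.0] * n_quadrants      # running sum of NAI per quadrant
    nai, aai = [], []
    for q, flag in zip(quadrants, informative):
        aai.append(accumulated[q])         # information available before this trial
        gain = rate * remaining[q] if flag else 0.0
        nai.append(gain)
        remaining[q] -= gain               # I_q(t+1) = 0.6 * I_q(t) after a gain
        accumulated[q] += gain
    return nai, aai


nai, aai = information_timecourse([0, 0, 1, 0], [True, True, True, False])
```

With these inputs, NAI is 0.4, 0.24, 0.4, 0 and AAI is 0, 0.4, 0, 0.64: AAI drops to zero on the shift to an unvisited quadrant and resumes at its earlier level on the return visit, as described in the text.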

Simulation of shift/stay baseline rates

Analyses of shift probability for NAI−/AAI−, NAI+/AAI−, NAI+/AAI+, and NAI−/AAI+ trials (Figure 4) used simulation of the baseline shift rate, given that NAI and AAI are partially co-determined by shifting behavior (i.e. consecutive staying increases high-information trials early in learning). We used Monte Carlo simulations to account for this partial dependence given a fixed total probability of shifting. For each subject, we calculated the total probability of shifting and performed 500 simulations using this value to generate random patterns of quadrant visits. We then categorized trials into the four information trial types given these simulated patterns, and calculated probabilities of shifting for each information trial type. Z-scores for subjects’ actual shift probability were calculated using these null distributions. Z-scores were aggregated for the group analysis. Therefore, this analysis also accounted for inter-subject differences in overall tendency to shift.
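The logic of this baseline correction can be sketched as follows. The sketch is not the authors' implementation: the function names are hypothetical, shifts are assumed to land on a uniformly chosen other quadrant, and `label_by_previous_visit` is a deliberately simple stand-in for the real categorization into the four NAI/AAI trial types, which also depends on the objects and faces shown.

```python
import random
import statistics


def simulate_visits(n_trials, p_shift, n_quadrants, rng):
    """Random quadrant sequence with a fixed probability of shifting."""
    q = rng.randrange(n_quadrants)
    visits = [q]
    for _ in range(n_trials - 1):
        if rng.random() < p_shift:
            q = rng.choice([k for k in range(n_quadrants) if k != q])
        visits.append(q)
    return visits


def shift_rate_by_type(visits, labels):
    """P(shift on the next trial | trial type) for each label."""
    counts, shifts = {}, {}
    for t in range(len(visits) - 1):
        lab = labels[t]
        counts[lab] = counts.get(lab, 0) + 1
        shifts[lab] = shifts.get(lab, 0) + int(visits[t + 1] != visits[t])
    return {lab: shifts[lab] / counts[lab] for lab in counts}


def label_by_previous_visit(visits):
    """Stand-in categorizer (hypothetical): 'repeat' if the quadrant was also
    visited on the previous trial, otherwise 'new'."""
    return ["new" if t == 0 or visits[t] != visits[t - 1] else "repeat"
            for t in range(len(visits))]


def shift_z_scores(observed_visits, categorize, n_sims=500,
                   n_quadrants=4, seed=0):
    """Z-score each observed shift rate against its simulated null."""
    rng = random.Random(seed)
    n = len(observed_visits)
    n_shifts = sum(a != b for a, b in zip(observed_visits, observed_visits[1:]))
    p_shift = n_shifts / (n - 1)  # subject's overall shift probability
    observed = shift_rate_by_type(observed_visits, categorize(observed_visits))
    null = {lab: [] for lab in observed}
    for _ in range(n_sims):
        sim = simulate_visits(n, p_shift, n_quadrants, rng)
        rates = shift_rate_by_type(sim, categorize(sim))
        for lab in null:
            if lab in rates:
                null[lab].append(rates[lab])
    z = {}
    for lab, values in null.items():
        if len(values) > 1 and statistics.stdev(values) > 0:
            z[lab] = (observed[lab] - statistics.mean(values)) / statistics.stdev(values)
    return z
```

Because every simulated sequence uses the subject's own overall shift probability, the resulting z-scores express how much more or less often the subject shifted after a given trial type than expected from that overall tendency alone.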

MRI Data Acquisition and Analysis

MRI data were acquired using whole-brain imaging parameters reported in the Supplemental Experimental Procedures. MRI data were collected for 21 (all right-handed) of the 42 subjects, with data collection from this subsample occurring intermixed with data collection from behavior-only subjects. One subject was excluded from fMRI analyses for excessive movement (>4 mm over a run). Another five were excluded for not achieving high accuracy in the Active condition (>80%, see above), resulting in 15 included subjects (8 females). We restricted analyses to high-accuracy subjects to isolate neural correlates of information-based exploratory learning, which were not observed in low-accuracy subjects. Thus, low-accuracy subjects would not generate neural correlates of information-based decision-making. Critically, patterns of information-based exploration decisions were similar for the fMRI subsample as for the full sample, with robust evidence for advantageous exploration decisions based on NAI and AAI in the subsample (see Results and Figure S1).

Four of the high-accuracy fMRI subjects were yoked to low-accuracy subjects for the Passive condition. These subjects’ passive performance was not statistically different from those yoked to other high-performing subjects [t(13)=0.157, p=0.878, Means = 87.6, 86.3]. This finding is consistent with the lack of significant relationship between the specific Passive sequence provided and test accuracy in the full sample (see Results) and indicates that the fMRI analysis was not strongly influenced by this factor.

Functional and structural MRI data were analyzed using AFNI (Cox, 1996) and preprocessed using standard procedures reported in Supplemental Experimental Procedures. To estimate fMRI activity related to trial-by-trial information measures, event onsets were simultaneously amplitude modulated by NAI and AAI values for each trial (i.e., parametric analyses of both variables; detailed in Supplemental Experimental Procedures). This allowed us to separately identify activity that linearly varied with the magnitude of each type of information while removing variance accounted for by the other information type. NAI onsets were at Feedback when novel information became available. AAI onsets were at Face Selection when overall information was relevant (Figure S4). A separate, non-parametrically modulated analysis was performed for stay/shift decisions given information (both AAI and NAI together, versus lack of both information types) in the Active and Passive Learning conditions.

Regions exhibiting significant activity at the group level were identified via random-effects analysis with a combined voxel-wise and spatial extent threshold method incorporating Monte Carlo simulation (Forman et al., 1995) and mixed-effects multilevel analysis (MEMA) (Chen et al., 2012). The voxel-wise threshold was set to p<0.005, and the spatial-extent threshold for whole-brain analyses was identified as 119 contiguous supra-threshold voxels (402 mm³) to obtain a combined corrected threshold of p<0.05. A threshold of 25 voxels (84 mm³) was used for planned assessments of activity within medial temporal lobe structures (hippocampus, parahippocampal gyrus, and perirhinal and entorhinal cortex), defined as the overlap of MTL cortex and hippocampus in the averaged normalized brain of our MRI subjects with these regions as defined by the N27 atlas (Holmes et al., 1998). Planned assessments of prefrontal cortex and basal ganglia regions would not have yielded significantly different results, as activation clusters identified in those regions were larger than the extent threshold determined by simulation.
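The two reported cluster volumes are mutually consistent, which can be checked with simple arithmetic. The voxel size is not stated in this section; the 1.5 mm isotropic edge used below is an assumption inferred from the reported voxel counts and volumes, and the actual acquisition and resampling parameters are those given in the Supplemental Experimental Procedures.

```python
# Assumption: 1.5 mm isotropic voxels (3.375 mm^3 each), inferred from the
# reported pairs of voxel counts and volumes, not stated in this section.
ASSUMED_VOXEL_EDGE_MM = 1.5


def cluster_volume_mm3(n_voxels, edge_mm=ASSUMED_VOXEL_EDGE_MM):
    """Volume of a cluster of n_voxels cubic voxels with the given edge length."""
    return n_voxels * edge_mm ** 3


whole_brain = cluster_volume_mm3(119)  # whole-brain extent threshold
mtl = cluster_volume_mm3(25)           # medial temporal lobe extent threshold
```

Under this assumption the whole-brain threshold corresponds to 401.625 mm³ and the medial temporal lobe threshold to 84.375 mm³, matching the reported 402 mm³ and 84 mm³ after rounding.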

Connectivity analysis

Connectivity analyses using the “background connectivity” method (Norman-Haignere et al., 2012) involved the 6 regions of interest (ROIs) identified by the NAI and AAI fMRI contrasts (Tables 1 and 2). After extracting the residual timeseries for each ROI (see Supplemental Experimental Procedures), we constructed a connectivity graph for each subject by calculating the Spearman’s rank-correlation coefficient ρ of all pairs of time series in each Learning condition separately. Spearman’s rank-correlation was used to avoid assumptions of normality. Correlation coefficients were converted to Fisher’s z-scores for group analyses. Relationships between connectivity and Test accuracy were assessed by cross-correlating (using Spearman’s rank correlation) z-scores from the Active Learning condition with Active Test performance and z-scores from the Passive Learning condition with Passive Test performance. Because we removed stimulus-driven variance on functional connectivity, findings are interpreted in terms of changes in connectivity related to task, as in similar task-related functional connectivity analyses (e.g., Cole et al., 2013; Wang et al., 2013). Even with the removal of stimulus-driven variance, higher order network effects due to task can remain (Fair et al., 2007), a property that allowed us to identify interregional functional connectivity specifically related to advantageous exploration decision-making in the Active condition. Corrections for multiple comparisons were made using the Benjamini-Hochberg procedure for controlling false-discovery rate (Hochberg and Benjamini, 1990).
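The per-subject connectivity graph described above can be sketched with the standard library alone. This is an illustrative sketch, not the authors' pipeline: the function names and ROI labels are hypothetical, the input is assumed to be residual timeseries that have already been extracted, and correlations of exactly ±1 are clipped (an assumption) so that the Fisher transform stays finite.

```python
import math
from itertools import combinations


def average_ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither sequence is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def fisher_z(rho, limit=0.999999):
    """Fisher z-transform; rho is clipped so that atanh stays finite."""
    return math.atanh(max(-limit, min(limit, rho)))


def connectivity_graph(timeseries):
    """timeseries: dict mapping ROI name to its residual timeseries.
    Returns a dict mapping each ROI pair to its Fisher z-score."""
    return {(a, b): fisher_z(spearman_rho(timeseries[a], timeseries[b]))
            for a, b in combinations(sorted(timeseries), 2)}
```

With 6 ROIs this yields 15 edges per subject and condition. Each edge's z-scores across subjects could then be rank-correlated with Test accuracy using the same `spearman_rho` function.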

Eye-tracking Experimental Procedures

Eye-tracking data were successfully obtained (using procedures reported in Supplemental Experimental Procedures) during fMRI acquisition from 12 of the high-accuracy fMRI subjects that contributed to fMRI data analyses (5 females), with calibration failure in the other three subjects. Fixations during Learning trials in regions of interest (ROIs) corresponding to object and face locations were analyzed with custom scripts in MATLAB (The MathWorks, Inc.).

Timecourse analyses of normalized mean viewing values were performed using paired t-tests. To account for multiple comparisons as well as auto-correlation of these timeseries, we used the Benjamini-Hochberg procedure for controlling FDR, which is typically preferred when measures are not statistically independent (Hochberg and Benjamini, 1990).
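For reference, the step-up logic of the Benjamini-Hochberg procedure can be sketched as below. This is a generic textbook version rather than the authors' analysis code, and the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans marking which hypotheses are rejected
    when the false-discovery rate is controlled at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Find the largest rank k with p_(k) <= (k / m) * q.
        if p_values[idx] <= rank * q / m:
            cutoff = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[idx] = True
    return rejected


# Hypothetical p-values, e.g. one paired t-test per timepoint.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
example_rejected = benjamini_hochberg(example_p)
```

In this example only the two smallest p-values fall at or below their rank-scaled thresholds (0.005 and 0.010), so only those two tests are rejected.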

Highlights.

  • Exploration decisions during learning were strategically based on information

  • Decisions and brain activity were sensitive to distinct modeled information types

  • Hippocampal-prefrontal and dorsal striatal areas distinctly contributed to decisions

  • Increased network connectivity during decision-making predicted learning success

Acknowledgements

We thank Neal Cohen, Patrick Watson, Hillary Schwarb, and Kelly Brandstatt for helpful comments. Research was supported by award number P50-MH094263 from the National Institute of Mental Health and R00-NS069788 and F32-NS083340 from the National Institute of Neurological Disorders and Stroke. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  1. Adcock RA, Thangavel A, Whitfield-Gabrieli S, Knutson B, Gabrieli JD. Reward-motivated learning: mesolimbic activation precedes memory formation. Neuron. 2006;50:507–517. doi: 10.1016/j.neuron.2006.03.036.
  2. Alexander GE, DeLong MR, Strick PL. Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annu. Rev. Neurosci. 1986;9:357–381. doi: 10.1146/annurev.ne.09.030186.002041.
  3. Badre D, Hoffman J, Cooney JW, D'Esposito M. Hierarchical cognitive control deficits following damage to the human frontal lobe. Nat. Neurosci. 2009;12:515–522. doi: 10.1038/nn.2277.
  4. Beard BL, Ahumada AJ. A technique to extract relevant image features for visual tasks. Proc. SPIE. 1998;3299:79–85.
  5. Boorman ED, Behrens TE, Woolrich MW, Rushworth MF. How green is the grass on the other side? Frontopolar cortex and the evidence in favor of alternative courses of action. Neuron. 2009;62:733–743. doi: 10.1016/j.neuron.2009.05.014.
  6. Buckner RL. The role of the hippocampus in prediction and imagination. Annu. Rev. Psychol. 2010;61:27–48. C21–C28. doi: 10.1146/annurev.psych.60.110707.163508.
  7. Buckner RL, Carroll DC. Self-projection and the brain. Trends Cogn. Sci. 2007;11:49–57. doi: 10.1016/j.tics.2006.11.004.
  8. Buschman TJ, Denovellis EL, Diogo C, Bullock D, Miller EK. Synchronous oscillatory neural ensembles for rules in the prefrontal cortex. Neuron. 2012;76:838–846. doi: 10.1016/j.neuron.2012.09.029.
  9. Chen G, Saad ZS, Nath AR, Beauchamp MS, Cox RW. FMRI group analysis combining effect estimates and their variances. Neuroimage. 2012;60:747–765. doi: 10.1016/j.neuroimage.2011.12.060.
  10. Cole MW, Reynolds JR, Power JD, Repovs G, Anticevic A, Braver TS. Multi-task connectivity reveals flexible hubs for adaptive task control. Nat. Neurosci. 2013;16:1348–1355. doi: 10.1038/nn.3470.
  11. Cox RW. AFNI: software for analysis and visualization of functional magnetic resonance neuroimages. Comput. Biomed. Res. 1996;29:162–173. doi: 10.1006/cbmr.1996.0014.
  12. Ding L, Gold JI. The basal ganglia's contributions to perceptual decision making. Neuron. 2013;79:640–649. doi: 10.1016/j.neuron.2013.07.042.
  13. Eichenbaum H, Fortin NJ. The neurobiology of memory based predictions. Philos. Trans. R. Soc. Lond. B. 2009;364:1183–1191. doi: 10.1098/rstb.2008.0306.
  14. Fair DA, Schlaggar BL, Cohen AL, Miezin FM, Dosenbach NU, Wenger KK, Fox MD, Snyder AZ, Raichle ME, Petersen SE. A method for using blocked and event-related fMRI data to study "resting state" functional connectivity. Neuroimage. 2007;35:396–405. doi: 10.1016/j.neuroimage.2006.11.051.
  15. Forman SD, Cohen JD, Fitzgerald M, Eddy WF, Mintun MA, Noll DC. Improved assessment of significant activation in functional magnetic resonance imaging (fMRI): use of a cluster-size threshold. Magn. Reson. Med. 1995;33:636–647. doi: 10.1002/mrm.1910330508.
  16. Frank MJ, Claus ED. Anatomy of a decision: striato-orbitofrontal interactions in reinforcement learning, decision making, and reversal. Psychol. Rev. 2006;113:300–326. doi: 10.1037/0033-295X.113.2.300.
  17. Frank MJ, Loughry B, O'Reilly RC. Interactions between frontal cortex and basal ganglia in working memory: a computational model. Cogn. Affect. Behav. Neurosci. 2001;1:137–160. doi: 10.3758/cabn.1.2.137.
  18. Fujisawa S, Buzsaki G. A 4 Hz oscillation adaptively synchronizes prefrontal, VTA, hippocampal activities. Neuron. 2011;72:153–165. doi: 10.1016/j.neuron.2011.08.018.
  19. Guitart-Masip M, Barnes GR, Horner A, Bauer M, Dolan RJ, Duzel E. Synchronization of medial temporal lobe and prefrontal rhythms in human decision making. J. Neurosci. 2013;33:442–451. doi: 10.1523/JNEUROSCI.2573-12.2013.
  20. Gupta R, Duff MC, Denburg NL, Cohen NJ, Bechara A, Tranel D. Declarative memory is critical for sustained advantageous complex decision-making. Neuropsychologia. 2009;47:1686–1693. doi: 10.1016/j.neuropsychologia.2009.02.007.
  21. Guthrie M, Leblois A, Garenne A, Boraud T. Interaction between cognitive and motor cortico-basal ganglia loops during decision making: a computational study. J. Neurophysiol. 2013;109:3025–3040. doi: 10.1152/jn.00026.2013.
  22. Hannula DE, Althoff RR, Warren DE, Riggs L, Cohen NJ, Ryan JD. Worth a glance: using eye movements to investigate the cognitive neuroscience of memory. Front. Hum. Neurosci. 2010;4:166. doi: 10.3389/fnhum.2010.00166.
  23. Hansen N, Manahan-Vaughan D. Dopamine D1/D5 Receptors Mediate Informational Saliency that Promotes Persistent Hippocampal Long-Term Plasticity. Cereb. Cortex. 2012 doi: 10.1093/cercor/bhs362.
  24. Hills TT. Animal foraging and the evolution of goal-directed cognition. Cogn. Sci. 2006;30:3–41. doi: 10.1207/s15516709cog0000_50.
  25. Hochberg Y, Benjamini Y. More powerful procedures for multiple significance testing. Stat. Med. 1990;9:811–818. doi: 10.1002/sim.4780090710.
  26. Holmes CJ, Hoge R, Collins L, Woods R, Toga AW, Evans AC. Enhancement of MR images using registration for signal averaging. J. Comput. Assist. Tomogr. 1998;22:324–333. doi: 10.1097/00004728-199803000-00032.
  27. Honey CJ, Sporns O, Cammoun L, Gigandet X, Thiran JP, Meuli R, Hagmann P. Predicting human resting-state functional connectivity from structural connectivity. Proc. Natl. Acad. Sci. USA. 2009;106:2035–2040. doi: 10.1073/pnas.0811168106.
  28. Johnson A, Redish AD. Neural ensembles in CA3 transiently encode paths forward of the animal at a decision point. J. Neurosci. 2007;27:12176–12189. doi: 10.1523/JNEUROSCI.3761-07.2007.
  29. Johnson A, Varberg Z, Benhardus J, Maahs A, Schrater P. The hippocampus and exploration: dynamically evolving behavior and neural representations. Front. Hum. Neurosci. 2012;6:216. doi: 10.3389/fnhum.2012.00216.
  30. Kahn I, Shohamy D. Intrinsic connectivity between the hippocampus, nucleus accumbens, and ventral tegmental area in humans. Hippocampus. 2013;23:187–192. doi: 10.1002/hipo.22077.
  31. Kim HF, Hikosaka O. Distinct basal ganglia circuits controlling behaviors guided by flexible and stable values. Neuron. 2013;79:1001–1010. doi: 10.1016/j.neuron.2013.06.044.
  32. Koechlin E, Corrado G, Pietrini P, Grafman J. Dissociating the role of the medial and lateral anterior prefrontal cortex in human planning. Proc. Natl. Acad. Sci. USA. 2000;97:7651–7656. doi: 10.1073/pnas.130177397.
  33. Lee D, Seo H. Mechanisms of reinforcement learning and decision making in the primate dorsolateral prefrontal cortex. Ann. N. Y. Acad. Sci. 2007;1104:108–122. doi: 10.1196/annals.1390.007.
  34. Martin JH. Neuroanatomy: Text and Atlas. 2nd edn. Stamford, Connecticut: Appleton & Lange; 1996.
  35. Metcalfe J, Jacobs WJ. People's study time allocation and its relation to animal foraging. Behav. Process. 2010;83:213–221. doi: 10.1016/j.beproc.2009.12.011.
  36. Navawongse R, Eichenbaum H. Distinct pathways for rule-based retrieval and spatial mapping of memory representations in hippocampal neurons. J. Neurosci. 2013;33:1002–1013. doi: 10.1523/JNEUROSCI.3891-12.2013.
  37. Norman-Haignere SV, McCarthy G, Chun MM, Turk-Browne NB. Category-selective background connectivity in ventral visual cortex. Cereb. Cortex. 2012;22:391–402. doi: 10.1093/cercor/bhr118.
  38. O'Reilly JX, Croxson PL, Jbabdi S, Sallet J, Noonan MP, Mars RB, Browning PG, Wilson CR, Mitchell AS, Miller KL, et al. Causal effect of disconnection lesions on interhemispheric functional connectivity in rhesus monkeys. Proc. Natl. Acad. Sci. USA. 2013;110:13982–13987. doi: 10.1073/pnas.1305062110.
  39. Packard MG, Knowlton BJ. Learning and memory functions of the Basal Ganglia. Annu. Rev. Neurosci. 2002;25:563–593. doi: 10.1146/annurev.neuro.25.112701.142937.
  40. Pennartz CM, Berke JD, Graybiel AM, Ito R, Lansink CS, van der Meer M, Redish AD, Smith KS, Voorn P. Corticostriatal Interactions during Learning, Memory Processing, and Decision Making. J. Neurosci. 2009;29:12831–12838. doi: 10.1523/JNEUROSCI.3177-09.2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Pirolli P, Card S. Information foraging. Psychol. Rev. 1999;106:643–675. [Google Scholar]
  42. Ross RS, Sherrill KR, Stern CE. The hippocampus is functionally connected to the striatum and orbitofrontal cortex during context dependent decision making. Brain Res. 2011;1423:53–66. doi: 10.1016/j.brainres.2011.09.038. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Sadeh T, Shohamy D, Levy DR, Reggev N, Maril A. Cooperation between the hippocampus and the striatum during episodic encoding. J. Cogn. Neurosci. 2011;23:1597–1608. doi: 10.1162/jocn.2010.21549. [DOI] [PubMed] [Google Scholar]
  44. Schacter DL, Addis DR, Hassabis D, Martin VC, Spreng RN, Szpunar KK. The future of memory: remembering, imagining, and the brain. Neuron. 2012;76:677–694. doi: 10.1016/j.neuron.2012.11.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Shohamy D, Adcock RA. Dopamine and adaptive memory. Trends Cogn. Sci. 2010;14:464–472. doi: 10.1016/j.tics.2010.08.002. [DOI] [PubMed] [Google Scholar]
  46. Shohamy D, Myers CE, Grossman S, Sage J, Gluck MA, Poldrack RA. Cortico-striatal contributions to feedback-based learning: converging data from neuroimaging and neuropsychology. Brain. 2004;127:851–859. doi: 10.1093/brain/awh100. [DOI] [PubMed] [Google Scholar]
  47. Strange BA, Duggins A, Penny W, Dolan RJ, Friston KJ. Information theory, novelty and hippocampal responses: unpredicted or unpredictable? Neural Networks. 2005;18:225–230. doi: 10.1016/j.neunet.2004.12.004. [DOI] [PubMed] [Google Scholar]
  48. Tolman EC. Cognitive maps in rats and men. Psychol. Rev. 1948;55:189–208. doi: 10.1037/h0061626. [DOI] [PubMed] [Google Scholar]
  49. Voss JL, Gonsalves BD, Federmeier KD, Tranel D, Cohen NJ. Hippocampal brain-network coordination during volitional exploratory behavior enhances learning. Nat. Neurosci. 2011a;14:115–120. doi: 10.1038/nn.2693. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Voss JL, Warren DE, Gonsalves BD, Federmeier KD, Tranel D, Cohen NJ. Spontaneous revisitation during visual exploration as a link among strategic behavior, learning, and the hippocampus. Proc. Natl. Acad. Sci. USA. 2011b;108:E402–E409. doi: 10.1073/pnas.1100225108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Wang JX, Bartolotti J, Amaral LA, Booth JR. Changes in task-related functional connectivity across multiple spatial scales are related to reading performance. PLoS One. 2013;8:e59204. doi: 10.1371/journal.pone.0059204. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Wang JX, Cohen NJ, Voss JL. Covert rapid action-memory simulation (CRAMS): A hypothesis of hippocampal-prefrontal interactions for adaptive behavior. Neurobiol. Learn. Mem. doi: 10.1016/j.nlm.2014.04.003. (in press). [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Yee LTS, Warren DE, Voss JL, Duff MC, Tranel D, Cohen NJ. The hippocampus uses information just encountered to guide efficient ongoing behavior. Hippocampus. 2013;24:154–164. doi: 10.1002/hipo.22211. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

01
02
