Abstract
Primates use perceptual and mnemonic visuospatial representations to perform everyday functions. Neurons in the lateral prefrontal cortex (LPFC) have been shown to encode both of these representations during tasks where eye movements are strictly controlled and visual stimuli are reduced in complexity. This raises the question of whether perceptual and mnemonic representations encoded by LPFC neurons remain robust during naturalistic vision—in the presence of a rich visual scenery and during eye movements. Here we investigate this issue by training macaque monkeys to perform working memory and perception tasks in a visually complex virtual environment that requires navigation using a joystick and allows for free visual exploration of the scene. We recorded the activity of 3950 neurons in the LPFC (areas 8a and 9/46) of two male rhesus macaques using multielectrode arrays, and measured eye movements using video tracking. We found that navigation trajectories to target locations and eye movement behavior differed between the perception and working memory tasks, suggesting that animals used different behavioral strategies. Single neurons were tuned to target location during cue encoding and working memory delay, and neural ensemble activity was predictive of the behavior of the animals. Neural decoding of the target location was stable throughout the working memory delay epoch. However, neural representations of similar target locations differed between the working memory and perception tasks. These findings indicate that during naturalistic vision, LPFC neurons maintain robust and distinct neural codes for mnemonic and perceptual visuospatial representations.
SIGNIFICANCE STATEMENT We show that lateral prefrontal cortex neurons encode working memory and perceptual representations during a naturalistic task set in a virtual environment. We show that despite eye movement and complex visual input, neurons maintain robust working memory representations of space, which are distinct from neuronal representations for perception. We further provide novel insight into the use of virtual environments to construct behavioral tasks for electrophysiological experiments.
Keywords: nonhuman primate, prefrontal cortex, visual perception, working memory
Introduction
Seminal lesion studies in the early 20th century demonstrated that the primate lateral prefrontal cortex (LPFC) plays a pivotal role during delayed response tasks involving the maintenance of information in working memory (WM; Baddeley, 1986; for review, see Roussy et al., 2021b). Neurons in the LPFC maintain WM representations of space (Funahashi et al., 1989; Goldman-Rakic, 1995; Miller et al., 1996; Suzuki and Gottlieb, 2013; Leavitt et al., 2017b; Constantinidis et al., 2018), as well as perceptual representations (Mendoza-Halliday and Martinez-Trujillo, 2017; Roussy et al., 2021b). However, neurons in the LPFC are also thought to encode signals related to eye position (Hasegawa et al., 1998; Boulay et al., 2016; Bullock et al., 2017). Many of the previous studies of visual WM and perception in the LPFC that sampled neuronal activity have been conducted in conditions where gaze is constrained, and stimuli are shown on a homogeneous computer screen. However, during natural vision, primates sample complex information via gaze shifts in visual scenes that contain multiple items and variable layouts. It is unclear whether perceptual and WM representations in LPFC neurons remain invariable or deteriorate under these naturalistic conditions.
One of the most universally recognized spatial WM tasks is the oculomotor delayed response (ODR) task in which animals are required to saccade to a remembered cued location (Funahashi et al., 1989; Leavitt et al., 2018). During the cue presentation and delay epoch of the task, animals must maintain gaze on a fixation point. Breaking fixation results in an “error trial,” meaning that correct performance of the task is contingent on maintaining proper eye position during the delay epoch. This intentional and task-pertinent eye fixation limits the possible effect of gaze shifts and eye position on the measured neuronal activity. However, this strict control of eye position during memory maintenance deviates from how WM is used in naturalistic conditions. In day-to-day life, we move our eyes while using WM, yet we can maintain robust WM representations of locations despite those changes in eye position. It is currently unclear how unrestrained eye position in a visually complex environment may affect the ability of neurons and neuronal ensembles in the LPFC to represent perceptual and mnemonic information.
Here, we measure firing rates of neurons in the LPFC of two macaques during virtual WM and perceptual tasks while allowing the animals to freely view a rich visual environment. We recorded the activity of 3950 neurons in the LPFC (areas 8a and 9/46; Petrides, 2005) of both animals while measuring eye position. Neuronal activity was predictive of target location during WM and perception despite changes in eye position. Eye position poorly predicted target location when compared with neuronal activity. Additionally, using linear classifiers, we found that coding of remembered and perceived targets does not generalize in LPFC neuronal populations.
Materials and Methods
The same two male rhesus macaques (Macaca mulatta) were used in both tasks (age, 10 and 9 years; weight, 12 and 10 kg).
Ethics statement.
Animal care and handling including basic care, animal training, surgical procedures, and experimental injections were preapproved by the Western University Animal Care Committee. This approval ensures that federal (Canadian Council on Animal Care), provincial (Ontario Animals in Research Act), and other national Canadian Association for Laboratory Animal Medicine standards for the ethical use of animals are followed. Regular assessments for physical and psychological well being of the animals were conducted by researchers, registered veterinary technicians, and veterinarians.
Task.
The current task takes place in a virtual environment that was created using the Unreal Developer Kit (UDK; May 2012 release; Epic Games). The nine targets were arranged in a 3 × 3 grid spaced ∼0.5 s apart (movement speed during navigation was fixed). For the working memory task, the target is present only during the cue epoch. For the perception task, the target is present in the cue, delay, and response epochs. Detailed descriptions of this platform and the recording setup can be found in the study by Doucet et al. (2016).
Experimental setup.
During the task training period, animals were implanted with custom-fit PEEK (polyetheretherketone) cranial implants, which housed the head posts and recording equipment [Neuronitek (for more information, see Blonde et al., 2018)]. Subjects performed all experiments while seated in a standard primate chair (Neuronitek) located in an isolated radio frequency-shielded room with the only illumination originating from the computer monitor. Animals were head posted during experiments and received a juice reward through an electronic reward integration system (Crist Instruments). The task was presented on a computer LCD monitor (27 inch; model VG278H, ASUS; resolution, 1024 × 768 pixel; refresh rate, 75 Hz; screen height, 33.5 cm; screen width, 45 cm) positioned 80 cm from the eyes of the animals. Eye position was tracked using a video-oculography system (EyeLink 1000, SR Research) with sampling at 500 Hz.
Microelectrode array implant.
In each animal, we chronically implanted two 10 × 10 microelectrode arrays (96 channels; length, 1.5 mm; electrode separation, at least 0.4 mm; Utah Array, Blackrock Neurotech) in the left LPFC (area 8a dorsal and ventral, anterior to the arcuate sulcus, and on either side of the principal sulcus; Petrides, 2005). Electrode arrays were placed and impacted ∼1.5 mm into the cortex. Reference wires were placed beneath the dura, and a grounding wire was attached between screws in contact with the pedestal and the border of the craniotomy.
Processing of neuronal data.
Neuronal data were recorded using a Cerebus Neuronal Signal Processor (Blackrock Neurotech) via a Cereport adapter. The neuronal signal was digitized (16 bit) at a sample rate of 30 kHz. Spike waveforms were detected online by thresholding at 3.4 SDs of the signal. The extracted spikes were semiautomatically resorted with techniques using the Plexon Offline Sorter (Plexon). Sorting results were then manually refined. We collected behavioral data across 20 WM sessions [nonhuman primate (NHP) B, 12 sessions; NHP T, 8 sessions] and neuronal data from 19 WM sessions. Behavior was recorded from 19 perception sessions (NHP B, 14 sessions; NHP T, 5 sessions). Neuronal data were analyzed from 13 sessions in which the WM and perception tasks were performed during the same session.
Task performance.
The percentage of correct trials was calculated for both the WM and perception tasks. Response time was calculated for correct trials as the duration between the start of navigation and the time at which animals reached the correct target location. The task arena was divided into a 4 × 4 grid forming 16 area cells (see Fig. 2d). The trajectory of the animal was calculated for each trial and consisted of x and y coordinates sampled every 0.002 s. We calculated the number of samples that fell within each cell; this determined which cells the animals entered during navigation as well as how much of the total trajectory fell within each cell (related to time spent in cells). Our optimal trajectory measure is calculated by dividing the real length of the trajectory (the sum of the Euclidean distances between consecutive x, y positional data points) by the optimal distance (the Euclidean distance from the start location to the target location for a particular trial). A value of 1 indicates the shortest possible (i.e., most optimal) trajectory length.
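The two trajectory measures described above can be illustrated with a minimal Python sketch. The function names, the arena bounds passed as `x_range` and `y_range`, and the toy coordinates are assumptions made for illustration; the original analyses were run in MATLAB and may differ in detail.

```python
import math

def path_length(xs, ys):
    """Sum of Euclidean distances between consecutive (x, y) samples."""
    return sum(
        math.hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i])
        for i in range(len(xs) - 1)
    )

def optimal_trajectory_ratio(xs, ys, target):
    """Travelled length divided by the straight-line start-to-target distance.

    A value of 1 corresponds to the shortest possible path.
    """
    straight = math.hypot(target[0] - xs[0], target[1] - ys[0])
    return path_length(xs, ys) / straight

def cell_counts(xs, ys, x_range, y_range, n=4):
    """Count trajectory samples in each cell of an n x n grid.

    Assumes every sample lies inside the arena bounds (hypothetical inputs).
    """
    counts = [[0] * n for _ in range(n)]
    for x, y in zip(xs, ys):
        col = min(int((x - x_range[0]) / (x_range[1] - x_range[0]) * n), n - 1)
        row = min(int((y - y_range[0]) / (y_range[1] - y_range[0]) * n), n - 1)
        counts[row][col] += 1
    return counts

# Toy trajectory: 3 units right, then 4 units up, to a target at (3, 4).
xs, ys = [0, 3, 3], [0, 0, 4]
ratio = optimal_trajectory_ratio(xs, ys, (3, 4))   # 7 / 5 = 1.4
grid = cell_counts(xs, ys, (0, 4), (0, 4))
```

In this toy case the path is 40% longer than the straight line, and each of the three samples falls in a different grid cell.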
Figure 2.
Task behavior. a, Percentage of correct trials for the working memory and perception tasks for each animal. Dark gray lines represent mean values, and each data point represents a session. b, Response time for correct trials for the working memory and perception tasks for each animal. Dark gray lines represent mean values, and each data point represents a session. c, Animal trajectories plotted for an example session and two example target locations (in pink) in which green trajectories indicate correct trials and black trajectories indicate incorrect trials. Example sessions are included for working memory and perception tasks as well as for both animals. d, The virtual arena divided into 16 regional cells. The number of times each cell is entered (i.e., the number of trajectory points within each cell) is shown averaged over sessions for two example locations (in pink). Examples are included for working memory and perception tasks as well as both animals. e, The optimal trajectory measure shows that the optimal trajectory to correct target locations is based on path length in which a value of 1, marked by the gray dashed line, reflects the shortest possible path. The optimal trajectory is plotted for the working memory and perception tasks for each animal. Dark gray lines represent median values, and each data point represents a session. *p < 0.01, **p < 0.001, ***p < 0.0001.
Characterizing eye movement.
The percentage of eye data points on-screen is calculated as the number of data points that fall within the screen limits divided by the total number of eye data points during a given epoch. Off-screen data points occur when the animal looks outside of the defined screen limits or when the animal closes its eyes (i.e., during blinking).
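As a rough illustration of this measure, the sketch below counts on-screen samples in Python. It assumes gaze samples are given in pixel coordinates with the screen spanning (0, 0) to (width, height), and that lost samples (e.g., blinks) are stored as `None` or NaN; these conventions are hypothetical and not taken from the recording software.

```python
import math

def percent_on_screen(gaze, width, height):
    """Percentage of gaze samples that fall within the screen limits.

    gaze: list of (x, y) tuples; None or NaN marks a lost sample (assumption).
    Lost samples count as off-screen, as do samples outside the limits.
    """
    if not gaze:
        return float("nan")
    on_screen = 0
    for sample in gaze:
        if sample is None:
            continue
        x, y = sample
        if math.isnan(x) or math.isnan(y):
            continue
        if 0 <= x <= width and 0 <= y <= height:
            on_screen += 1
    return 100.0 * on_screen / len(gaze)

# Two of four samples are on a 1024 x 768 screen.
samples = [(10, 10), (2000, 10), None, (512, 384)]
pct = percent_on_screen(samples, 1024, 768)
```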
We characterized eye movements as saccades, fixations, or smooth pursuits based on methods outlined in the study by Corrigan et al. (2017). Eye movement data were first cleaned by removing blinks, periods of lost signal, or corneal-loss spikes (occurs when corneal reflection is lost and regained). The clean eye signal was smoothed with a second-order Savitzky–Golay filter with a window of 11 samples. Saccades were identified by periods of high angular acceleration of the eye of at least 10 ms. Individual saccades were determined by intersaccadic intervals of at least 40 ms. Saccade start and end points were determined by consistent direction and velocity considering a threshold of continuous change of >20° for at least three samples, or an acute change of >60° at one sample. Foveations were classified as fixations or smooth pursuits based on sample direction and ratios of distances. Dispersion of samples, consistency of direction, total path displacement, and the total spatial range were considered.
We calculated the percentage of total eye movement events classified as fixations or saccades for each epoch during WM and perception and the percentage of smooth pursuits for the response epoch.
Main sequence calculation.
The main sequence reflects the relationship between the amplitude of the saccade and the peak velocity of the eye rotation toward the end point of the saccade. Saccade amplitude and velocity can change based on the value of the saccade target (Bendiksby and Platt, 2006) or the alertness of the subject (Di Stasi et al., 2013). To calculate the main sequence, we separated saccades into bins of 3° of amplitude, starting at 2°, and computed the average peak velocity for each bin. Saccades within the same amplitude bins were matched between tasks to account for the influence of saccade start location and direction (direction with a tolerance of ±13°, and the starting location within 7°).
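The binning step can be sketched in Python as follows. This is only an illustration of grouping saccades into 3° amplitude bins starting at 2° and averaging peak velocity per bin; the function name and the toy values are assumptions, and the matching of saccades between tasks by start location and direction is not shown.

```python
def main_sequence(amplitudes, peak_velocities, start=2.0, width=3.0):
    """Mean peak velocity per amplitude bin.

    Returns a dict mapping the lower edge of each bin (deg) to the mean
    peak velocity of the saccades in that bin. Saccades smaller than
    `start` are ignored.
    """
    sums = {}
    for amp, vel in zip(amplitudes, peak_velocities):
        if amp < start:
            continue
        b = int((amp - start) // width)
        total, n = sums.get(b, (0.0, 0))
        sums[b] = (total + vel, n + 1)
    return {start + b * width: total / n for b, (total, n) in sorted(sums.items())}

# Toy saccades (amplitude in deg, peak velocity in deg/s).
amps = [1.0, 2.5, 4.0, 5.0, 9.0]
vels = [50, 100, 200, 300, 500]
curve = main_sequence(amps, vels)
```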
Spatial tuning.
Tuning for spatial location was computed in all units (3950, 3092 in NHP B, 858 in NHP T) in 19 WM sessions using Kruskal–Wallis one-way ANOVA on epoch-averaged firing rates with target location as the independent variable. A neuron was defined as tuned if the test resulted in p < 0.05.
Fano factor.
Trial-to-trial variability in neuron activity was examined using the Fano factor, a measure of spike count variability in relation to the mean number of spikes. This was calculated as follows:
$$F = \frac{\sigma_W^2}{\mu_W}$$
where $\sigma_W$ is the trial-to-trial SD of the firing rate (spike counts) and $\sigma_W^2$ is the variance during time window $W$; $\mu_W$ is the mean trial-to-trial spike count during the same time window. The Fano factor was calculated for single delay-selective neurons over the delay epoch (2000 ms) and for population-averaged activity over the delay epoch (2000 ms).
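For concreteness, the Fano factor (spike count variance divided by mean spike count across trials) can be computed as in the Python sketch below. The use of the sample variance (n − 1 denominator) is an assumption; the text does not state which estimator was used.

```python
from statistics import mean, variance

def fano_factor(spike_counts):
    """Variance of spike counts across trials divided by their mean.

    spike_counts: one spike count per trial for a fixed time window.
    Uses the sample variance (assumption). Returns NaN for a zero mean.
    """
    m = mean(spike_counts)
    if m == 0:
        return float("nan")
    return variance(spike_counts) / m

# Four trials with spike counts in the same 2000 ms window (toy values).
ff = fano_factor([2, 4, 4, 6])   # variance 8/3, mean 4
```

A value near 1 is what a Poisson process would give; identical counts on every trial give 0.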
Decoding target location from neuronal ensembles.
We used a linear classifier [Support Vector Machine (SVM); Libsvm 3.14; Fan et al., 2008] with fivefold cross-validation to decode the target position from z score-normalized population-level responses using both single units and multiunits on a single-trial basis. We grouped targets based on location in the virtual arena into the following three groups: right targets, center targets, and left targets (leaving us with three classes; 33.33% chance level). We used the best ensemble method detailed in the study by Leavitt et al. (2017a), in which we determined the highest performing neuron, paired this neuron with all others in the population to achieve the best pair, and combined the best pair iteratively with all other neurons to form the best trio. This was repeated until we reached a best ensemble of 20 neurons. The classifiers used firing rates calculated over 500 ms time windows. Decoding accuracy at each time window was compared with chance performance using t tests.
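The iterative ensemble-building procedure is a greedy forward selection, which can be sketched in Python as below. Here `score` is a hypothetical callable standing in for the cross-validated SVM decoding accuracy of a candidate ensemble; the toy scoring function is purely illustrative and is not the classifier used in the study.

```python
def best_ensemble(n_neurons, score, max_size=20):
    """Greedy forward selection of neurons.

    At each step, every remaining neuron is added in turn to the current
    ensemble, and the one giving the highest score is kept.
    score: callable taking a list of neuron indices and returning a number
    (in practice, cross-validated decoding accuracy; assumption here).
    Returns the selected indices and the score after each addition.
    """
    chosen, history = [], []
    remaining = set(range(n_neurons))
    while remaining and len(chosen) < max_size:
        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(remaining), key=lambda j: score(chosen + [j]))
        chosen.append(best)
        remaining.remove(best)
        history.append(score(chosen))
    return chosen, history

# Toy score: each neuron contributes a fixed, additive amount.
weights = [0.1, 0.5, 0.3, 0.2]
toy_score = lambda ensemble: sum(weights[i] for i in ensemble)
ensemble, curve = best_ensemble(4, toy_score, max_size=3)
```

With a real classifier the score is not additive, so the neuron added at each step depends on which neurons are already in the ensemble.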
We used 13 sessions for the comparison between the decoding of target column (left, right, center) using either correct or incorrect trials. These sessions were used because they contained samples from each target condition for incorrect trials. The number of trial observations was balanced between correct and incorrect trials for each session using data sampling. Results were averaged over 10 iterations of random sampling without replacement.
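A minimal Python sketch of the balancing step is given below, assuming trials are identified by arbitrary labels and that the larger set is subsampled to the size of the smaller one. The function name and the trial identifiers are hypothetical.

```python
import random

def balance_trials(correct, incorrect, rng):
    """Subsample both trial lists, without replacement, to the same size."""
    n = min(len(correct), len(incorrect))
    return rng.sample(correct, n), rng.sample(incorrect, n)

# Repeat the random draw 10 times, as in the averaging described above.
rng = random.Random(0)
correct_ids = list(range(10))
incorrect_ids = [100, 101, 102]
draws = [balance_trials(correct_ids, incorrect_ids, rng) for _ in range(10)]
```

Each draw would then be passed to the classifier, and the resulting accuracies averaged across the 10 iterations.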
Gaze analysis.
We calculated the total fixation time during the delay epoch as well as the fixation time on the trial-specific target location for correct trials and incorrect trials. We compared the proportion of fixation time on the target location related to all fixation time during delay (target location fixation duration/total fixation duration) between correct and incorrect trials.
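The proportion described above can be sketched in Python as follows. The criterion used here for a fixation being "on target" (within a fixed radius of the target position on screen) is an assumption for illustration; the text does not specify how target fixations were delimited.

```python
import math

def target_fixation_proportion(fixations, target, radius):
    """Fixation time on the target divided by total fixation time.

    fixations: list of (x, y, duration) tuples for one delay epoch.
    target: (x, y) screen position of the target.
    radius: distance within which a fixation counts as on target (assumption).
    """
    total = sum(d for _, _, d in fixations)
    if total == 0:
        return float("nan")
    on_target = sum(
        d for x, y, d in fixations
        if math.hypot(x - target[0], y - target[1]) <= radius
    )
    return on_target / total

# One 100 ms fixation on the target and one 300 ms fixation elsewhere.
prop = target_fixation_proportion([(0, 0, 100), (10, 0, 300)], (0, 0), 5)
```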
Decoding target location using eye position.
The screen was divided into 16 cells of equal dimensions. The number of foveations classified as fixations was calculated within each cell during the cue and delay epochs. We used a linear classifier (SVM) with fivefold cross-validation to determine whether the target location could be predicted by the number of fixations within each area of the screen under the assumption that animals gather information from the virtual environment during such fixation periods (Corrigan et al., 2017).
Decoding eye position from neuronal data.
To examine the influence of saccade direction, amplitude, and fixation (gaze) position, we calculated the firing rate, fixation location, saccade direction, and amplitude during fixation periods in the delay epoch. We designed a linear regression for each neuron using firing rate during the fixation as the response (dependent variable) and binned saccade direction (binned into eight bins spanning 45° of a 360° direction circular space), saccade amplitude in degrees of visual angle, and fixation position (x, y screen coordinates) as predictors (independent variables), as follows:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4$$
in which $y$ is the single neuron firing rate during all fixations over a session, $x_1$ is the saccade direction (categorical predictor derived by dividing the visual field into eight sections, binning saccade direction by degrees), $x_2$ is the saccade amplitude (in degrees of visual angle), $x_3$ is the fixation position (x-screen coordinate), and $x_4$ is the fixation position (y-screen coordinate).
We then used the residual firing rates from this model for each neuron as input into an SVM linear classifier with fivefold cross-validation to predict the target condition. We used an SVM classifier with the same parameters to also predict the target location from the raw firing rates during the same fixation periods.
We used a linear classifier (SVM) with fivefold cross-validation to decode eye position on screen based on neuronal firing rates during periods of eye fixation. Four target locations were selected as part of this analysis since their locations were nonoverlapping on screen. Fixation periods occurring in either the cue or delay epoch that fell within these regions were used. Short fixation periods were removed (duration, <6 ms). The firing rate was calculated for each neuron during each fixation period and was z score normalized. Neuronal populations included single units and multiunits.
Decoding target location for working memory and perception.
We used a linear classifier (SVM) with fivefold cross-validation to decode target location (nine targets) based on population neuronal activity. We used 13 sessions in which animals performed both the WM and perception tasks so that we could use the same population of neurons. We altered training and testing conditions so that classifiers were either trained on population activity during congruent tasks or incongruent tasks (e.g., trained on WM and tested on perception).
We divided WM trials into two random and separate datasets, trained classifiers on one half of the trials, and tested them on the other half. For the WM task, we also trained and tested classifiers on either congruent or incongruent task epochs (e.g., trained during cue and tested during delay).
Statistics.
Statistical comparisons were conducted using 20 WM and 19 perceptual sessions. Thirteen sessions in which WM and perception were recorded in the same day were included when comparing neural activity between WM and perception tasks. Thirteen sessions that had a sufficient number of incorrect trials (i.e., incorrect trials for each target condition) were used for decoding using either correct or incorrect trials. ANOVA was used for group comparisons in which more than two groups are compared followed by Tukey–Kramer post hoc testing, controlling for multiple comparisons (a nonparametric equivalent was used in cases where data are non-normally distributed and best reflected by median values). t Tests were used when comparing the means of two groups, and Wilcoxon rank-sum tests were used when comparing median values of two groups. Additional statistical information is outlined in Table 1.
Table 1.
Statistics reporting table
Figures | Subject | Data counts | Statistical test | Comparison | Statistics | p Valuea |
---|---|---|---|---|---|---|
2a Percentage of correct trials | NHP B, NHP T | 20 WM sessions, 19 perception sessions | Two-way ANOVA; Tukey–Kramer multiple comparisons | Animal; task; interaction; NHP B Per–NHP B WM; NHP T Per–NHP T WM; NHP B WM–NHP T WM | F(1,35) = 84.7; F(1,35) = 199.6; F(1,35) = 58.9 | p < 0.0001; p < 0.0001; p < 0.0001; p < 0.0001; p < 0.0001; p < 0.0001 |
2b Response time | NHP B, NHP T | 20 WM sessions, 19 perception sessions | Two-way ANOVA | Animal; task; interaction | F(1,35) = 0.62; F(1,35) = 0.01; F(1,35) = 0.98 | p = 0.44; p = 0.94; p = 0.33 |
2e Optimal trajectory | NHP B, NHP T | 20 WM sessions, 19 perception sessions | Wilcoxon rank-sum test | NHP B; NHP T | Rank = 234; Rank = 72 | p = 0.0002; p = 0.02 |
3a Percentage on screen | NHP B, NHP T | 20 WM sessions, 19 perception sessions | Two-way ANOVA; Tukey–Kramer multiple comparisons | Epoch; task; interaction; cueWM–delayWM; delayWM–responseWM; cueWM–responseWM; cuePer–delayPer; delayPer–responsePer; cuePer–responsePer; cueWM–cuePer; delayWM–delayPer; responseWM–responsePer | F(2,111) = 6.9; F(1,111) = 8.4; F(2,111) = 20 | p = 0.002; p = 0.005; p < 0.0001; p = 0.01; p = 0.0004; p = 0.9; p = 0.8; p = 0.0007; p < 0.0001; p = 1; p = 0.44; p < 0.0001 |
3d Percentage of eye movement events | NHP B, NHP T | 20 WM sessions, 19 perception sessions | Two-way ANOVA; Tukey–Kramer multiple comparisons; Two-way ANOVA; Tukey–Kramer multiple comparisons | Fixation: Epoch; task; interaction; cueWM–delayWM; cueWM–responseWM; delayWM–responseWM; cuePer–delayPer; cuePer–responsePer; delayPer–responsePer; Saccade: Epoch; task; interaction; cueWM–delayWM; cueWM–responseWM; delayWM–responseWM; cuePer–delayPer; cuePer–responsePer; delayPer–responsePer; cueWM–cuePer; delayWM–delayPer; responseWM–responsePer | F(2,111) = 191.8; F(1,111) = 11.3; F(2,111) = 0.62; F(2,111) = 64; F(1,111) = 142.2; F(2,111) = 10.9 | p < 0.0001; p = 0.001; p = 0.54; p = 0.3; p < 0.0001; p < 0.0001; p = 0.85; p < 0.0001; p < 0.0001; p < 0.0001; p < 0.0001; p < 0.0001; p = 0.99; p < 0.0001; p < 0.0001; p < 0.0001; p = 0.007; p < 0.0001; p = 0.0007; p < 0.0001; p < 0.0001 |
3e Main sequence between epochs | NHP B, NHP T | 20 WM sessions, 19 perception sessions | One-way ANOVA; Tukey–Kramer; One-way ANOVA; Tukey–Kramer | WM: Amplitude bin; Delay–cue; Cue–response; Delay–response; Perception: Amplitude bin; Delay–cue; Cue–response; Delay–response | Cohen's d; Cohen's d; Cohen's d; Cohen's d; Cohen's d; Cohen's d | 4 bins: p < 0.05; 4 bins: p < 0.05; 2 bins: p > 0.2; 3 bins: p < 0.05; 3 bins: p > 0.2; 3 bins: p < 0.05; 1 bin: p > 0.2; 6 bins: p < 0.05; 4 bins: p < 0.05; 1 bin: p > 0.2; 6 bins: p < 0.05; 6 bins: p > 0.2; 6 bins: p < 0.05; 3 bins: p > 0.2 |
3f Main sequence on and off target | NHP B, NHP T | 20 WM sessions, 19 perception sessions | t test | WM; Perception | Cohen's d; Cohen's d | 0 bins: p < 0.05; 0 bins: p > 0.2; 6 bins: p < 0.05; 3 bins: p > 0.2 |
4g Compare neuron and population Fano factor | NHP B, NHP T | 19 WM sessions | t test | Single neuron and population Fano factor | t(1904) = 26.37 | p < 0.0001 |
6b Decoding ensemble over time | NHP B, NHP T | 19 WM sessions; 14 time windows; 4 time windows | Kruskal–Wallis | Time windows: All trial time; Delay time | h(13,252) = 17.3; h(3,72) = 4.9 | p = 0.19; p = 0.18 |
6c Decoding trial outcome | NHP B, NHP T | 19 WM sessions | t test | Compare decoding accuracy to chance (50%) | | p = 9.19e-07 |
6d Decoding using correct or incorrect trials | NHP B, NHP T | 13 WM sessions | t test | Correct–incorrect | t(24) = 4.04 | p = 4.71e-04 |
7c Fixation on target | NHP B, NHP T | 20 WM sessions | Wilcoxon rank-sum test | Correct–incorrect | Rank = 482 | p = 0.053 |
7d Eye position decoding | NHP B, NHP T | 20 WM sessions | Kruskal–Wallis; Tukey–Kramer multiple comparisons | Epochs; cueCue–delayDelay; cueCue–delayCue; cueCue–cueDelay; delayDelay–delayCue; delayDelay–cueDelay; delayCue–cueDelay | h(3,76) = 51.1 | p < 0.0001; p = 0.04; p < 0.0001; p < 0.0001; p = 0.002; p = 0.01; p = 0.96 |
7f Decoding using firing rate or residuals | NHP B, NHP T | 19 WM sessions | t test | Firing rate, residual values from linear model | t(36) = 1.14 | p = 0.26 |
7h Decoding neural data eye position | NHP B, NHP T | 19 WM sessions | Wilcoxon rank-sum test | Cue–delay | Rank = 472 | p = 0.003 |
8a Cue, cross-task decoding | NHP B, NHP T | 13 WM sessions, 13 perception sessions | Kruskal–Wallis; Tukey–Kramer multiple comparisons | Tasks; WMWM–PerPer; WMWM–WMPer; WMWM–PerWM; PerPer–WMPer; PerPer–PerWM; WMPer–PerWM | h(3,48) = 39.2 | p < 0.0001; p = 0.77; p < 0.0001; p < 0.0001; p = 0.0007; p = 0.0005; p = 1 |
8b Delay, cross-task decoding | NHP B, NHP T | 13 WM sessions, 13 perception sessions | Kruskal–Wallis; Tukey–Kramer multiple comparisons | Tasks; WMWM–PerPer; WMWM–WMPer; WMWM–PerWM; PerPer–WMPer; PerPer–PerWM; WMPer–PerWM | h(3,48) = 39.2 | p < 0.0001; p = 0.99; p < 0.0001; p = 0.0009; p < 0.0001; p = 0.0003; p = 0.79 |
8c WM half-trial decoding | NHP B, NHP T | 13 WM sessions | Kruskal–Wallis | Full- and half-WM trials | h(1,24) = 11.6 | p = 0.0006 |
8e Cross-temporal decoding | NHP B, NHP T | 19 WM sessions | Kruskal–Wallis | Time windows: All trial time; Delay time | h(3,72) = 7.7; h(13,252) = 21.7 | p = 0.05; p = 0.06 |
Per, Perception.
aBold indicates significance.
Data availability.
MATLAB codes used to analyze the data are available from author M.R. Data supporting the findings of this study are available from the corresponding authors on reasonable request and will be fulfilled by M.R.
Results
Naturalistic working memory and perception tasks
We developed a naturalistic spatial WM task using a virtual reality engine (Unreal Engine 3, UDK; Epic Games). The task took place in a virtual arena that allowed for free navigation using a joystick. Importantly, to simulate natural behavior, animals were permitted free visual exploration (unconstrained eye movements) during the entire trial duration. On each trial, a target was presented for 3 s during the cue epoch at one of nine locations in the virtual arena (Fig. 1a,b). In the WM task, the target then disappeared during a 2 s delay epoch. Navigation was disabled (i.e., joystick movements did not trigger any movement in the virtual arena) during the cue and delay epochs. Subsequently, navigation was enabled, and animals were required to virtually navigate to the target location within a 10 s response epoch to obtain a juice reward (Fig. 1c). We also developed a perceptual version of this task in which the target remained on screen for the trial duration (Fig. 1c). We trained two rhesus monkeys (M. mulatta) on both virtual tasks and recorded neuronal activity during task performance using two 96-channel microelectrode arrays (Utah Arrays) in each animal. Arrays were implanted in the left LPFC (areas 8a and 9/46; one on each side of the principal sulcus, anterior to the arcuate sulcus; Fig. 1d,e; Petrides, 2005).
Figure 1.
Experimental setup. a, Animal in task setup with joystick, reward system, eye recording system, and monitor displayed. b, Overhead view of the virtual environment indicating the start location and the nine target locations. c, Task timeline displaying the cue, delay, and response epochs for the working memory and perception tasks. The target remains on screen throughout the delay and response epochs during the perception task. d, 3D-modeled brain image from an MRI of NHP B with Utah Array locations in the left hemisphere indicated by pink squares. e, Intraoperative photographs showing the location of the implanted Utah Arrays in both animals.
Task performance and animal behavior
We analyzed behavior from 20 WM sessions (NHP B, 12 sessions; NHP T, 8 sessions) and 19 perception sessions (NHP B, 14 sessions; NHP T, 5 sessions). Both animals performed the tasks above chance (theoretical chance, ∼11%). Both animals performed significantly better on the perception task (NHP B: mean, 98%; NHP T: mean, 95%) than on the WM task (NHP B: mean, 87%; NHP T: mean, 57%), reflecting the increased difficulty of including a memory delay epoch (Fig. 2a). Response times for correct trials were consistent between animals and tasks (Fig. 2b).
We plotted animal trajectories to two example target locations to understand how animals were navigating in the virtual space (Fig. 2c). We divided the environment into a 16-cell grid and calculated the number of times that animals entered each cell as part of their navigation trajectory. Two example target locations averaged over all sessions are shown in Figure 2d. We next calculated the trajectory of animals in the environment in each correct trial from their starting location to the location of the target to determine how precisely animals navigated toward targets. This real trajectory length was divided by the optimal trajectory length (i.e., Euclidean distance from start to target location), resulting in a measure of deviation from optimal trajectory where a value of 1 indicates that animals took the shortest possible trajectory to a target. Trajectory lengths were similar between animals during perception (NHP B: median, 1.0; NHP T: median, 1.1) and during WM (NHP B: median, 1.8; NHP T: median, 1.9). However, trajectories were more optimal during the perception task than during the WM task, indicating less precise navigation to targets during WM, when the target was not visible (Fig. 2e). Overall, these results indicate that both animals used similar behavioral strategies to perform the tasks based on similar response times and trajectories.
Eye behavior during naturalistic working memory and perception
Our virtual reality setup allowed for precise tracking of eye movement and gaze position; therefore, we measured eye movement behavior during both tasks. First, we calculated the proportion of eye position data points falling within the presentation screen. “Eyes off screen” occurs when the animals close their eyes or, most often, when they look away from the screen. The proportion of eye data points falling within screen boundaries differed between task epochs and between the WM and perception tasks. During WM, animals maintained eye position on the screen less during the delay epoch (mean, 86.0%) than during the cue (mean, 92.9%) or response epochs (mean, 95.0%). During perception, animals maintained their eyes on the screen less during the response epoch (mean, 81.0%) than during the delay epoch (mean, 89.9%) or the cue epoch (mean, 92.6%). Unlike during WM, the percentage of eye position on-screen during perception cue and delay epochs showed no significant difference (Fig. 3a).
Figure 3.
Eye movement behavior. a, Left column, The percentage of eye data points that fall within the boundaries of the screen during the cue, delay, and response epochs shown for working memory sessions. Each data point represents a session. Right column, The percentage of eye data points that fall within the boundaries of the screen during the cue, delay, and response epochs shown for perception sessions. Each data point represents a session. b, Eye traces over trial time categorized into fixations (orange), saccades (green), and smooth pursuits (purple) for an example working memory trial and an example perception trial. c, All eye traces categorized into fixations (orange), saccades (green), and smooth pursuits (purple) in screen coordinates for an example working memory trial and an example perception trial. d, Percentage of eye movement events classified as fixations or saccades during the different task epochs for working memory and perception sessions. Error bars represent the SEM. Asterisks on the left represent significance between the cue and delay epochs, asterisks in the middle represent significance between the cue and response epochs, and asterisks on the right represent significance between the delay and response epochs. Asterisk color corresponds to eye movement type. e, Main sequence for the working memory and perception task during different task epochs. Asterisks represent significance at each amplitude bin. Blue asterisks represent significance between the cue and delay epochs. Green asterisks represent significance between the cue and response epochs. Pink asterisks represent significance between the delay and response epochs. f, Main sequence for the working memory and perception tasks for saccades landing on and off of target location. Asterisks represent significance between on-target and off-target saccades at each amplitude bin. Error bars indicate the SEM. *p < 0.01, **p < 0.001, ***p < 0.0001.
We categorized eye movement into fixations, saccades, and smooth pursuits (Corrigan et al., 2017). Example traces displaying the categorization can be found in Figure 3, b and c. We compared the proportion of eye movements that fall within each category between task epochs during perception and WM. The proportion of eye movements classified as fixations significantly differed between trial epochs and between WM and perception tasks (Fig. 3d). During WM, animals made the most fixations during the cue epoch with fewer made during the delay and response epochs (cue: mean, 46.2%; delay: mean, 44.3%; response: mean, 33.3%). During perception, animals also fixated the least during the response epoch with more fixations made during the cue and delay epochs (cue: mean, 47.3%; delay: mean, 46.2%; response: mean, 35.9%).
The proportion of eye movements classified as saccades significantly differed between trial epochs and between WM and perception tasks. During WM, the proportion of saccades was highest in the response epoch with fewer occurring in the cue epoch and fewest during the delay epoch (cue: mean, 37.2%; delay: mean, 36.9%; response: mean, 41.8%; Fig. 3d, left). During perception, animals also made the highest proportion of saccades during the response epoch (mean, 36.7%) with fewer occurring during the cue epoch (mean, 33.5%) and the delay epoch (mean, 27.6%; Fig. 3d, right). Between WM and perception response epochs, there was a larger proportion of smooth pursuits during perception (mean, 30.4%) than during WM (mean, 28.1%). This difference may be linked to the presence of the target during perception but not during WM.
During both the delay and response epochs of the WM task, there was a larger proportion of eye movements classified as saccades than during the corresponding epochs of the perception task (Fig. 3d). There was also a larger percentage of eye position data points on screen during the response epoch of the WM task than during the corresponding epoch of the perception task (Fig. 3a).
To further explore saccadic activity, we calculated the main sequence, reflecting the relationship between saccade peak velocity and amplitude (see Materials and Methods; Fig. 3e). During perception, saccade peak velocity as a function of amplitude was significantly higher in the response epoch than in the cue and delay epochs for all amplitude bins (t test: p < 0.05; effect size, >0.2). The increased velocity of saccades during the perception response epoch may reflect the use of saccades to track the target during navigation, which would not occur during WM, when the target was no longer present (Fig. 3e). It may also signify an increase in arousal during navigation, which would be more demanding than the other task epochs. We also compared the main sequences between saccades that landed on target and off target during the delay epoch (Fig. 3f). We found that on-target saccades had larger peak velocities; however, these differences were more pronounced and were only significant during the perception task (WM: t test, p > 0.05; perception: t test, 6 bins, p < 0.05). Therefore, saccades that land on target versus those that land off target show a greater difference when the target was physically present compared with when it was removed during the WM delay.
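The main sequence comparison can be sketched as a binning operation: saccades are grouped by amplitude, and peak velocity is summarized within each bin so that epochs or saccade types can be compared bin by bin. The sketch below is illustrative only. The units (degrees and degrees per second), the bin edges, and the function name `main_sequence` are assumptions and are not taken from Materials and Methods.

```python
import numpy as np

def main_sequence(amplitude_deg, peak_velocity, bin_edges):
    """Mean and SEM of saccade peak velocity within each amplitude bin."""
    amplitude_deg = np.asarray(amplitude_deg, dtype=float)
    peak_velocity = np.asarray(peak_velocity, dtype=float)
    # np.digitize returns 1..len(bin_edges)-1 for amplitudes that lie in
    # [first edge, last edge); amplitudes outside that range are ignored.
    bin_index = np.digitize(amplitude_deg, bin_edges)
    means, sems = [], []
    for b in range(1, len(bin_edges)):
        v = peak_velocity[bin_index == b]
        means.append(v.mean() if v.size else np.nan)
        sems.append(v.std(ddof=1) / np.sqrt(v.size) if v.size > 1 else np.nan)
    return np.array(means), np.array(sems)

# Five hypothetical saccades, binned into 0-5, 5-10, and 10-15 degrees.
amp = [1, 2, 6, 7, 12]
vel = [100, 120, 300, 320, 450]
means, sems = main_sequence(amp, vel, bin_edges=[0, 5, 10, 15])
print(means)  # mean peak velocity per bin
```

Running the function once per epoch (or once for on-target and once for off-target saccades) yields curves that can then be compared at each amplitude bin with the statistical test of choice.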
These behavioral results indicate a difference in animal behavior during different task epochs and between WM and perception. In particular, less time spent looking onscreen during the delay epoch of the WM task, combined with fewer fixations and no significant differences in saccade peak velocity to targets compared with off-target locations, suggests that removal of the visual target influenced the pattern of saccades. It is possible that animals searched for landmarks that could serve as references for the target location or that they relied on an allocentric mnemonic representation of the target location. Decreased fixation and an increased number of saccades during the response epoch, together with an increase in saccade peak velocity, may suggest a similar strategy and may also reflect the dynamic nature of the response epoch, in which the visual environment changes as the animal changes position in the arena.
Neural spatial selectivity
We recorded the activity of 3950 units between the dorsally (1992 units) and ventrally (1958 units) placed multielectrode arrays. Many units in this sample displayed delay activity. Figure 4, a and b, shows activity patterns of two neurons that selectively increased their activity during the delay epoch for preferred target locations. Tuning for target location was identified in the population for cue and delay epochs (cue: ventral: mean, 22%; dorsal: mean, 16%; delay: ventral: mean, 14%; dorsal: mean, 12%), and many neurons were tuned during both the cue and delay epochs (ventral: mean, 37%; dorsal: mean, 48%; Fig. 4c,d). The majority of single neurons displayed trial-to-trial variability close to that expected from a Poisson process (Fano factor close to 1; Fig. 4e). The population trial-to-trial variability, considered as the average spike count in single trials when pooling across simultaneously recorded neurons, exhibited considerably lower variability than anticipated from a Poisson process (Fig. 4f) and lower variability than single neurons (Fig. 4g; t test, p < 0.0001). This indicates that although individual neurons displayed variable trial-to-trial firing rates, the number of spikes fired by the neural population across trials remained more consistent.
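The Fano factor comparison can be illustrated with simulated data. The sketch below is not the analysis code of this study: it uses hypothetical, independent Poisson neurons, and it follows the wording above by defining the population count as the average across simultaneously recorded neurons on each trial. The function name `fano_factor` and all parameter values are assumptions.

```python
import numpy as np

def fano_factor(counts):
    """Variance-to-mean ratio of spike counts across trials."""
    counts = np.asarray(counts, dtype=float)
    return counts.var(ddof=1) / counts.mean()

rng = np.random.default_rng(0)
# Simulated spike counts: 200 trials x 50 independent Poisson neurons.
counts = rng.poisson(lam=8.0, size=(200, 50))

# Single-neuron Fano factor: one value per neuron, computed across trials.
single_ff = np.array([fano_factor(counts[:, i]) for i in range(counts.shape[1])])

# Population Fano factor: average the counts across neurons on each trial,
# then compute the variance-to-mean ratio of that per-trial average.
population_ff = fano_factor(counts.mean(axis=1))

print(np.median(single_ff), population_ff)
```

In this simulation the single-neuron values cluster around 1, as expected for a Poisson process, whereas the population value is far smaller. Note that for independent neurons, averaging across N units by itself lowers the ratio by roughly a factor of N, so the size of the reduction depends on how the population count is defined and on correlations between neurons.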
Figure 4.
Neural coding for remembered locations. a, Neural activity for an example neuron recorded by the dorsally located electrode array. The spike density function in the left panel displays the activity of the neuron over trial time for the nine target locations. The inset displays a normalized firing rate for all target locations. The right panel displays a raster for the same neuron in which trials are sorted from the most preferred to the least preferred target location. The delay epoch is indicated by the salmon-colored column. b, Neural activity for an example neuron recorded by the ventrally located electrode array. c, The proportion of tuned neurons during the cue (blue), delay (pink/salmon), and both the cue and delay epochs (orange) recorded in the dorsally located array. d, The proportion of tuned neurons recorded in the ventrally located array. e, Trial-to-trial Fano factor for all delay neurons during the delay epoch across sessions. Inset displays the spike count variance by spike count mean used to calculate the Fano factor for all neurons. Sessions are represented by different dot colors. f, Fano factor for all delay neuron population activity over the delay epoch. Inset displays the population averaged spike count variance and spike count mean for each target condition and session (indicated by different colors). g, Comparison between the single-neuron and population Fano factors. The red lines represent median values, and the bottom and top edges of the box indicate the 25th and 75th percentiles. The whiskers extend to nonoutlier data points (within 1.5 SDs). Gray dots represent outliers. ***p < 0.0001.
At the population level, neurons with the same spatial tuning exhibited increased delay activity during single trials when their preferred target location was presented. Populations of neurons with different spatial tuning from the target presented displayed a lower magnitude of activity (Fig. 5). To determine how much information about the remembered target locations was contained in the population of neurons, we used a linear classifier (SVM) to decode the target location from neuronal firing rates within 500 ms time bins. We used a best ensemble method in which the most informative unit was found and was paired with all other neurons in the population until the best pair was found. The best pair was grouped with all neurons in the population until the best trio was found. This process was continued until the ensemble contained 20 neurons (Leavitt et al., 2017a). To achieve a sample size required for training and testing the classifier for all sessions, we combined trials from all targets located on the right, left, and center of the environment so decoding was performed using three classes. We were able to decode the target location in single trials from the neural activity during delay using linear classifiers. An example session in Figure 6a shows decoding accuracy for different ensemble sizes during the delay epoch divided into four 500 ms time segments. Decoding accuracy over time was above chance (33.33%) for all time windows, ranging from 68% during the last 500 ms of the included response epoch to 87% toward the end of the cue epoch (Fig. 6b). The decoding accuracy was consistent over the delay epoch (Fig. 6b), indicating robust information content for remembered locations during our naturalistic task.
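The best ensemble procedure is a greedy forward selection: at each step, every remaining unit is tried alongside the current ensemble, and the unit that yields the highest decoding accuracy is kept. The sketch below illustrates the logic on simulated data. It is not the implementation of Leavitt et al. (2017a): to depend on NumPy only, a cross-validated nearest-centroid classifier stands in for the linear SVM, and the function names, fold scheme, and simulated units are all assumptions.

```python
import numpy as np

def cv_accuracy(X, y, n_folds=5):
    """Cross-validated accuracy of a nearest-centroid classifier.

    Stand-in for the linear SVM described in the text (an assumption made
    to keep the sketch dependent on NumPy only).
    """
    classes = np.unique(y)
    # Interleaved folds; assumes every class appears in every training set.
    fold = np.arange(len(y)) % n_folds
    correct = 0
    for k in range(n_folds):
        train, test = fold != k, fold == k
        centroids = np.stack([X[train & (y == c)].mean(axis=0) for c in classes])
        # Squared distance from every test trial to every class centroid.
        dist = ((X[test][:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        correct += (classes[dist.argmin(axis=1)] == y[test]).sum()
    return correct / len(y)

def best_ensemble(X, y, max_size=20):
    """Greedy forward selection: add the unit that most improves accuracy."""
    chosen, history = [], []
    remaining = list(range(X.shape[1]))
    while remaining and len(chosen) < max_size:
        scores = [cv_accuracy(X[:, chosen + [j]], y) for j in remaining]
        best = remaining[int(np.argmax(scores))]
        chosen.append(best)
        remaining.remove(best)
        history.append(max(scores))
    return chosen, history

rng = np.random.default_rng(1)
y = np.repeat([0, 1, 2], 30)       # three classes: left, center, right
X = rng.normal(size=(90, 6))       # six simulated units, pure noise so far
X[:, 2] += 6.0 * (y == 0)          # unit 2 separates left from the rest
X[:, 4] += 6.0 * (y == 2)          # unit 4 separates right from the rest
units, accuracy = best_ensemble(X, y, max_size=3)
print(units, accuracy)
```

In this toy example neither informative unit can separate all three classes alone, but the pair can, which is the situation in which an ensemble outperforms its best single member.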
Figure 5.
Single-trial population activity. Delay neuron population activity plotted for single-trial examples for each target location during an example session. Time represents the trial time from cue onset, and the delay epoch is indicated by the salmon-colored column. The target location for each trial is indicated by the arena inset (white circle). Blue lines represent the population activity of delay neurons that prefer the target presented in the trial, and gray lines represent the population activity of all other delay neurons tuned for other locations. The inset displays the average delay activity for both populations. Error bars indicate the SEM. n-Values represent neurons in the blue population.
Figure 6.
Decoding of remembered locations. a, Decoding accuracy for one example session during the delay epoch divided into four 500 ms temporal segments using neural ensembles of different sizes. The inset illustrates the grouping of targets into three classes. The dashed gray line represents chance decoding performance. Dots represent the decoding accuracy of individual neurons. b, Median decoding accuracy over trial time. The salmon-colored column represents the delay period, and the gray dashed line represents chance decoding. The yellow bars on top of the figure represent significance from chance performance for each time window (t test, p < 0.05). Error bars indicate the SEM. c, Decoding trial outcome. Dots represent data per session, and the gray dashed line indicates chance performance (50%). d, Decoding accuracy using correct or incorrect trials. Dots represent data from different sessions, and gray solid lines connect data from the same session. The gray dashed line represents chance decoding.
We tested whether the firing rate of the recorded neuronal population during the delay period provided enough information to distinguish between correct and incorrect trials. Using linear SVM classification, we were able to predict trial outcome above chance (50%) based on delay epoch population activity (mean, 61.4%; median, 63.2%; t test, p = 9.19e-07; Fig. 6c). To determine whether decoding performance of remembered target location relates to task performance, we used linear SVM classification to predict target location (left, center, right) using either all correct or all incorrect trials. We balanced the number of correct and incorrect trial samples to make a comparison between the two trial types. Decoding accuracy was significantly higher for correct trials compared with incorrect trials (correct trials: mean, 67.05%; incorrect trials: median, 41.86%; t test, p = 4.71e-04; Fig. 6d). This indicates that population activity during delay was more predictive of target location in correct trials than in incorrect trials.
Fixation on the target location
One potential issue in allowing for natural eye movements is that animals could maintain their gaze on the empty cued location during the delay or visually “rehearse” their movement plan. We explored this possibility by analyzing gaze behavior on the targets. We plotted all fixation points on the screen for one session for two example target locations (Fig. 7a). Fixation points span the horizontal extent of the screen, which constitutes the task-relevant area. Figure 7b shows heat maps of fixation locations averaged over all sessions for two example target locations during the delay epoch. Gaze was not limited to the location in which the target was presented. It was also directed to nontarget stimuli in the environment such as the tree, as would occur in naturalistic contexts.
Figure 7.
Fixations on screen and target locations. a, All fixations for an example session plotted on screen for the cue period (blue) and delay period (green) for two example target locations. b, Heat maps averaged over working memory sessions showing fixation locations on screen during the delay period for two example target locations. c, The percentage of total fixations that fall within the target location during the delay period for correct and incorrect trials. Dark gray lines represent median values, and each data point represents a session. d, Decoding accuracy for predicting the target location from the location of eye fixations on screen during the cue and delay epochs. Classifiers are trained on the first epoch listed in the x-axis label and tested on the second epoch. The dashed line represents chance decoding accuracy. e, Added variable plot for a linear regression model predicting firing rate during periods of eye fixation during delay epochs from eye position and saccade direction and amplitude for an example neuron. The solid black line represents the model fit, and the dashed lines represent 95% confidence bounds of the fit. f, Decoding accuracy for predicting target location from real population firing rates during fixation periods and for predicting target location from residual values after fitting the model exemplified in e to each neuron in a population. Dots represent data per session, dark gray lines represent mean values, and the dashed line represents chance decoding performance (11%). g, Outlined regions of the screen encompassing four target locations that are separable onscreen. h, Median decoding accuracy predicting eye position within the outlined regions shown in g using neural population data during fixation periods during the cue and delay epochs. The gray dashed line represents chance decoding accuracy. Error bars indicate the SEM. *p < 0.01, **p < 0.001, ***p < 0.0001.
To examine whether increased fixation on the cued target location was used as a behavioral strategy to improve performance, we calculated the percentage of fixations falling within the bounds of the location of the target. Overall, the percentage of fixations on the target location was very low during the delay epoch (median, 3%). There was no significant difference between correct and incorrect trials, suggesting that increased fixation on cued target locations during the delay epoch may not be an effective strategy in correctly performed trials (correct: median, 3.5%; incorrect: median, 2.6%; Wilcoxon rank-sum, p > 0.05; Fig. 7c).
To determine how predictive fixation location was of the target location, we divided the screen into 16 cells and calculated the number of fixation points that fell within each cell during the cue and delay epochs. We trained an SVM classifier with a linear kernel to predict which of the nine target locations was presented based on where on screen the animal was fixating. The classifier performed above chance (11.11%) during both epochs but performed significantly better during the cue epoch (median decoding accuracy, 31.4%) compared with the delay epoch (median decoding accuracy, 20.8%), suggesting reduced patterns of target-specific fixation during the delay (Fig. 7d). To determine whether eye fixation was similar between cue and delay epochs of the WM task, we trained classifiers using eye fixation positions during the cue epoch and tested the classifiers using eye fixation positions from the delay epoch. We similarly trained classifiers on delay data and tested them on cue data. Decoding accuracy was close to chance level (11.11%) when classifiers were cross-trained between epochs of the WM task, and it was significantly lower than training and testing on congruent epochs (Fig. 7d). This shows that the position of fixations (i.e., gaze position) was different between the cue and delay epochs during the WM task.
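The fixation features used for this analysis can be sketched as a per-trial count of fixations in each screen cell. The sketch below is illustrative and assumes details that are not stated in the text: a 4 × 4 arrangement of the 16 cells, a hypothetical 1920 × 1080 pixel screen, and the function name `fixation_grid_counts`.

```python
import numpy as np

def fixation_grid_counts(fix_xy, width=1920, height=1080, n_cols=4, n_rows=4):
    """Count fixations in each cell of an n_rows x n_cols grid over the screen.

    fix_xy: (n_fixations, 2) array of x, y positions in pixels (assumed units).
    Fixations outside the screen are ignored.
    """
    fix_xy = np.asarray(fix_xy, dtype=float).reshape(-1, 2)
    counts, _, _ = np.histogram2d(
        fix_xy[:, 1], fix_xy[:, 0],          # rows from y, columns from x
        bins=[n_rows, n_cols],
        range=[[0, height], [0, width]],
    )
    return counts.ravel()  # feature vector of length n_rows * n_cols

# Three hypothetical fixations: two in the top-left cell, one bottom-right.
features = fixation_grid_counts([[100, 100], [110, 120], [1900, 1000]])
print(features.reshape(4, 4))
```

Stacking one such vector per trial, separately for the cue and delay epochs, gives the matrices on which a classifier can be trained on one epoch and tested on the same or the other epoch.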
Previous studies have shown that LPFC neurons encode information related to eye movements and gaze position (Bullock et al., 2017). To corroborate these findings, we examined whether neuronal activity in our sample of LPFC neurons contained information about the gaze position and planned saccade direction of the animals. We designed multiple linear regression models to predict firing rate for each neuron during delay epoch eye fixation periods from saccade direction, amplitude, and fixation position (Fig. 7e). After fitting the model to a neuron, we obtained the residual values. These values represent the residual firing rates that are not accounted for by the model. We repeated the procedure for neurons within the same population (i.e., same recording session). We then trained linear SVM classifiers to predict target location using either the firing rate residual values or the raw firing rates from the same population of neurons during the same fixation periods. Decoding accuracy was similar using either type of data (residual: mean, 21.39%; real firing rate: mean, 24.95%; t test, p = 0.26; Fig. 7f), and both were significantly higher than chance (11.11%; real firing rate: t test, p = 8.6e-06; residual: t test, p = 1.7e-04), indicating that saccade amplitude, direction, and eye position information were not the main contributors to the decoding of the remembered target location.
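The residual step can be sketched with ordinary least squares: the eye-related predictors are regressed out of the firing rate of each neuron, and what remains is passed to the decoder. The sketch below uses simulated data and is not the model fitted in this study. The column layout of the predictors, the simulated coefficients, and the function name `firing_rate_residuals` are assumptions.

```python
import numpy as np

def firing_rate_residuals(firing_rate, eye_predictors):
    """Residual firing rate after regressing out eye-movement predictors.

    firing_rate: (n_fixations,) rates for one neuron.
    eye_predictors: (n_fixations, n_predictors), e.g., fixation x, fixation y,
    saccade direction, and saccade amplitude (column layout is an assumption).
    """
    firing_rate = np.asarray(firing_rate, dtype=float)
    # Add an intercept column, then solve the least-squares problem.
    design = np.column_stack([np.ones(len(firing_rate)), eye_predictors])
    beta, *_ = np.linalg.lstsq(design, firing_rate, rcond=None)
    return firing_rate - design @ beta

rng = np.random.default_rng(0)
n = 300
eye = rng.normal(size=(n, 4))                     # simulated eye predictors
target_signal = rng.choice([0.0, 5.0], size=n)    # component unrelated to gaze
rate = 10 + eye @ np.array([2.0, -1.0, 0.5, 0.0]) + target_signal
resid = firing_rate_residuals(rate, eye)

# The residuals keep the target-related component of the firing rate.
print(np.corrcoef(resid, target_signal)[0, 1])
```

If target information survives in the residuals, as in this simulation, a decoder trained on them performs similarly to one trained on the raw rates, which is the pattern reported in Figure 7f.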
Decoding of gaze position from neural activity in LPFC neurons
To further examine neural activity related to gaze, particularly fixation on task-relevant stimuli, we next examined neural activity during fixation on target locations. We selected four targets shown in Figure 7g that were nonoverlapping on the screen and measured neuronal firing rates while animals fixated on each one of the target locations. We used SVM classification and found that we could decode the gaze position from neural activity. The decoding accuracy was significantly higher during the cue epoch (median decoding accuracy, 65.4%) of the WM task compared with the delay epoch (median decoding accuracy, 35.0%; Fig. 7h), suggesting that more information was available to the neuronal population when animals fixated on a target that was present on screen compared with when the target was no longer present. Indeed, the decoding accuracy during the delay epoch was close to chance (25%), suggesting that firing rates during fixation in the delay period carried little information about the remembered target location. One possible explanation for this finding is that decoding during the cue epoch may have been dominated by visual responses to the target. During the delay epoch, when no visual cue was present, eye position contributed poorly to decoding. These findings suggest that eye position signals do not necessarily contribute to the ability of many LPFC neurons to encode WM representations in complex and dynamic environments.
Separation between coding for working memory and perception
Unlike the WM task, during the perception task, the target was accessible throughout the trial. Thus, it is possible that some neurons responded to the target only when it was present in the perception task (perceptual neurons) and some neurons were active only during the delay period of the WM task (mnemonic neurons; Mendoza-Halliday and Martinez-Trujillo, 2017; Roussy et al., 2021b). Therefore, we hypothesized that neural population activity profiles would differ between the perception and WM tasks. To test this hypothesis, we collected neuronal data from 13 sessions in which animals performed both the WM and perception tasks. The same population of simultaneously active neurons was recorded during both tasks during these sessions. This allowed us to use SVM classification to cross-train neural data between WM and perception to predict the nine target locations. We specifically tested the prediction that SVM classifiers trained on one task would not generalize to the other task.
Decoding performance was similar between WM and perception when classifiers were trained and tested on congruent tasks (i.e., trained on WM and tested on WM; Fig. 8a,b). The same population of neurons can maintain similar amounts of information about the target location whether targets remain on screen (perception) or disappear (WM; perception: median decoding accuracy, 71.5%; WM: median decoding accuracy, 68.1%). Although the same neurons were recorded during each task, decoding performance dropped to close to chance level (11.11%) when the classifiers were trained on perception trials and tested on WM trials or when the classifiers were trained on WM trials and tested on perception trials (Fig. 8a,b). In comparison, classifiers trained on one-half of the WM trials and tested on the other half resulted in performance well above chance levels (median decoding accuracy, 51.3%; Fig. 8c). The latter indicates that our results were not an artifact of using different sets of trials for testing and training the classifiers, but were an effect of task type (perception vs WM).
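The logic of the cross-task comparison can be illustrated with a simulation in which the same units carry location information in both tasks but with unrelated tuning patterns. The sketch below is a toy example, not a reanalysis: the tuning patterns, noise level, and unit count are hypothetical, and a nearest-centroid classifier stands in for the linear SVM so that the code depends on NumPy only.

```python
import numpy as np

def fit_centroids(X, y):
    """Mean population vector for each class (nearest-centroid 'training')."""
    classes = np.unique(y)
    return classes, np.stack([X[y == c].mean(axis=0) for c in classes])

def predict(model, X):
    """Assign each trial to the class with the closest centroid."""
    classes, centroids = model
    dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dist.argmin(axis=1)]

def accuracy(train_X, train_y, test_X, test_y):
    return float((predict(fit_centroids(train_X, train_y), test_X) == test_y).mean())

rng = np.random.default_rng(0)
n_loc, n_units, n_rep = 9, 40, 20
y = np.repeat(np.arange(n_loc), n_rep)
# Hypothetical tuning: each task has its own location-by-unit pattern.
tuning_perc = rng.normal(scale=2.0, size=(n_loc, n_units))
tuning_wm = rng.normal(scale=2.0, size=(n_loc, n_units))
X_perc = tuning_perc[y] + rng.normal(size=(len(y), n_units))
X_wm = tuning_wm[y] + rng.normal(size=(len(y), n_units))

# Within-task control: train on half of the WM trials, test on the other half.
half = np.arange(len(y)) % 2 == 0
within = accuracy(X_wm[half], y[half], X_wm[~half], y[~half])
# Cross-task test: train on perception trials, test on WM trials.
across = accuracy(X_perc, y, X_wm, y)
print(within, across)
```

Under these assumptions the split-half control stays high while cross-task accuracy falls toward chance, which is the qualitative pattern in Figure 8. The control matters because it rules out the use of different trials for training and testing as the cause of the drop.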
Figure 8.
Neural coding for working memory and perception. a, Decoding accuracy for predicting target location for the perception and working memory tasks during the cue epoch. Classifiers are trained on the task that appears first in the x-axis label and are tested on the task that appears second. The asterisk color represents significant differences with the condition of that color. Dark gray lines represent median values. The dashed gray line represents chance decoding. b, Decoding accuracy for predicting target location for the perception and working memory tasks during the delay epoch. c, Decoding accuracy for the working memory task during the delay epoch using all trials for classifier training and testing or training on half of the trials and testing on half of the trials. The red lines represent median values, and the bottom and top edges of the box indicate the 25th and 75th percentiles. The whiskers extend to nonoutlier data points (within 1.5 SDs). d, Cross-epoch median decoding accuracy for the working memory task. e, Decoding accuracy when classifiers are cross-trained between 500 ms time windows. *p < 0.01, **p < 0.001, ***p < 0.0001.
We also conducted cross-epoch decoding for WM in which we trained and tested on combinations of cue, delay, and response epochs. Decoding performance was greatest when the classifiers were trained and tested with data from the same epoch and lowest when they were trained and tested across the response and cue epochs (train on cue – test on response: median decoding accuracy, 11.0%; train on response – test on cue: median decoding accuracy, 12.3%) and when they were trained on the delay epoch and tested on the cue epoch (median decoding accuracy, 17.0%; Fig. 8d). We also conducted cross-temporal decoding in which we trained and tested classifiers between congruent and incongruent time windows of 500 ms. These results indicate higher decoding accuracy when classifiers were trained and tested between temporally near time windows within the same trial epoch (Fig. 8e). These data suggest that different neural activity profiles support LPFC neural codes for WM and perception.
Discussion
By using complex virtual reality tasks, we were able to explore visuospatial WM and perception in naturalistic settings, incorporating natural eye movements and virtual navigation. We found that animals were able to accurately perform both tasks and identified distinct navigation strategies and eye movement behavior that occur during WM and perception. Whereas animals used a visually guided strategy in the perception task, they necessarily switched their strategy during WM. We also demonstrate the suitability of naturalistic WM tasks for neuronal recording in the LPFC, particularly those that allow for natural eye movements. We found that neurons in the primate LPFC are strongly tuned for target location during cue and delay epochs and that the amount of information during delay about target location remains consistent within the population of neurons on the single-trial level. We also found that neuronal activity during fixation on target location is less predictive of target location during the delay epoch compared with the cue epoch indicating that eye position information does not necessarily contribute to the decoding of target location during WM tasks. Information about target location encoded by the same neuronal population during the perception delay was not predictive of target location during the memory delay, indicating different patterns of population activity during perception and WM. Different population dynamics also exist between target encoding and memory epochs in the WM task.
Influence of naturalistic task elements
One unique element of our task is the complex virtual environment in which it takes place, which contains task-irrelevant stimuli. Based on the robust WM signals we describe, the LPFC may allow for the encoding of representations that are uniquely dissociated from distracting stimuli. Indeed, previous studies demonstrate that the LPFC differs from areas such as the posterior parietal cortex where WM representations are perturbed by visual distractors (Suzuki and Gottlieb, 2013; Jacob and Nieder, 2014). Studies conducted decades earlier by Malmo (1942) and Orbach and Fischer (1959) also reported the importance of the PFC in maintaining WM representations in the presence of irrelevant incoming visual signals. However, we must be cautious when defining task-irrelevant stimuli, particularly in our virtual WM task where some of the elements of the environment (e.g., tree) may potentially be used as landmarks to estimate the target location during navigation.
Importantly, despite unconstrained eye movements, animals perform well on our WM task and the neuronal population maintains target selectivity and information about remembered location throughout the delay epoch. These findings may seem to contradict some previous literature showing that forced saccadic eye movements during the memory delay reduce WM performance in human subjects (Postle et al., 2006), and they differentiate the LPFC from regions such as the frontal eye fields, where shifts in gaze disrupt WM signals (Balan and Ferrera, 2003). However, a distinction between our task and previous research is the production of forced versus naturally occurring saccades. Because the latter may be spontaneously and voluntarily triggered by the subjects, they may not interfere with performance in the same manner as task-dependent saccades. Indeed, before the widespread use of the ODR and other oculomotor-dependent tasks, simple delayed response tasks were used that displayed two targets and relied on an arm motor response using the Wisconsin General Test Apparatus or button pressing. Although eye movements were not controlled in these classic experiments, studies reported neurons in the PFC that displayed clear delay activity and spatial selectivity (Fuster and Alexander, 1971; Kojima and Goldman-Rakic, 1982).
Natural eye behavior and visuospatial working memory
Although our experimental paradigms aimed to approach natural behavior, potential concerns may arise surrounding the decision to not control eye position. For example, one may argue that animals would simply visually rehearse the target location by maintaining gaze fixation on the target of interest. We found substantial evidence against this behavioral strategy. Eye behavior differed between periods when the target was available compared with times when the target was unavailable like during the WM delay and response epochs. During WM delay, animals spent significantly less time looking onscreen, suggesting eye movement behavior that is less focused on specific elements in the environment such as target location. The number of fixations to target locations during WM delay only comprised 3% of fixations, and there was no significant difference between the number of fixations on target between correct and incorrect trials, suggesting that fixation on target location during delay was not used as a successful behavioral strategy. From these results, one may infer that the LPFC maintains an allocentric representation of the remembered location that is independent from gaze or fixation position. This issue, however, needs further exploration.
Using linear classifiers, we also found that eye position on-screen was significantly more predictive of target location during the cue epoch compared with the delay epoch. Classifiers that were trained on eye position data from the cue epoch and tested on eye position data from the delay epoch resulted in decoding accuracy below chance level, suggesting different eye movement patterns between target encoding and memory maintenance. Moreover, in a recent study, we demonstrated that eye behavior remains unaffected by a pharmacological manipulation that severely reduces WM coding and performance. In that study, despite significant changes to WM processing, gaze was equally predictive of target location before and after systemic ketamine administration (Roussy et al., 2021a).
Saccade characteristics are influenced by external motivations like task reward (Takikawa et al., 2002). Increases in peak velocities have been observed for task-related saccades—when fixating on a target is needed for information processing—compared with saccades without a task-related motivation (Bieg et al., 2012). This increased saccadic speed may be used to gather task-related information quicker. Saccades to target locations may be considered task-relevant compared with nontarget saccades, thus supporting correct task completion and reward. We found that saccades that land on target versus those that land off target show a greater difference in velocity when the target is physically present during the cue epoch or perception task compared with when it is removed during the WM delay. In fact, there were no significant differences in saccade speed to targets compared with nontargets during the WM delay. This may suggest that saccades to target locations during memory delay were influenced less by task-relevant motivation and information seeking than those made during the cue-encoding epoch. Alternatively, it may reflect the fact that visually guided saccades to a target show higher peak velocities than to an “empty” location in space (Edelman et al., 2006).
Another potential issue is contamination of WM signals by signals related to eye movement. We explored the amount of information contained by neural activity about target location during fixation on target locations during the cue and delay epochs. We found significantly lower decoding accuracy during the delay epoch compared with the cue epoch, suggesting that more information was available to the neuronal population when animals fixated on a target that was present compared with when the target was absent. Indeed, the decoding accuracy during the delay epoch was close to chance (25%), suggesting that animals did not receive substantial spatial information about the target location during periods of target location fixation during delay. These results may reflect the activation of visual neurons by the presence of a visual target during the cue epoch.
Although saccadic responses are seen in the PFC, the task and type of motor response required by the task have been shown to alter neuronal responses (Quintana et al., 1988; Yajeya et al., 1988; Sakagami and Niki, 1994; Johnston and Everling, 2006; Warden and Miller, 2010). Neuronal responses to eye movements like saccades in the PFC are often identified during trials of tasks that are contingent on an oculomotor response. Neuronal responses to saccades are, however, notably absent when saccades are spontaneous and task independent such as during intertrial intervals (Funahashi, 2014). Indeed, in a recent study, using the same virtual task, we analyzed the proportion of neurons that were tuned for saccade position in retinocentric and spatiocentric reference frames. Only 9% of neurons were tuned for retinocentric saccades and 11% for spatiocentric. More importantly, only 2% and 3% of neurons, respectively, were also tuned for remembered target location, providing further evidence for separate populations of neurons that code for eye position and remembered locations during WM tasks (Roussy et al., 2021a).
Perception and working memory in areas 8a and 9/46
The separation of perception and WM has been recognized since 1883, when neurologic conditions were described in which patients exclusively lost either the ability to perceive objects or the ability to picture them in mind (Bernard, 1883; Behrmann et al., 1994). Early lesion studies also point to a separation of these functions in the LPFC: large lesions consistently produced WM deficits while sparing perceptual discrimination functions (for review, see Roussy et al., 2021b). Moreover, pharmacological manipulations using muscimol and ketamine produce WM deficits without altering perceptual performance (Sawaguchi and Iba, 2001; Roussy et al., 2021a).
Here, we found that population codes for perception and WM representations of target location are not interchangeable. This finding is supported by previous work by Mendoza-Halliday and Martinez-Trujillo (2017), who found separate populations of LPFC neurons that code for either perception or WM for visual motion direction. After combining neurons into a pseudopopulation, they further demonstrated that a decoder using population activity patterns could discriminate whether neuronal representations were perceptual or mnemonic, suggesting that different patterns of neuronal activity correspond to each function. That study, however, used pseudopopulations of neurons rather than simultaneously recorded neurons to examine WM for motion direction and did not use naturalistic virtual tasks in which gaze is unconstrained. Our results validate and extend the results of that study to naturalistic visuospatial WM.
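The logic of testing whether two population codes are interchangeable can be sketched as a cross-task generalization analysis: a decoder is trained on trials from one task and tested on held-out trials from the same task and on trials from the other task. The example below is an illustration under stated assumptions, not the analysis pipeline of this study. The data are synthetic, the ensemble size, number of locations, and signal strength are hypothetical, and a nearest-centroid classifier stands in for whatever classifier was actually used. Because the simulation assigns WM and perception tuning to different neurons by construction, it demonstrates only how the test behaves when codes differ, not an empirical result.

```python
import random

N_LOCATIONS = 4  # hypothetical number of target locations
N_NEURONS = 8    # hypothetical ensemble size
SIGNAL = 5.0     # hypothetical rate increase at a neuron's preferred location


def simulate_trials(rng, n_per_location, offset):
    """Return (responses, labels); neuron `offset + loc` prefers location `loc`."""
    responses, labels = [], []
    for loc in range(N_LOCATIONS):
        for _ in range(n_per_location):
            trial = [rng.gauss(0.0, 1.0) for _ in range(N_NEURONS)]
            trial[offset + loc] += SIGNAL
            responses.append(trial)
            labels.append(loc)
    return responses, labels


def fit_centroids(responses, labels):
    """Compute the mean population vector for each location."""
    centroids = []
    for loc in range(N_LOCATIONS):
        rows = [r for r, lab in zip(responses, labels) if lab == loc]
        centroids.append([sum(col) / len(rows) for col in zip(*rows)])
    return centroids


def accuracy(centroids, responses, labels):
    """Fraction of trials whose nearest centroid matches the true location."""
    correct = 0
    for trial, label in zip(responses, labels):
        dists = [sum((x - c) ** 2 for x, c in zip(trial, cen)) for cen in centroids]
        correct += dists.index(min(dists)) == label
    return correct / len(labels)


rng = random.Random(0)
# Assumption: WM tuning lives in neurons 0-3, perception tuning in neurons 4-7.
wm_train = simulate_trials(rng, 50, offset=0)
wm_test = simulate_trials(rng, 50, offset=0)
perception_test = simulate_trials(rng, 50, offset=4)

centroids = fit_centroids(*wm_train)
within_task = accuracy(centroids, *wm_test)
cross_task = accuracy(centroids, *perception_test)

print(f"train WM, test WM:         {within_task:.2f}")
print(f"train WM, test perception: {cross_task:.2f}")
```

In this sketch, high within-task accuracy combined with cross-task accuracy near the chance level of 1/`N_LOCATIONS` is the signature of codes that are not interchangeable. If the two tasks shared a code, the two accuracies would be similar.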
How is it possible for the LPFC to represent perceived visual features without confounding WM representations? One possibility is that patterns of activity remain separate through the activation of perceptual, mnemonic, and mixed neurons. Activity patterns of perception and WM cells may help the brain monitor and discriminate between internal (WM) and external (perception) representations. Abnormal patterns of activation may disrupt internally and externally driven representations and trigger hallucinations, for example, when perceptual neurons are activated without visual input. Interestingly, ketamine administration similarly disrupts patterns of activity during WM through disinhibition of neuronal activity for nonpreferred locations, causing severe WM deficits (Roussy et al., 2021a).
Conclusion
Our findings provide evidence of robust perceptual and WM representations in the macaque monkey LPFC during naturalistic tasks in virtual environments in which eye movements are unconstrained and the visual scene contains complex stimuli. We find minimal impact of natural eye movement on WM performance or neuronal coding for WM. Finally, we provide evidence for different neural codes for perceptual and mnemonic representations in the LPFC.
Footnotes
This work was supported by a Canadian Institutes of Health Research Project Grant; the Natural Sciences and Engineering Research Council of Canada (NSERC); an Ontario Graduate Scholarship; the Chrysalis Foundation (London, Ontario); and NEURONEX (NSF-CIHR-DFG Brain Initiative grant). We thank registered veterinary technicians Kim Thomaes and Rhonda Kersten from Western University for assistance in surgery and animal care; Guillaume Doucet from the University of Ottawa for technical assistance related to the Unreal Development Kit; Kevin Barker from Neuronitek for engineering equipment for our experiments; and Jonathan C. Lau from the Division of Neurosurgery, University Hospital, for providing advice regarding surgery and surgical planning.
L.P. has received salary support from the Tanna Schulich Endowment Chair for Neuroscience and Mental Health. The authors declare no other competing financial interests.
References
- Baddeley AD (1986) Working memory. Oxford: Clarendon Press.
- Balan PF, Ferrera VP (2003) Effects of gaze shifts on maintenance of spatial memory in macaque frontal eye field. J Neurosci 23:5446–5454. doi:10.1523/JNEUROSCI.23-13-05446.2003
- Behrmann M, Moscovitch M, Winocur G (1994) Intact visual imagery and impaired visual perception in a patient with visual agnosia. J Exp Psychol Hum Percept Perform 20:1068–1087. doi:10.1037/0096-1523.20.5.1068
- Bendiksby MS, Platt ML (2006) Neural correlates of reward and attention in macaque area LIP. Neuropsychologia 44:2411–2420. doi:10.1016/j.neuropsychologia.2006.04.011
- Bernard DAF (1883) Clinique des maladies nerveuses: un cas de suppression brusque et isolée de la vision mentale des signes et des objets: formes et couleurs. Paris: Imprimerie Alcan-Lévy.
- Bieg HJ, Bresciani JP, Bülthoff HH, Chuang LL (2012) Looking for discriminating is different from looking for looking's sake. PLoS One 7:e45445. doi:10.1371/journal.pone.0045445
- Blonde JD, Roussy M, Luna R, Mahmoudian B, Gulli RA, Barker KC, Lau JC, Martinez-Trujillo JC (2018) Customizable cap implants for neurophysiological experimentation. J Neurosci Methods 304:103–117. doi:10.1016/j.jneumeth.2018.04.016
- Boulay CB, Pieper F, Leavitt M, Martinez-Trujillo J, Sachs AJ (2016) Single-trial decoding of intended eye movement goals from lateral prefrontal cortex neural ensembles. J Neurophysiol 115:486–499. doi:10.1152/jn.00788.2015
- Bullock KR, Pieper F, Sachs AJ, Martinez-Trujillo JC (2017) Visual and presaccadic activity in area 8Ar of the macaque monkey lateral prefrontal cortex. J Neurophysiol 118:15–28.
- Constantinidis C, Funahashi S, Lee D, Murray JD, Qi XL, Wang M, Arnsten AFT (2018) Persistent spiking activity underlies working memory. J Neurosci 38:7020–7028. doi:10.1523/JNEUROSCI.2486-17.2018
- Corrigan BW, Gulli RA, Doucet G, Martinez-Trujillo JC (2017) Characterizing eye movement behaviors and kinematics of non-human primates during virtual navigation tasks. J Vis 17(12):15, 1–22. doi:10.1167/17.12.15
- Di Stasi LL, McCamy MB, Catena A, Macknik SL, Cañas JJ, Martinez-Conde S (2013) Microsaccade and drift dynamics reflect mental fatigue. Eur J Neurosci 38:2389–2398. doi:10.1111/ejn.12248
- Doucet G, Gulli RA, Martinez-Trujillo JC (2016) Cross-species 3D virtual reality toolbox for visual and cognitive experiments. J Neurosci Methods 266:84–93. doi:10.1016/j.jneumeth.2016.03.009
- Edelman JA, Valenzuela N, Barton JJ (2006) Antisaccade velocity, but not latency, results from a lack of saccade visual guidance. Vision Res 46:1411–1421. doi:10.1016/j.visres.2005.09.013
- Fan R-E, Chang K-W, Hsieh C-J, Wang X-R, Lin C-J (2008) LIBLINEAR: a library for large linear classification. J Mach Learn Res 9:1871–1874.
- Funahashi S (2014) Saccade-related activity in the prefrontal cortex: its role in eye movement control and cognitive functions. Front Integr Neurosci 8:54. doi:10.3389/fnint.2014.00054
- Funahashi S, Bruce CJ, Goldman-Rakic PS (1989) Mnemonic coding of visual space in the monkey's dorsolateral prefrontal cortex. J Neurophysiol 61:331–349. doi:10.1152/jn.1989.61.2.331
- Fuster JM, Alexander GE (1971) Neuron activity related to short-term memory. Science 173:652–654. doi:10.1126/science.173.3997.652
- Goldman-Rakic PS (1995) Cellular basis of working memory. Neuron 14:477–485. doi:10.1016/0896-6273(95)90304-6
- Hasegawa R, Sawaguchi T, Kubota K, Fuster K (1998) Monkey prefrontal neuronal activity coding the forthcoming saccade in an oculomotor delayed matching-to-sample task. J Neurophysiol 79:322–333. doi:10.1152/jn.1998.79.1.322
- Jacob SN, Nieder A (2014) Complementary roles for primate frontal and parietal cortex in guarding working memory from distractor stimuli. Neuron 83:226–237. doi:10.1016/j.neuron.2014.05.009
- Johnston K, Everling S (2006) Neural activity in monkey prefrontal cortex is modulated by task context and behavioral instruction during delayed-match-to-sample and conditional prosaccade-antisaccade tasks. J Cogn Neurosci 18:749–765. doi:10.1162/jocn.2006.18.5.749
- Kojima S, Goldman-Rakic PS (1982) Delay-related activity of prefrontal neurons in rhesus monkeys performing delayed response. Brain Res 248:43–49. doi:10.1016/0006-8993(82)91145-3
- Leavitt ML, Pieper F, Sachs AJ, Martinez-Trujillo JC (2017a) Correlated variability modifies working memory fidelity in primate prefrontal neuronal ensembles. Proc Natl Acad Sci U S A 114:E2494–E2503. doi:10.1073/pnas.1619949114
- Leavitt ML, Mendoza-Halliday D, Martinez-Trujillo JC (2017b) Sustained activity encoding working memories: not fully distributed. Trends Neurosci 40:328–346. doi:10.1016/j.tins.2017.04.004
- Leavitt ML, Pieper F, Sachs AJ, Martinez-Trujillo JC (2018) A quadrantic bias in prefrontal representation of visual-mnemonic space. Cereb Cortex 28:2405–2421. doi:10.1093/cercor/bhx142
- Malmo RB (1942) Interference factors in delayed response in monkeys after removal of frontal lobes. J Neurophysiol 5:295–308. doi:10.1152/jn.1942.5.4.295
- Mendoza-Halliday D, Martinez-Trujillo JC (2017) Neuronal population coding of perceived and memorized visual features in the lateral prefrontal cortex. Nat Commun 8:15471. doi:10.1038/ncomms15471
- Miller EK, Erickson CA, Desimone R (1996) Neural mechanisms of visual working memory in prefrontal cortex of the macaque. J Neurosci 16:5154–5167. doi:10.1523/JNEUROSCI.16-16-05154.1996
- Orbach J, Fischer GJ (1959) Bilateral resections of frontal granular cortex: factors influencing delayed response and discrimination performance in monkeys. Arch Neurol 1:78–86. doi:10.1001/archneur.1959.03840010080010
- Petrides M (2005) Lateral prefrontal cortex: architectonic and functional organization. Philos Trans R Soc Lond B Biol Sci 360:781–795. doi:10.1098/rstb.2005.1631
- Postle BR, Idzikowski C, Della Sala S, Logie RH, Baddeley AD (2006) The selective disruption of spatial working memory by eye movements. Q J Exp Psychol (Hove) 59:100–120. doi:10.1080/17470210500151410
- Quintana J, Yajeya J, Fuster JM (1988) Prefrontal representation of stimulus attributes during delay tasks. I. Unit activity in cross-temporal integration of sensory and sensory-motor information. Brain Res 474:211–221. doi:10.1016/0006-8993(88)90436-2
- Roussy M, Luna R, Duong L, Corrigan B, Gulli RA, Nogueira R, Moreno-Bote R, Sachs AJ, Palaniyappan L, Martinez-Trujillo JC (2021a) Ketamine disrupts naturalistic coding of working memory in primate lateral prefrontal cortex networks. Mol Psychiatry 26:6688–6703. doi:10.1038/s41380-021-01082-5
- Roussy M, Mendoza-Halliday D, Martinez-Trujillo JC (2021b) Neural substrates of visual perception and working memory: two sides of the same coin or two different coins? Front Neural Circuits 15:764177. doi:10.3389/fncir.2021.764177
- Sakagami M, Niki H (1994) Encoding of behavioral significance of visual stimuli by primate prefrontal neurons: relation to relevant task conditions. Exp Brain Res 97:423–436. doi:10.1007/BF00241536
- Sawaguchi T, Iba M (2001) Prefrontal cortical representation of visuospatial working memory in monkeys examined by local inactivation with muscimol. J Neurophysiol 86:2041–2053. doi:10.1152/jn.2001.86.4.2041
- Suzuki M, Gottlieb J (2013) Distinct neural mechanisms of distractor suppression in the frontal and parietal lobe. Nat Neurosci 16:98–104. doi:10.1038/nn.3282
- Takikawa Y, Kawagoe R, Itoh H, Nakahara H, Hikosaka O (2002) Modulation of saccadic eye movements by predicted reward outcome. Exp Brain Res 142:284–291. doi:10.1007/s00221-001-0928-1
- Warden MR, Miller EK (2010) Task-dependent changes in short-term memory in the prefrontal cortex. J Neurosci 30:15801–15810. doi:10.1523/JNEUROSCI.1569-10.2010
- Yajeya J, Quintana J, Fuster JM (1988) Prefrontal representation of stimulus attributes during delay tasks. II. The role of behavioral significance. Brain Res 474:222–230. doi:10.1016/0006-8993(88)90437-4
Data Availability Statement
MATLAB code used to analyze the data is available from author M.R. Data supporting the findings of this study are available from the corresponding authors on reasonable request; requests will be fulfilled by M.R.