Skip to main content
Proceedings of the National Academy of Sciences of the United States of America logoLink to Proceedings of the National Academy of Sciences of the United States of America
. 2024 Jan 26;121(5):e2312898121. doi: 10.1073/pnas.2312898121

A disinhibitory circuit mechanism explains a general principle of peak performance during mid-level arousal

Lola Beerendonk a,b,1,2, Jorge F Mejías a,c,1,2, Stijn A Nuiten a,b,d, Jan Willem de Gee a,c, Johannes J Fahrenfort e,f, Simon van Gaal a,b,2
PMCID: PMC10835062  PMID: 38277436

Significance

When people are repeatedly performing the exact same task, their performance fluctuates over time. One of the causes of this behavioral variability is spontaneous fluctuations in arousal. According to the Yerkes–Dodson law, task performance is optimal at moderate levels of species’ arousal, with impaired performance at very low or high arousal levels. However, until now, the evidence supporting this law as a general mechanism in human decision-making is mixed, and a neural mechanism that may explain the inverted U-shaped arousal–performance relationship is lacking. We show that the Yerkes–Dodson law is a general law that holds for human observers across decision-making tasks and settings. Furthermore, we present a simple and neurobiologically plausible mechanistic model that can explain its existence.

Keywords: task performance, decision-making, spontaneous arousal, computational modeling

Abstract

Perceptual decision-making is highly dependent on the momentary arousal state of the brain, which fluctuates over time on a scale of hours, minutes, and even seconds. The textbook relationship between momentary arousal and task performance is captured by an inverted U-shape, as put forward in the Yerkes–Dodson law. This law suggests optimal performance at moderate levels of arousal and impaired performance at low or high arousal levels. However, despite its popularity, the evidence for this relationship in humans is mixed at best. Here, we use pupil-indexed arousal and performance data from various perceptual decision-making tasks to provide converging evidence for the inverted U-shaped relationship between spontaneous arousal fluctuations and performance across different decision types (discrimination, detection) and sensory modalities (visual, auditory). To further understand this relationship, we built a neurobiologically plausible mechanistic model and show that it is possible to reproduce our findings by incorporating two types of interneurons that are both modulated by an arousal signal. The model architecture produces two dynamical regimes under the influence of arousal: one regime in which performance increases with arousal and another regime in which performance decreases with arousal, together forming an inverted U-shaped arousal–performance relationship. We conclude that the inverted U-shaped arousal–performance relationship is a general and robust property of sensory processing. It might be brought about by the influence of arousal on two types of interneurons that together act as a disinhibitory pathway for the neural populations that encode the available sensory evidence used for the decision.


Repeated presentations of the same noisy sensory input often lead to varying perceptual decisions (1, 2). One of the factors that contribute to this behavioral variability is the arousal state of the brain, which fluctuates spontaneously over time, causing variability in neuronal responsiveness to sensory stimulation (also often referred to as brain state) (36). The main drivers of these arousal fluctuations are the catecholaminergic and cholinergic neurotransmitter systems originating from the locus coeruleus and the basal forebrain, respectively. These neuromodulators have widespread effects throughout the brain, from controlling global cortical dynamics (e.g., low frequency power ref. 7) to tuning the gain of sensory processing (810). Pupil dilation reflects the activity of neuromodulatory nuclei and can therefore be used to non-invasively approximate fluctuations in arousal state (1116).

Although changes in arousal state affect decision-making abilities in both humans and mice (1722), it remains unclear which level of arousal supports optimal performance. In mice, decision-making seems optimal at intermediate levels of arousal (7, 8), reflecting an inverted U-shaped relationship, in line with the well-known Yerkes–Dodson law (23). In humans, however, evidence is mixed. While some studies indeed show U-shaped arousal–performance associations in humans (1820, 22, 24, 25), others report linear associations (17, 18, 26).

Here, we addressed these mixed results in the (human) literature and aimed to improve our understanding of the neural mechanisms underlying the arousal–performance association, which is not well understood in any species. To do so, we collected a large dataset containing six behavioral tasks that differed in terms of the decision type (detection versus discrimination response) and the sensory modality being recruited (audition versus vision) using a within-subject design (Fig. 1A). This dataset allowed us to explore the shape of the relationship between naturally occurring spontaneous fluctuations in pre-stimulus pupil-linked arousal and perceptual decision-making in humans. Aside from different sensory modalities and decision types, all tasks had comparable trial structures and behavioral accuracy was matched across all tasks (see Methods for details). To anticipate our findings, we observed that the arousal–performance relationship was clearly inverted U-shaped, regardless of the sensory modality or decision type at hand.

Fig. 1.

Fig. 1.

Behavioral tasks and binning analysis. (A) General task overview. (B) Example of the continuous pupil signal of three trials with baseline windows [−0.5 to 0 s] divided over five bins. Shaded areas indicate the baseline interval that was used to calculate baseline pupil size. (C) Example data of one session of the auditory detection task of a sample participant that show that bins were determined for each task run (block) separately. Black vertical lines indicate the mean baseline pupil size of each depicted run. Each dot represents the baseline pupil size of a trial, and the color of the dots indicates bin membership (five bins).

After having established a robust inverted U-shaped relationship between spontaneous pupil-linked arousal fluctuations and task performance, and given that inverted U-shaped relationships have been observed in both mice and humans (7, 8, 1820, 22, 24, 25), we aimed to increase our understanding of the arousal–performance relationship by developing a minimal network model reproducing the observed results. We reasoned that explanations that rely on biological constraints common to both species are most prominent candidates to explain its underlying mechanisms. Optimal signal detection properties have been linked with maintaining a certain excitation-inhibition balance in neural circuits (2730) and in mice, a balanced competition between inhibition and excitation (or rather, disinhibition) has been found in circuits with pyramidal neurons and inhibitory cells expressing parvalbumin (PV), somatostatin (SST), and vasoactive intestinal peptide (VIP) (3135). As such circuitry is also present in human and non-human primates (36, 37), mechanisms based on this inhibition/disinhibition balance could provide an explanation for the observed phenomenon across species.

To provide a mechanism that might explain the inverted U-shaped arousal–performance relationship in our detection and discrimination tasks, we built a minimal computational model, adapted from well-known population-based models of neural dynamics used to describe performance on decision-making tasks (38). Our model describes in detail the temporal evolution of the global synaptic conductances corresponding to N-methyl-d-aspartate (NMDA) and gamma-aminobutyric-acid (GABA) receptors of two competing excitatory populations that are indirectly modulated by an arousal signal. Based on our computational model, we conclude that the inverted U-shaped arousal–performance relationship might result from the influence of arousal on two types of interneurons that together act as a disinhibitory pathway for the neural populations that encode sensory evidence.

Results

Participants (N = 28) took part in three all-day (9am to 4pm) experimental sessions in which six different tasks were performed (see Methods for details). In brief, in the auditory domain, participants either discriminated between high and low pitch tones in one task or they detected a low pitch tone versus noise in another (present/absent judgment). In the visual domain, two versions of a visual Gabor detection task and two versions of a visual Gabor orientation discrimination task were performed. Trials of all tasks followed a similar structure with a baseline period, stimulus interval, response interval, and a variable inter-trial interval (ITI) (Fig. 1A), although visual tasks were generally faster paced than auditory tasks (see Methods for details). All tasks were titrated to reach 75% accuracy (the mean accuracy across tasks was 76.3%). First, we tested whether the overall relationship between pupil-indexed arousal and perceptual decision-making was linearly or U-shaped, irrespective of specific task parameters. To this end, we combined the data of all six behavioral tasks, harnessing the large statistical power of our complete dataset (3,570 trials per participant on average). This analysis across tasks gives us a general indication of the arousal–performance relationship, irrespective of sensory modality and decision type. For each run separately, the trials were distributed across twenty equally populated bins based on the average pupil size in the pre-stimulus baseline period (500-ms window preceding the target, Fig. 1 B and C). Note that all bins contained equal numbers of trials, and that therefore the bins did not have equal widths (Fig. 1 B and C). Next, we calculated Signal Detection Theoretic sensitivity [SDTs d′ (39)] and average reaction time (RT) for each pupil bin across all tasks. We used linear mixed models and formal model comparison to assess whether the arousal–performance relationship was linear or quadratic. Using both AIC [Akaike information criterion (40)] and BIC [Bayesian information criterion (41)], we directly compared evidence for linear versus quadratic models. A difference in AIC or BIC values of more than 10 is considered evidence for the winning model to capture the data significantly better (42). In our case, both ΔAIC and ΔBIC were strongly in favor of a quadratic relationship between pre-stimulus pupil size and sensitivity (ΔAIC = 31.0, ΔBIC = 26.7) and RT (ΔAIC = 20.3, ΔBIC = 16.0). In other words, the overall arousal–performance relationship was inverted U-shaped, with the highest sensitivity and shortest RTs at intermediate levels of pupil-linked arousal (Fig. 2, “All”). The association between pupil size and sensitivity was much more pronounced than the association between pupil size and RT, which becomes especially clear when both performance measures are expressed in percent signal change (SI Appendix, Fig. S1).

Fig. 2.

Fig. 2.

Perceptual decision-making is optimal at moderate levels of pupil-linked arousal across sensory modalities and decision types. Perceptual sensitivity (d′, Top row) and RT (Bottom row) for small to large baseline pupils for all tasks combined, auditory tasks, visual tasks, detection tasks, and discrimination tasks, from left to right respectively. We show polynomial regression lines for first- and second-order fits when significantly different from 0 (average regression line of all participants, shading reflects SEM, see Methods). Error bars on the data points represent SEM of the within-subject variation across tasks.

We performed several control analyses to further understand these results. First, it has been argued that the (quadratic) relationship between pupil-linked arousal and performance may be driven by time on task (43), but we found no evidence for this. After regressing out the influence of time on task on pre-stimulus pupil size, the quadratic relationship remained intact (Methods and SI Appendix, Fig. S2). Next, we performed three additional analyses to test whether the quadratic arousal–performance relationship was driven by significant behavioral or pupil events on the previous trial. Specifically, 1) we excluded all trials that followed erroneous responses (23.1% of all trials), possibly upregulating arousal, 2) we performed the analyses after regressing out task-evoked pupil responses on the previous trial (3, 44, 45), and 3) we tested whether the quadratic arousal–performance relationship would also be revealed if the task-evoked pupil response in the previous trial was used as the measure of the pupil-linked arousal state. These analyses further solidified our main conclusions and indicated that the inverted U-shaped relationship between pre-stimulus (baseline) pupil size was not driven by response errors or task-evoked pupil responses on the previous trial (SI Appendix, Fig. S3). Taken together, these control analyses indicate that the inverted U-shaped arousal–performance relationship described here (Fig. 2) is likely driven by spontaneous fluctuations in pupil-indexed arousal.

Next, we addressed whether the arousal–performance relationship depends on the decision type (detection versus discrimination) or the sensory modality (visual versus auditory). To test this, we performed polynomial regression for each sensory modality and decision type separately, averaging across tasks that fell into each category (see Methods for details). This allowed us to step aside from specific experimental manipulations of each of the six tasks individually, which will be reported elsewhere (in the future) (see e.g., ref. 46, and focus on generalities across these tasks, namely which sensory modality was recruited (visual or auditory) and what was the task to be performed (discrimination or detection). Additionally, this ensured enough trials per analysis to reliably compute regressions across pupil bins. To further warrant statistical power, these analyses of subsets of the data were performed with five pupil size bins to account for lower trial numbers after splitting the data (Fig. 1 B and C).

Similar to the first analysis, perceptual sensitivity (d′) was highest and RT lowest at intermediate levels of pupil-linked arousal for both auditory and visual tasks (auditory d′ ß2: t(26) = −2.79, P = 0.005, d = −0.55, visual d′ ß2: t(27) = −3.88, P < 0.001, d = −0.75; auditory RT ß2: t(26) = 2.41, P = 0.01, d = 0.47; visual RT ß2: t(27) = 3.26, P = 0.002, d = 0.63), and for both detection and discrimination tasks (detection d′ ß2: t(27) = −3.96, P < 0.001, d = −0.76, discrimination d′ ß2: t(27) = −2.39, P = 0.01, d = −0.46; detection RT ß2: t(27) = 2.65, P = 0.006, d = 0.51, discrimination RT ß2: t(27) = 2.05, P = 0.03, d = 0.39; Fig. 2). Again, the association between pupil size and sensitivity was much more pronounced than the association with RTs (SI Appendix, Fig. S1). Note that two out of four RT analyses additionally showed positively linear relationships (auditory RT ß1: t(26) = 3.29, P = 0.003, d = 0.70; discrimination RT ß1: t(27) = 2.30, P = 0.03, d = 0.43), further elaborated on in the Discussion. Interestingly, the pupil-sensitivity relationship appeared to be mainly driven by an effect of pre-stimulus pupil size on hit rate, rather than false alarm rate (SI Appendix, Fig. S4), in line with observations in mice (7, 8). In sum, the inverted U-shaped arousal–performance relationship can be observed for both decision types and for both sensory modalities (most prominently for d′), despite general differences in the overall timing of these tasks (visual tasks were faster paced). Therefore, optimal performance at moderate levels of spontaneous pupil-linked arousal appears to be a general and robust characteristic of perceptual decision-making.

Even though the inverted U-shaped arousal–performance relationship has been reported before, it remains unclear how it can emerge from neural interactions. In principle, inverted U-shapes would require a change in the dynamical regime of the underlying cortical circuits, and these are not well known. One plausible candidate mechanism is the interplay between different interneuron types expressing PV, SST, and VIP, since it seems to play a role in disinhibitory and paradoxical dynamics (defined as a counterintuitive response of a circuit to a given external input) in cortical circuits (35, 4750). We therefore explore whether the interplay between PV, SST, and VIP can explain the nonlinear responses to arousal input observed.

We built a firing rate-based computational model of a cortical circuit performing a detection and/or discrimination task (38) under the influence of a hypothetical arousal signal that is highly correlated with pupil size. The circuit describes the continuous temporal evolution of biophysical variables—such as firing rates and synaptic conductances of neural populations (see Methods for more details) and describes the interactions between two excitatory neural populations (EA and EB; encoding the two behavioral choices A and B) mediated by synapses with NMDA receptors (Fig. 3A). The model incorporates a non-selective population of PV interneurons, whose role is to provide a baseline level of inhibition and to mediate the competition between both excitatory populations. In the absence of any modulatory signal or further components, this constitutes a well-studied model (38) which reflects competitive dynamics between choices A and B. More precisely, both excitatory populations receive sensory input, which propels them to ramp up their own activity and suppress the activity of the other excitatory population—a process mediated by the inhibitory population PV. This gives rise to a winner-take-all decision process (38), which corresponds in our case to a detection (A: present, B: absent) or a discrimination task (choice A, choice B). A decision is made when the firing rate of one of the excitatory populations reaches a certain predefined threshold (Fig. 3B).

Fig. 3.

Fig. 3.

Computational model explaining decision-making behavior. (A) Schematic description of the model, with excitatory populations EA and EB encoding two possible choices in a decision-making task (choice A and B), which is also modulated by a common arousal signal to VIP and SST cells, which are also specific to choice A or B. Connectivity follows electrophysiological evidence and previous modeling approaches. Arrows indicate excitatory connections, dots indicate inhibitory connections. (B) Example of the excitatory activity of a decision, in which population EA (dark blue) wins over population EB (light blue) when its firing rate reaches the decision threshold. (C) Performance of the model, defined as the sensitivity (d′) for 300 trials, as a function of the normalized input, signal strength. (D) Mean RT of the model as a function of input, signal strength. (E) Performance of the model as a function of the strength of the arousal signal. (F) Mean RT versus arousal strength. Results in (E and F) averaged over 3,000 trials. (G) Simulation of a ~400-s session, corresponding to 266 consecutive trials, in which the level of arousal (blue line) slowly fluctuates over time and modulates the probability of making a correct choice (green dots). The optimal arousal level is indicated with a solid horizontal line. Dashed lines indicate SD of the arousal signal and shaded areas indicate that arousal is within one SD of the optimal arousal level. (H) Schematic description of how the interaction between arousal, VIP and SST cells brings about the inverted U-shaped relationship between arousal and perceptual decision-making. For panel (BD), the arousal level is 0.4. For panels (B) and (EG), the signal strength is 0.1.

To incorporate the effects of arousal modulation, which were not considered in previous work (38), our model here also incorporates populations of SST and VIP cells, one of each per behavioral choice, as these cells have been found to be choice specific (51). Following experimental evidence, VIP, and SST cells form a connectivity motif which may disinhibit the activity of the corresponding excitatory populations (31) and receive top-down input from other brain areas. We assumed that these neural populations act as a disinhibitory pathway to excitatory cells (31), although we also considered that the dynamics of the VIP-SST-excitatory cell circuit can switch from disinhibition to inhibition depending on the input level and other conditions (35). Finally, we considered that sensory input was delivered to the excitatory populations and that a hypothetical arousal signal reached both VIP and SST populations (similar to a top-down modulatory signal).

An example trial of the model in the absence of any modulation by the arousal signal, and therefore equivalent to the situation modeled in previous work (38), is shown in Fig. 3B, where the competition between both excitatory populations is won by choice A as the corresponding firing rate EA reaches the threshold of 15 spikes/s. To verify that the model showed basic detection behavior, we first varied the signal strength of the sensory input. Sensitivity and chronometric curves of the model output choices (Fig. 3 C and D), reflected that input detections became easier and were quicker for trials with higher signal strength, similar to the case for decision-making tasks (52) and in agreement with previous models (38).

When we simulated the model for different strengths of the arousal signal arriving at SST and VIP populations, while keeping signal strength constant as in the human experiments, we observed that the sensitivity of the model followed an inverted U-shape with respect to the arousal level (Fig. 3E). In the model, this behavior emerges due to the existence of two dynamical regimes, one for low arousal levels (i.e., arousal strength smaller than 0.4), and another for high arousal levels (larger than 0.4) (Fig. 3H; also see SI Appendix, Fig. S6). For low arousal levels, the system is typically in a suboptimal state—in the sense that the firing rates evoked by the sensory input are rarely elevated enough to trigger a detection decision. As the arousal level increases, the VIP activity inhibits the firing of SST cells, which in turn disinhibits the excitatory populations. This provides an additional input, which makes it easier for the circuit to correctly detect the sensory signal, and the sensitivity overall increases. This trend is maintained until arousal levels of ~0.4 are reached in the model, corresponding to optimal detection, when VIP cells are strongly driven by the arousal signal and high or even saturated firing rate levels of VIP firing are reached. For higher arousal levels, VIP inhibition to SST is not able to compensate for the excitatory effects of the arousal signal to SST. Therefore, the net effect of the arousal signal in SST cells now becomes excitatory, leading to the inhibition of the excitatory populations. This leads to a decrease in performance with further increasing arousal levels as the circuit returns to a suboptimal state, thus completing the inverted U-shape. See SI Appendix, Fig. S6 for a graphical depiction of the inhibitory strengths provided by SST and VIP at each of these conditions.

As easier trials (i.e., with stronger sensory input) lead to quicker RTs in the model, we also obtained the corresponding U-shaped relationship between RTs and arousal levels (Fig. 3F) that we also found experimentally. In a realistic setting in which the arousal level would slowly fluctuate across trials within a given session, this mechanism would lead to changes in the detection probability across the session (Fig. 3G). Due to the symmetry that the model assumes between excitatory populations A and B (Fig. 3A), the results of the model are the same for the case of a discrimination task, in which the model must react to input for either excitatory population A or B.

Discussion

We showed that perceptual variability of repeated identical near-threshold stimuli was predicted by ongoing fluctuations in cortical arousal, as indexed by pupillometry. More specifically, we demonstrated that the arousal–performance relationship is inverted U-shaped, with optimal performance at moderate levels of pupil-indexed arousal. This nonlinear relationship does not appear to depend on the sensory modality (vision/audition) or decision type (discrimination/detection) at hand and was also robust to potential differences in task-related factors, such as time on task (i.e., fatigue), potential upregulation of arousal by decision errors, task-evoked pupil responses on the previous trial, and general differences in the overall timing of these tasks. Therefore, the inverted U-shaped arousal–performance relationship appears to reflect a general, flexible, and robust property of sensory processing in humans.

To provide a potential neural mechanism explaining these observations, we built a simple computational model that incorporated different types of interneurons, VIP and SST, which are both modulated by an arousal signal. In the case of humans and other primates, similar action may be expected from interneurons expressing calbindin and calretinin, which are thought to be analogous to SST and VIP in rodents (53) and similarities between both circuits have been well documented (36). When we simulated different arousal strengths, this architecture produced two dynamical regimes that depend strongly on the firing rate of the interneurons. In the first regime, VIP firing increases under the influence of increasing arousal, thereby inhibiting SST firing and disinhibiting the firing of the excitatory populations, leading to increasing performance with increasing arousal. Conversely, in the second regime, performance decreases with increasing arousal as a result of saturated VIP firing, disinhibition of SST firing, and subsequent inhibition of the excitatory populations (Fig. 3H). Together, these regimes form the inverted U-shaped relationship between arousal and perceptual decision-making that we observed in our empirical data.

The computational model proposed to explain the observed experimental findings constitutes only a first step toward a complete understanding of the mechanisms underlying the Yerkes–Dodson law. The model also relies on several hypotheses, which will need to be tested in future experimental studies, although some hypotheses are already partially supported by existing data. For example, our model requires the presence of a disinhibitory mechanism involving SST and VIP cells. It is unclear which brain region is responsible for the inverted U-shape relationship, but SST and VIP cells are found throughout the neocortex (54) and such disinhibitory mechanisms have been found across multiple areas, including visual and somatosensory cortex (31, 47, 49). We also assumed that arousal signals should be able to modulate the firing of SST and VIP cells, which is plausible given that these cell types are known to be modulated by top-down signals (31). Finally, the saturation of VIP activity is crucial for our mechanism to replicate the inverted U-shape. Electrophysiological and optogenetics studies have shown that VIP cells indeed display spike frequency adaptation (55, 56), which should be able to limit the response of VIP cells receiving large arousal signals. In addition, strong short-term synaptic depression effects have been found in GABAergic cells in layer 2/3 (57), where VIP cells are the most common interneuron type (32, 58). Previous computational work has shown that the combination of spike frequency adaptation (or other types of neuron-level adaptation) and short-term synaptic depression is highly effective to drastically reduce signal transmission and generate saturation of activity (5961), suggesting that VIP saturation is a biologically plausible scenario in this context.

More precise descriptions of the influence of arousal signals on task performance will be necessary to extend and improve our current model in the future. For example, it is well known that acetylcholine and catecholamines have a wide range of modulatory effects on task performance, which are not explicitly included in our present model. Likewise, we have focused here on the effects of arousal signals on SST and VIP cells, which constitutes a simplification. In this approach, we do not consider how arousal-related signals, such as cholinergic modulation for example, affect the physiological properties of pyramidal cells, such as excitability (62, 63), spike frequency adaptation (6466) and modulation of both GABAergic and glutamatergic transmission (67). Future modeling work should aim to incorporate some of these properties and investigate their potential effects on the Yerkes–Dodson law.

Besides providing a mechanistic explanation for our experimental results, our computational model also makes some testable predictions. As these predictions involve the activity of VIP and SST cells, we propose that optogenetic techniques used in mice performing tasks during different arousal conditions (68) would be a potential way to validate these predictions experimentally. First, we predict that inhibitory and disinhibitory dynamics mediated by SST and VIP cells play a fundamental role in the emergence of optimal levels of arousal for decision-making tasks, and interfering with these mechanisms should abolish (or modify) the inverted U-shaped relationship. Second, we anticipate that VIP firing will be moderate for low arousal levels and, due to VIP saturation, will be elevated and similar for both intermediate and high arousal levels. Likewise, SST will display moderate firing rates for low and high arousal levels, and low firing rates for intermediate arousal levels. We expect that further development of this model will provide additional insight into the role of arousal in detection and discrimination tasks, for example by extending this model to include more cortical areas—in line with recent advances (68, 69).

Our results bear a strong resemblance to various findings of an inverted U-shaped relationship between pre-stimulus oscillatory power dynamics in the 8 to 12 Hz (alpha) band and perceptual decision-making (70, 71). Indeed, pupil size and pre-stimulus alpha power are believed to be two sides of the same coin: High arousal states go hand in hand with alpha power suppression and pupil dilation (6, 11), and alpha power and pupil diameter appear to be coupled during quiet wakefulness (72). Yet, there also appear to be dissociations between the effects of pre-stimulus alpha power and pupil-linked arousal on behavior (16), and there is evidence that suggests that the relationship between alpha power and pupil size is nonlinear (18, 73). Moreover, a set of studies indicates that pre-stimulus alpha oscillations drive decision bias by modulating neuronal excitability (7477), rather than perceptual sensitivity as we report here. Future work should aim to bridge the gap between these parallel fields that investigate arousal through pre-stimulus pupil size and pre-stimulus oscillatory dynamics to further our understanding of the influence of the cortical state on perceptual decision-making.

The relationship between arousal and sensitivity in our study was clearly inverted U-shaped across both sensory modalities and decision types. For RTs, however, we observed additional linear relationships for two out of four task variations (note that arousal-RT associations were less strong than for d′ however). We cannot completely rule out the possibility that the arousal-RT relationship depends on sensory modality or decision type, and future work is needed to address the issue. In fact, although the observed quadratic arousal-sensitivity relationship was clearly robust to task parameters, some previous studies in humans have also reported linear effects. In mice, primarily U-shaped functions are observed (7, 8, 45): Mice perform best at an auditory detection task during quiet wakefulness, and worse during states of low and high arousal (including locomotion) (8). In humans, however, the shape of the arousal–performance relationship appears to be less clear. Some studies have suggested that increased arousal linearly improves perceptual decision-making (17, 18, 26), while others report optimal performance at intermediate levels of arousal (1820, 22, 24, 25).

Although we have not been able to find clear consistencies in the available literature causing this dichotomy, we identify (at least) four possible explanations. First, it is possible that the studies that report linear effects employed easier tasks than we did here [or less perceptually complex tasks (78)]. The seminal work by Yerkes and Dodson (23) predicts a quadratic effect of arousal on performance for difficult tasks, but a (positive) linear effect for easy tasks. There is hardly any behavioral evidence for the difference in the shape of arousal–performance relationships between easy and difficult tasks in the domain of perception, but this issue deserves further investigation (78). Second, studies reporting linear effects might investigate different parts or only a limited range of the arousal spectrum. To illustrate, it may be that if one only samples the low end of the arousal spectrum (low to medium arousal), one may end up with a positive linear relationship between arousal and performance (hence missing the downward going part for high arousal states). Third, it is possible that the quadratic relationship that we report only holds for spontaneous fluctuations in arousal during task performance while sitting still—and not for other arousal levels, such as pharmacologically manipulated states (26) or arousal levels ranging from quiet wakefulness to locomotion (24). Fourth, quadratic relationships are not always considered, for instance when only two arousal states are compared (21). This may relate to a potential lack of statistical power (e.g., number of trials) that is necessary to dissect the data in (enough) bins to properly uncover the quadratic relationship between arousal and decision behavior.

In conclusion, our findings attribute a general role to spontaneous arousal fluctuations in the modulation of perceptual decision-making. Independent of task parameters, behavioral performance appears to be optimal at intermediate levels of pupil-linked arousal. Based on a computational model, we propose that this inverted U-shaped arousal–performance relationship might result from the influence of arousal on VIP and SST interneurons that together act as a disinhibitory pathway for the neural populations that encode the available sensory evidence.

Methods

Subjects.

For this study, 30 right-handed Dutch-speaking male participants (aged between 18 and 30 y) were recruited from the University of Amsterdam. Because this study involved a pharmacological manipulation, all participants underwent extensive medical and psychological screening to rule out any medical or mental illnesses. All participants gave written consent for participation and received monetary compensation for participation. This study was approved both by the local ethics committee of the University of Amsterdam and the Medical Ethical Committee of the Amsterdam Medical Centre. All participants provided written informed consent after explanation of the experimental protocol. Two participants decided to withdraw from the experiment after the first experimental session. The data from these participants have been excluded from further analyses, resulting in N = 28. Due to technical malfunction, one participant did not perform the auditory tasks, resulting in N = 27 for the auditory data.

Procedure.

Participants took part in three experimental sessions that were separated by at least 1 wk each. On each session, participants received one of three drugs in a random and counterbalanced order: placebo, atomoxetine, or donepezil, but here we only analyze the placebo data. In the first 2 h of each session, participants performed auditory discrimination (sessions 1 and 2) and detection tasks (session 3). In the second half of all sessions, participants performed four versions of visual detection and discrimination tasks (all sessions; the order of the tasks was randomized between participants).

After arriving at the testing site, participants received either placebo (2 sessions) or donepezil (1 session). Right after oral ingestion of the pharmaceutical, participants would perform an auditory staircase task. The staircase procedure lasted approximately 30 min and was immediately followed by the main auditory task, which lasted approximately 1.5 h. After the auditory task, participants then received placebo (2 sessions) or atomoxetine (1 session) to ensure that both donepezil and atomoxetine plasma concentrations would peak at the start of the visual tasks (4 h after the start of the session). The pharmacological intervention is not of interest to this work; therefore, we only analyze the placebo data of the visual tasks. As donepezil plasma concentrations peak after approximately 4 h, we assume that donepezil blood concentration levels were negligible during the auditory tasks (i.e., the first 2 h after donepezil ingestion). For the auditory tasks, we thus analyze the data from all sessions. Below, we provide a brief description of the behavioral tasks, for more details on the pharmacological manipulation, visual tasks, and additional results, please see ref. 46. During all tasks, participants were seated 80 cm from a computer monitor (69 × 39 cm, 60 Hz, 1,920 × 1,080 pixels) in a darkened, sound-isolated room. To minimize head movements, participants rested their heads on a head-mount with a chinrest. All tasks were programmed in Python 2.7 using PsychoPy (79) and in-house scripts.

Behavioral Tasks

Auditory detection and discrimination tasks.

Participants performed two different auditory tasks on separate sessions. The first task was an auditory discrimination task (sessions 1 and 2), in which participants had to discriminate between a high pitch tone (50% of trials) and a low pitch tone (50% of trials) against a background of auditory noise (Fig. 1A). The other task was an auditory detection task (session 3) in which participants had to indicate whether they deemed a target tone to be present or absent alongside auditory noise that was presented on every trial (50% target present trials; Fig. 1A). Besides performing the discrimination or detection task, participants were asked to report the confidence (low or high) in each perceptual decision that they made (not analyzed here). The perceptual decisions and confidence reports were simultaneously given by pressing one of four buttons (A, S, K, and L) on the keyboard. Because the auditory stimuli in these tasks do not inherently convey any direction (such as the visual stimuli discussed below), the meaning of the buttons with respect to the perceptual decisions was counterbalance across subjects. The buttons’ mapping to confidence was the same for all participants (A and L for confident, S and K for not confident).

The auditory stimulus set consisted of dynamic noise and two pure tones (sine waves). The dynamic noise (so-called temporally orthogonal ripple combinations; TORCs) (8, 80) was present on every trial and was always played at the same volume (61.1 dB). On top of the dynamic noise, there was a low pitch sine wave (300 Hz) and a high pitch sine wave (350 Hz) in the discrimination task and only the low pitch tone was used for the detection task. For the discrimination task, the mean tone volume was 3.22% of the TORC volume. For the detection task, the mean tone volume was 3.75% of the TORC volume. Auditory stimuli were binaurally presented through over-ear headphones. The volume of the tones was determined per subject by means of a staircase procedure aimed at 75% correct (Staircasing procedure). To ensure that performance remained around 75%, the volume of the target tones was automatically adjusted between runs. If performance in a run deviated between 5 and 10% of target performance (75%), the volume of the tones was adjusted with 5% of the current tone volume on the following run. If performance deviated more than 10% from target performance, target volume was adjusted with 10%.

Each auditory trial started with a baseline interval (600 ms), followed by the stimulus interval (500 ms) in which the auditory noise was always presented (Fig. 1A). Depending on the task at hand, a target tone was or was not presented for the entire duration of the stimulus interval. After the offset of the sounds, participants had 2.5 s to respond (response interval). The response interval was aborted at response or after 2.5 s if no response was given, followed by a variable ITI (randomly drawn from a uniform distribution between 3 and 4 s).

Participants performed 560 trials of the auditory tasks per session, distributed over four runs of 140 trials that lasted approximately 12 min each. As participants performed the auditory discrimination task on sessions 1 and 2, and the auditory detection task only on session 3, we have collected double the number of trials for the auditory task (1,120 trials).

During the auditory tasks, participants viewed a computer screen with a gray background and a black fixation shape in the center (81).

Visual tasks.

Participants performed four visual tasks. Two of these were variations of a discrimination task, in which participants were asked to discriminate the orientation of a Gabor patch hidden in dynamic visual noise as being rotated clockwise (CW: 45°, 50% of trials) or counter-clockwise (CCW: −45°, 50% of trials; Fig. 1A). The other two tasks were detection tasks in which participants had to indicate whether they believed a Gabor patch (CCW or CW) to be present or absent (50% present trials). These two detection tasks differed in terms of response bias manipulation (conservative versus liberal, see below). Besides these main task characteristics, there were some additional instructions and manipulations that are discussed below. During all visual tasks, gaze position was measured online to ensure that participants were fixating. Trials during which gaze position diverted >1.5° from fixation on the horizontal axis were excluded from further analyses.

Cued visual discrimination task.

The first visual discrimination task was an adaptation of the Posner cueing task (82). Target stimuli were presented unilaterally for 200 ms, on either the left or the right side of the monitor (50% left, 50% right). Prior to target presentation (−1,300 ms), a spatial cue that was predictive of the target stimulus location was presented for 300 ms (80% cue validity). Participants were instructed to use this spatial cue to covertly shift their attention toward the cued location. Participants could respond until 1,400 ms after the onset of stimulus presentation by pressing one of two buttons on the keyboard (S for CCW tilted Gabor patches, K for CW). A variable ITI randomly drawn from a uniform distribution between 250 and 350 ms started directly after a response or the end of the response window if no response was given. Participants performed 560 trials per session of this task, distributed over two runs of 280 trials.

Uncued visual discrimination task.

The other discrimination task was a classical visual discrimination task. Target stimuli were presented centrally for 200 ms and there were no further manipulations. Besides indicating the orientation of the target stimulus, participants were also asked to report the confidence in their decision. Participants were instructed beforehand to distribute their confidence reports evenly, to prevent participants from only reporting low confidence answers under this challenging task. Orientation and confidence reports were simultaneously given by pressing one of four buttons on the keyboard (A for high-confident CCW-tilted Gabor patches, S for low-confident CCW, K for low-confident CW, and L for high-confident CW). Again, participants had a 1,400-ms time window to respond (response window aborted at response), followed by a 250 to 350 ms ITI. In total, participants performed 600 trials of the uncued visual discrimination task per session, distributed over two runs of 300 trials.

Liberal and conservative visual detection task.

During the liberal visual detection task, target stimuli were presented centrally for 200 ms as well. The noise stimulus (a circle containing dynamic noise) was presented on every trial, but target stimuli were only shown on 50% of trials. Participants were instructed to report whether they saw a target stimulus or not, by pressing S (for target absent) or K (for target present). Target stimuli were orientated CCW and CW as in the discrimination tasks, but this orientation was not task-relevant and could thus be ignored. The response window was again 1,400 ms from stimulus onset, followed by a 250 to 350 ms ITI. Response bias was manipulated toward more liberal answers by means of negative auditory feedback in the form of a buzzer sound after missed target stimuli (i.e., misses in signal detection theory), presented immediately after the response. Participants performed 480 trials (two runs of 240 trials) of the liberal visual detection task on each session.

The conservative detection task was like the liberal detection task, with the only difference being that response bias was manipulated toward more conservative responses. In this case, negative feedback was provided after falsely reporting the presence of a target stimulus (i.e., false alarms in signal detection theory). Again, participants performed 480 trials (two runs of 240 trials) per session of this task.

Because the buzzer sounds might have increased participants’ arousal and pupil size on the following trial, we also performed the analyses described below after excluding trials that followed a buzzer sound. With this, we removed 12.4% of trials of the liberal and conservative detection tasks (or 3.3% of all trials). Removing these trials from further analyses did not change the results (SI Appendix, Fig. S5).

Staircasing procedure.

We titrated performance on all tasks to 75% correct. To this end, the volume of the target tones or the opacity of the Gabor patches (i.e., the signal strength) was varied according to the weighted transformed up/down method proposed by Kaernbach (83), while the (auditory or visual) noise was kept constant. In short, the signal strength was increased with one step after erroneous responses, and decreased with three steps after a correct response (i.e., “1-up-3-down staircase”).

Each experimental session started with a staircase procedure to find a good estimate of each subject’s auditory discrimination (sessions 1 and 2) or detection (session 3) threshold. Participants performed three runs of 60 trials each. We interlaced two Kaernbach staircases (30 trials each) to improve the threshold estimate. The first run started with relatively high signal strength so that participants could get acquainted with the stimuli. The second and third runs started with the mean signal strength of the reversals (excluding the first reversal) of the first and second runs, respectively. After finishing all staircase runs, the signal strength for the main auditory tasks that would supposedly lead to 75% performance was calculated by averaging all reversals (excluding the first reversals) across the two interlaced staircases.

The auditory staircase task was similar to the main auditory tasks, but with a few differences. The staircase task had shorter ITIs (1,000 ms) and the task required a response on each trial. Participants did not have to indicate the confidence in their decisions during the staircase task, and they were stimulated to withdraw from responding until the offset of the target sounds by flashing the fixation dot in red if they had answered too quickly.

The visual tasks were titrated to 75% performance in a similar fashion in a separate intake session (46).

Eye-Tracking Acquisition and Preprocessing.

Gaze position and pupil size were measured with an EyeLink 1000 eye tracker (SR Research, Canada) during the experiment at 500 Hz. Nine-point calibration was performed at the start of each run to ensure high data quality. Furthermore, we used a head-mount with chinrest to minimize participants’ head movements. Participants were instructed to move their heads as little as possible and to try to refrain from blinking during the trials.

The pupillometry data of all tasks were preprocessed in the exact same way. Pupil traces were lowpass filtered at 10 Hz, blinks were linearly interpolated and the effects of blinks and saccades on pupil diameter were removed via finite impulse-response deconvolution (84). We removed all trials for which the eyes were closed during the baseline period from further analyses.

Data Analysis.

Analysis for data collapsed across all tasks.

To assess the overall shape of the relationship between prestimulus pupil-linked arousal and perceptual decision-making, we first combined the data of all tasks (excluding all pharmacologically manipulated data). To quantify prestimulus pupil-linked arousal, we took the minimally preprocessed pupil traces of all tasks, and we calculated the average pupil size in the baseline window (−500 to 0 ms) leading up to each stimulus presentation (Fig. 1B). Throughout this manuscript, we use the term pupil-linked arousal to refer to spontaneous fluctuations in arousal state as measured by pupil size (9, 11, 12). Note that for the cued visual discrimination task, we also took the 500 ms before target presentation (not cue presentation) as the baseline window. We excluded all trials for which the eyes were closed during the baseline window, as well as trials for which pupil size during the baseline window was smaller or larger than three SDs from a subjects’ mean baseline pupil size. Next, for each run of each task separately, we assigned each trial to one of twenty equally sized bins based on the average prestimulus pupil size (note that we use five bins for the analyses performed separately for each sensory modality and decision type). The binning procedure was performed per run because it is not possible to assess whether pupil size differences between runs are the result of arousal fluctuations or differences between the exact head location (even though we used a head-mount). Therefore, we are looking at arousal fluctuations that occur within experimental runs (Fig. 1C, shown for five bins).

After having assigned all trials to twenty bins, we calculated SDTs [d′ (39)] and average RT as our dependent measures of perceptual decision-making for each run. We also calculated the mean pupil size for each bin to replace (equally spaced) bin numbers with (not necessarily equally spaced) mean prestimulus pupil size values, to do justice to the true relationship between pupil size and decision behavior. Next, we subsequently averaged over runs, sessions (in the case of auditory discrimination, for which we have two sessions), and tasks, leaving us with mean d′ and RT for 20 bins for each subject. To assess the shape of the relationship between arousal and perceptual decision-making, we next used linear mixed models (Mixed linear models).

Mixed linear models.

We used a mixed linear modeling approach implemented in the Python-package Statsmodels (85) to quantify the dependence of behavioral sensitivity and RT on pupil size bin (86). Specifically, we fitted two mixed models to test whether pupil response bin predominantly exhibited a monotonic effect (first-order), or a non-monotonic effect (second-order) on the behavioral metric of interest (y). The fixed effects were specified as:

Model 1:yβ01+β1P1, [1]
Model 2:yβ01+β1P1+β2P2, [2]

with β as regression coefficients and P as the average baseline pupil size in each bin. We included the maximal random effects structure justified by the design (87): Intercepts and pupil size coefficients could vary with participant. The mixed models were fitted through restricted maximum likelihood estimation. The two models were then formally compared based on the Akaike information criterion (AIC) (40) and Bayesian Information Criterion (BIC) (41).

Analyses performed separately for each sensory modality and decision type.

After having assessed that the shape of the overall relationship between pupil-linked arousal and perceptual decision-making was quadratically shaped, we investigated whether this relationship also held for the different sensory modalities and decision types in our dataset. We collapsed the data of all tasks over the relevant features [i.e., auditory (2 tasks), visual (4 tasks), detection (3 tasks), discrimination (3 tasks)]. We next treated the data as before (Analysis for data collapsed across all tasks), but this time we assigned the trials to five equally populated bins (instead of 20) to compensate for the lowered power after splitting the data (Fig. 1 B and C). Next, we performed polynomial regression (Polynomial regression) to assess the shape of the relationship between arousal and perceptual decision-making in the different sensory modalities and decision types in our dataset.

Polynomial regression.

To assess the shape of the relationship between arousal and perceptual decision-making (quadratic or linear), we performed second-order polynomial regression. Because we were merely interested in the shape of the relationship, and to minimize the influence of pupil size differences between subjects on our results, we first normalized our data in the pupil dimension (i.e., essentially centering the data around a common mean). Next, we modeled the relationship between our observed behavior (d′ and RT) and prestimulus pupil size as a negative quadratic relationship, and as a (unsigned) linear relationship using ordinary least squares linear regression with freely varying intercepts. We extracted the relevant beta coefficient from each model (ß1 for the linear model and ß2 for the quadratic model) for each subject and tested whether the coefficients were significantly different from 0 using one-sample t tests (α = 0.05). We first tested the significance of the linear model (two-sided), followed by the quadratic model. After having established that the overall relationship between prestimulus pupil and sensitivity/RT was negatively/positively quadratically shaped, respectively, we performed one-sided tests for the quadratic beta coefficients. To further investigate the effect of pupil-linked arousal on sensitivity, we repeated the polynomial regression analyses with hit rate and false alarm rate as the dependent variables (SI Appendix, Fig. S4).

Regressing out time-on-task.

To test whether the (quadratic) relationship between pupil-linked arousal and performance could be (partly) driven by time-on-task (43), we linearly regressed time on task (trial number, per run) out of the pupil size data. After that, we repeated the linear mixed model analysis with 20 bins and with data averaged across all tasks to test the shape of the relationship between prestimulus pupil size and perceptual decision-making after regressing out time-on-task (SI Appendix, Fig. S2).

Regressing out evoked pupil response on the previous trial.

To test whether the (quadratic) relationship between pupil-linked arousal and performance could be (partly) driven by the evoked pupil response on the previous trial, we linearly regressed the preceding evoked pupil response out of the pupil size data. The evoked pupil response was calculated as percent signal change relative to the average pupil size of a given run. The stimulus-evoked pupil response was defined as the maximum pupil size in the 1 s (for the visual tasks) or 2 s (for the auditory tasks) after stimulus presentation as compared to the average baseline pupil size before the target (baseline: −500 to 0 ms before target onset). After that, we repeated the Linear Mixed Model analysis with 20 bins and with data averaged across all tasks to test the shape of the relationship between pre-stimulus pupil size and perceptual decision-making after regressing out the evoked pupil response to the preceding trial (SI Appendix, Fig. S3).

Binning analysis based on the evoked pupil response on the previous trial.

To further test whether the (quadratic) relationship between pupil-linked arousal and performance could be (partly) driven by the evoked pupil response on the previous trial, we repeated the binning procedure described above (20 bins), but this time the bins were made based on the evoked pupil response on the previous trial. The stimulus-evoked pupil response was again defined as described above. After this binning procedure, linear mixed model analysis was performed as before to assess the shape of the relationship between the evoked pupil response on the preceding trial and performance on the current trial (SI Appendix, Fig. S3).

Computational Model.

We considered a continuous-firing-rate-based, population-based model of neural dynamics to describe a generic decision-making task (38), which we adapt here for our detection/discrimination task. The model describes in detail the temporal evolution of the global synaptic conductances corresponding to the NMDA and GABA receptors of two competing excitatory populations and one inhibitory (PV) population (Fig. 3A). The model is described by the following equations:

dSAdt=-SAτN+γ1-SArA, [3]
dSBdt=-SBτN+γ1-SBrB, [4]
dSCdt=-SCτG+γIrC. [5]

Above, SA and SB correspond, respectively, to the NMDA conductances of selective excitatory populations A and B, and SC corresponds to the GABAergic conductance of the inhibitory population. The parameters in these equations take the following values: τN = 60 ms, τG = 5 ms, γ = 1.282, and γI = 2. The variables rA, rB, and rC are the mean firing rates of the two excitatory populations and one inhibitory population, respectively. We obtain their values by solving, at each time step, the transcendental equation ri=ϕi(Ii) , with ϕ being a transfer function of the population (specified below) and Ii being the total input current to population “i”, givfen by

IA=JsSA+JcSB+JEISC+I0A+ISSTA+xAt+IsensoryA(t), [6]
IB=JcSA+JsSB+JEISC+I0B+ISSTB+xBt+IsensoryB(t), [7]
IC=JIESA+JIESB+JIISC+I0C+xC(t). [8]

The parameters Js and Jc are the self- and cross-coupling synaptic terms between excitatory populations, respectively. JEI is the coupling from the inhibitory populations to any of the excitatory ones, JIE is the coupling from any of the excitatory populations to the inhibitory one, and JII is the self-coupling strength of the inhibitory population. The parameters I0i with i = A, B, C are background inputs to each population. Parameters in these equations take the following values: Js = 0.49 nA, Jc = 0.0107 nA, JIE = 0.3597 nA, JEI = −0.31 nA, JII = −0.12 nA, I0A = I0B = 0.3294 nA, and I0C = 0.26 nA. Note that these last two parameters correspond to the background input that pyramidal (A, B) and PV cells (C) received from other circuits in the brain. Parameter values have been taken or slightly adapted from previous work (38, 69). The term IiSST denotes the input to each excitatory population from its corresponding SST population (see details about this term below).

The term xi(t) with i = A, B, C is an Ornstein–Uhlenbeck process, which introduces some level of stochasticity in the system. It is given by

τnoisedxidt=-xi+τnoiseσiξit. [9]

Here, ξi(t) is a Gaussian white noise, the time constant is τnoise = 2 ms, and the noise strength is σA, B = 0.03 nA for excitatory populations and σC = 0 for the inhibitory one.

The last term in Eqs. 6 and 7 represents the external sensory input arriving at both populations. Assuming a detection task in which the subject has to detect the sensory stimulus A, the input is given by

IsensoryAt=μ0, IsensoryBt=0. [10]

The average stimulus strength is μ0 = 0.01326 unless specified otherwise (such as in Fig. 3 C and D, where the performance and RT are studied for different stimulus strength values). The stimulus is present during the whole duration of the trial (for simulations in which multiple trials are simulated one after the other, as in Fig. 3G, the values for these currents can be switched depending on which input, A or B, is present in each trial). The discrimination task can be simply stimulated by alternating the population which receives the input in the equation above, and therefore provides equivalent results.

The transfer function ϕi(t) which transforms the input into firing rates takes the following form for the excitatory populations:

ϕA,BI=12aI-b1-exp[-d(aI-b)]. [11]

The values for the parameters are a = 135 Hz/nA, b = 54 Hz, and d = 0.308 s. For the inhibitory population, a similar function can be used, but for convenience, we choose a threshold-linear function:

ϕCI=1gI c1I-c0+r00,30. [12]

The notation x0,30 denotes a minimum value of 0 (rectification) and a maximum value at 30 spikes/s (saturation). The values for the parameters are gI = 4, c1 = 615 Hz/nA, c0 = 177 Hz, and r0 = 5.5 spikes/s. It is sometimes useful for simulations (although not a requirement) to replace the transcendental equation ri=ϕi(Ii) by its analogous differential equation, of the form

τrdridt=-ri+ϕi(Ii). [13]

The time constant can take a typical value of τr = 2 ms. Note that, even though our model is only able to model continuous firing rates, the used of spike-based units such as Hertz or spikes/s is justified as our model is an extension of previous work (38) which is a mean-field equivalent to a spiking neural network model, and therefore, it provides accurate estimations of activity mapped to spikes/s.

In addition to the two selective excitatory populations and the PV population, our model also includes the effects of top-down input mediated by selective VIP and SST populations. To introduce this, we assume that the firing rate activity of VIP and SST cells is determined, respectively, by the following equations:

rVIPi=αVIP(Ibg+zIarousal)+βVIP0,20, [14]
rSSTi=αSSTgSST(Ibg+z Iarousal)+JVIP rVIPi+βSST0,20. [15]

The subindex “i” indicates the associated selective excitatory populations (Fig. 3A). Parameter values are αVIP = 50, βVIP = 0, αSST = 20, βSST = 32, gSST = 2, JVIP = −0.1, z = 0.1, and Ibg = 0.36. Note that βSST and βVIP correspond to the background currents received by SST and VIP cells from other circuits of the brain and that the background current for SST is particularly high so that the elevated spontaneous activity of SST can be reproduced (31). Iarousal indicates the arousal level of the model (assumed 0.4 in Fig. 3 BD and variable in Fig. 3 EG). Finally, we link the firing rate of the SST populations in Eq. 15 with the input to excitatory populations given in Eqs. 6 and 7 by assuming that:

ISSTi=JSST rSSTi. [16]

The synaptic strength is given by JSST = –0.001.

For each simulated trial (such as the one displayed in Fig. 3B), we consider that the circuit makes a detection when the firing rate of either the excitatory population A (or B, in the case of a discrimination task) reaches a threshold of 15 spikes/s. The duration of the trial is set to Ttrial = 1.5 s.

Supplementary Material

Appendix 01 (PDF)

Acknowledgments

This research was supported by an ERC Starting Grant from the H2020 European Research Council (ERC STG 715605 to S.v.G) and a Research Talent Grant from the Dutch Research Council (NWO; 406.17.531 to L.B. and S.v.G.).

Author contributions

L.B., J.F.M., S.A.N., J.J.F., and S.v.G. designed research; L.B. and S.A.N. performed research; J.W.d.G. contributed new reagents/analytic tools, methodology; L.B. and J.F.M. analyzed data; S.A.N. and J.W.d.G. edited the manuscript; J.J.F. project supervision; edited the manuscript; S.v.G. project supervision; and L.B., J.F.M., and S.v.G. wrote the paper.

Competing interests

The authors declare no competing interest.

Footnotes

This article is a PNAS Direct Submission.

Contributor Information

Lola Beerendonk, Email: l.beerendonk@uva.nl.

Jorge F. Mejías, Email: j.f.mejias@uva.nl.

Simon van Gaal, Email: s.vangaal@uva.nl.

Data, Materials, and Software Availability

The data of all experiments, analysis scripts, as well as the computational modeling scripts can be found at https://osf.io/fcext/ (88).

Supporting Information

References

  • 1.Wyart V., Koechlin E., Choice variability and suboptimality in uncertain environments. Curr. Opin. Behav. Sci. 11, 109–115 (2016). [Google Scholar]
  • 2.Rahnev D., Denison R. N., Suboptimality in perceptual decision making. Behav. Brain Sci. 41, e223 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Waschke L., Kloosterman N. A., Obleser J., Garrett D. D., Behavior needs neural variability. Neuron 109, 751–766 (2021). [DOI] [PubMed] [Google Scholar]
  • 4.Greene A. S., Horien C., Barson D., Scheinost D., Constable R. T., Why is everyone talking about brain state? Trends Neurosci. 46, 508–524 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.McCormick D. A., Nestvogel D. B., He B. J., Neuromodulation of Brain State and Behavior. Annu. Rev. Neurosci. 43, 391–415 (2020). [DOI] [PubMed] [Google Scholar]
  • 6.Aston-Jones G., Cohen J. D., An integrative theory of locus coeruleus-norepinephrine function: Adaptive gain and optimal performance. Annu. Rev. Neurosci. 28, 403–450 (2005). [DOI] [PubMed] [Google Scholar]
  • 7.McGinley M. J., et al. , Waking state: Rapid variations modulate neural and behavioral responses. Neuron 87, 1143–1161 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.McGinley M. J., David S. V., McCormick D. A., Cortical membrane potential signature of optimal states for sensory signal detection. Neuron 87, 179–192 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Reimer J., et al. , Pupil fluctuations track fast switching of cortical states during quiet wakefulness. Neuron 84, 355–362 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Vinck M., Batista-Brito R., Knoblich U., Cardin J. A., Arousal and locomotion make distinct contributions to cortical activity patterns and visual encoding. Neuron 86, 740–754 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Reimer J., et al. , Pupil fluctuations track rapid changes in adrenergic and cholinergic activity in cortex. Nat. Commun. 7, 13289 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Joshi S., Li Y., Kalwani R. M., Gold J. I., Relationships between pupil diameter and neuronal activity in the locus coeruleus, colliculi, and cingulate cortex. Neuron 89, 221–234 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.de Gee J. W., et al. , Dynamic modulation of decision biases by brainstem arousal systems. Elife 6, e23232 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Murphy P. R., O’Connell R. G., O’Sullivan M., Robertson I. H., Balsters J. H., Pupil diameter covaries with BOLD activity in human locus coeruleus. Hum. Brain Mapp. 35, 4140–4154 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Strauch C., Wang C.-A., Einhäuser W., Van der Stigchel S., Naber M., Pupillometry as an integrated readout of distinct attentional networks. Trends Neurosci. 45, 635–647 (2022). [DOI] [PubMed] [Google Scholar]
  • 16.Lloyd B., de Voogd L. D., Mäki-Marttunen V., Nieuwenhuis S., Pupil size reflects activation of subcortical ascending arousal system nuclei during rest. Elife 12, e84822 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Murphy P. R., Vandekerckhove J., Nieuwenhuis S., Pupil-linked arousal determines variability in perceptual decision making. PLoS Comput. Biol. 10, e1003854 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Podvalny E., King L. E., He B. J., Spectral signature and behavioral consequence of spontaneous shifts of pupil-linked arousal in human. Elife 10, e68265 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.van Kempen J., et al. , Behavioural and neural signatures of perceptual decision-making are modulated by pupil-linked arousal. Elife 8, e42541 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Neske G. T., Nestvogel D., Steffan P. J., McCormick D. A., Distinct waking states for strong evoked responses in primary visual cortex and optimal visual detection performance. J. Neurosci. 39, 10044–10059 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Xu Y., et al. , Effects of alertness on perceptual detection and discrimination. bioRxiv [Preprint] (2023). 10.1101/2023.03.21.533623 (Accessed 13 April 2023). [DOI]
  • 22.Waschke L., Tune S., Obleser J., Local cortical desynchronization and pupil-linked arousal differentially shape brain states for optimal sensory performance. Elife 8, e51501 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Yerkes R. M., Dodson J. D., The relation of strength of stimulus to rapidity of habit-formation. J. Comp. Neurol. Psychol. 18, 459–482 (1908). [Google Scholar]
  • 24.Bullock T., Elliott J. C., Serences J. T., Giesbrecht B., Acute exercise modulates feature-selective responses in human cortex. J. Cogn. Neurosci. 29, 605–618 (2017). [DOI] [PubMed] [Google Scholar]
  • 25.Murphy P. R., Robertson I. H., Balsters J. H., O’connell R. G., Pupillometry and P3 index the locus coeruleus-noradrenergic arousal function in humans. Psychophysiology 48, 1532–1543 (2011). [DOI] [PubMed] [Google Scholar]
  • 26.Gelbard-Sagiv H., Magidov E., Sharon H., Hendler T., Nir Y., Noradrenaline modulates visual perception and late visually evoked activity. Curr. Biol. 28, 2239–2249.e6 (2018). [DOI] [PubMed] [Google Scholar]
  • 27.Tian G., Li S., Huang T., Wu S., Excitation-inhibition balanced neural networks for fast signal detection. Front. Comput. Neurosci. 14, 79 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Vogels T. P., Abbott L. F., Gating multiple signals through detailed balance of excitation and inhibition in spiking networks. Nat. Neurosci. 12, 483–491 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Rubin R., Abbott L. F., Sompolinsky H., Balanced excitation and inhibition are required for high-capacity, noise-robust neuronal selectivity. Proc. Natl. Acad. Sci. U.S.A. 114, E9366–E9375 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Joglekar M. R., Mejias J. F., Yang G. R., Wang X.-J., Inter-areal balanced amplification enhances signal propagation in a large-scale circuit model of the primate cortex. Neuron 98, 222–234.e8 (2018). [DOI] [PubMed] [Google Scholar]
  • 31.Pfeffer C. K., Xue M., He M., Huang Z. J., Scanziani M., Inhibition of inhibition in visual cortex: The logic of connections between molecularly distinct interneurons. Nat. Neurosci. 16, 1068–1076 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Rudy B., Fishell G., Lee S., Hjerling-Leffler J., Three groups of interneurons account for nearly 100% of neocortical GABAergic neurons. Dev. Neurobiol. 71, 45–61 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Pakan J. M., et al. , Behavioral-state modulation of inhibition is context-dependent and cell type specific in mouse visual cortex. Elife 5, e14985 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Litwin-Kumar A., Rosenbaum R., Doiron B., Inhibitory stabilization and visual coding in cortical circuits with multiple interneuron subtypes. J. Neurophysiol. 115, 1399–1409 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Garcia Del Molino L. C., Yang G. R., Mejias J. F., Wang X.-J., Paradoxical response reversal of top-down modulation in cortical circuits with three interneuron types. Elife 6, e29742 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Kooijmans R. N., Sierhuis W., Self M. W., Roelfsema P. R., A quantitative comparison of inhibitory interneuron size and distribution between mouse and macaque V1, using calcium-binding proteins. Cereb. Cortex Commun. 1, tgaa068 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Kooijmans R. N., et al. , Full hemisphere human immunohistochemistry for calcium binding proteins Parvalbumin, Calbindin and Calretinin–Color corrected images (2020), 10.25493/C5KW-DMH. [DOI]
  • 38.Wong K.-F., Wang X.-J., A recurrent network mechanism of time integration in perceptual decisions. J. Neurosci. 26, 1314–1328 (2006). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Green D. M., Swets J. A., Signal Detection Theory and Psychophysics (Wiley, New York, 1966). [Google Scholar]
  • 40.Akaike H., A new look at the statistical model identification. IEEE Trans. Automat. Contr. 19, 716–723 (1974). [Google Scholar]
  • 41.Schwarz G., Estimating the dimension of a model. Ann. Stat. 6, 461–464 (1978). [Google Scholar]
  • 42.Burnham K. P., Anderson D. R., Multimodel inference. Sociol. Methods Res. 33, 261–304 (2004). [Google Scholar]
  • 43.van den Brink R. L., Murphy P. R., Nieuwenhuis S., Pupil diameter tracks lapses of attention. PLoS One 11, e0165274 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.de Gee J. W., Knapen T., Donner T. H., Decision-related pupil dilation reflects upcoming choice and individual bias. Proc. Natl. Acad. Sci. U.S.A. 111, E618–E625 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.de Gee J. W., et al. , Mice regulate their attentional intensity and arousal to exploit increases in task utility. bioRxiv [Preprint] (2022). 10.1101/2022.03.04.482962 (Accessed 6 March 2023). [DOI]
  • 46.Nuiten S. A., De Gee J. W., Fahrenfort J. J., van Gaal S., Catecholaminergic neuromodulation and selective attention jointly shape perceptual decision making. Elife 12, RP87022 (2023), 10.7554/elife.87022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Lee S., Kruglikov I., Huang Z. J., Fishell G., Rudy B., A disinhibitory circuit mediates motor integration in the somatosensory cortex. Nat. Neurosci. 16, 1662–1670 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Pakan J. M., et al. , Correction: Behavioral-state modulation of inhibition is context-dependent and cell type specific in mouse visual cortex. Elife 6, e31708 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Dipoppa M., et al. , Vision and locomotion shape the interactions between neuron types in mouse visual cortex. Neuron 98, 602–615.e8 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Tsodyks M. V., Skaggs W. E., Sejnowski T. J., McNaughton B. L., Paradoxical effects of external modulation of inhibitory interneurons. J. Neurosci. 17, 4382–4388 (1997). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Kim D., et al. , Distinct roles of parvalbumin- and somatostatin-expressing interneurons in working memory. Neuron 92, 902–915 (2016). [DOI] [PubMed] [Google Scholar]
  • 52.Roitman J. D., Shadlen M. N., Response of neurons in the lateral intraparietal area during a combined visual discrimination reaction time task. J. Neurosci. 22, 9475–9489 (2002). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Medalla M., et al. , Comparative features of calretinin, calbindin, and parvalbumin expressing interneurons in mouse and monkey primary visual and frontal cortices. J. Comp. Neurol., 10.1002/cne.25514 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Kim Y., et al. , Brain-wide maps reveal stereotyped cell-type-based cortical architecture and subcortical sexual dimorphism. Cell 171, 456–469.e22 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Badrinarayanan S., Manseau F., Williams S., Brandon M. P., A characterization of the electrophysiological and morphological properties of vasoactive intestinal peptide (VIP) interneurons in the medial entorhinal cortex (MEC). Front. Neural. Circ. 15, 653116 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Canto-Bustos M., Friason F. K., Bassi C., Oswald A.-M. M., Disinhibitory circuitry gates associative synaptic plasticity in olfactory cortex. J. Neurosci. 42, 2942–2950 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Lombardi A., et al. , Recovery kinetics of short-term depression of GABAergic and glutamatergic synapses at layer 2/3 pyramidal cells in the mouse barrel cortex. Front. Cell. Neurosci. 17, 1254776 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Billeh Y. N., et al. , Systematic integration of structural and functional data into multi-scale models of mouse primary visual cortex. Neuron 106, 388–403.e18 (2020). [DOI] [PubMed] [Google Scholar]
  • 59.Mejías J. F., Torres J. J., The role of synaptic facilitation in spike coincidence detection. J. Comput. Neurosci. 24, 222–234 (2008). [DOI] [PubMed] [Google Scholar]
  • 60.Mejias J. F., Torres J. J., Emergence of resonances in neural systems: The interplay between adaptive threshold and short-term synaptic plasticity. PLoS One 6, e17255 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Torres J. J., Marro J., Mejias J. F., Can intrinsic noise induce various resonant peaks? New J. Phys. 13, 053014 (2011). [Google Scholar]
  • 62.Cole A. E., Nicoll R. A., Characterization of a slow cholinergic post-synaptic potential recorded in vitro from rat hippocampal pyramidal cells. J. Physiol. 352, 173–188 (1984). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Cole A. E., Nicoll R. A., The pharmacology of cholinergic excitatory responses in hippocampal pyramidal cells. Brain Res. 305, 283–290 (1984). [DOI] [PubMed] [Google Scholar]
  • 64.Madison D. V., Nicoll R. A., Actions of noradrenaline recorded intracellularly in rat hippocampal CA1 pyramidal neurones, in vitro. J. Physiol. 372, 221–244 (1986). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Madison D. V., Nicoll R. A., Cyclic adenosine 3’,5’-monophosphate mediates beta-receptor actions of noradrenaline in rat hippocampal pyramidal cells. J. Physiol. 372, 245–259 (1986). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Barkai E., Hasselmo M. E., Modulation of the input/output function of rat piriform cortex pyramidal cells. J. Neurophysiol. 72, 644–658 (1994). [DOI] [PubMed] [Google Scholar]
  • 67.Hasselmo M. E., The role of acetylcholine in learning and memory. Curr. Opin. Neurobiol. 16, 710–715 (2006). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Klaver L. M. F., et al. , Spontaneous variations in arousal modulate subsequent visual processing and local field potential dynamics in the ferret during quiet wakefulness. Cereb. Cortex 33, 7564–7581 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.Mejías J. F., Wang X.-J., Mechanisms of distributed working memory in a large-scale network of macaque neocortex. Elife 11, e72136 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70.Zhang Y., Ding M., Detection of a weak somatosensory stimulus: Role of the prestimulus mu rhythm and its top-down modulation. J. Cogn. Neurosci. 22, 307–322 (2010). [DOI] [PubMed] [Google Scholar]
  • 71.Linkenkaer-Hansen K., Nikulin V. V., Palva S., Ilmoniemi R. J., Palva J. M., Prestimulus oscillations enhance psychophysical performance in humans. J. Neurosci. 24, 10186–10190 (2004). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72.Montefusco-Siegmund R., et al. , Alpha EEG activity and pupil diameter coupling during inactive wakefulness in humans. eNeuro 9, ENEURO.0060-21.2022 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Pfeffer T., et al. , Circuit mechanisms for the chemical modulation of cortex-wide network interactions and behavioral variability. Sci. Adv. 7, eabf5620 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Iemi L., Chaumon M., Crouzet S. M., Busch N. A., Spontaneous neural oscillations bias perception by modulating baseline excitability. J. Neurosci. 37, 807–819 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Kloosterman N. A., et al. , Humans strategically shift decision bias by flexibly adjusting sensory evidence accumulation. Elife 8, e37321 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Samaha J., Iemi L., Haegens S., Busch N. A., Spontaneous brain oscillations and perceptual decision-making. Trends Cogn. Sci. 24, 639–653 (2020). [DOI] [PubMed] [Google Scholar]
  • 77.Iemi L., et al. , Ongoing neural oscillations influence behavior and sensory representations by suppressing neuronal excitability. Neuroimage 247, 118746 (2022). [DOI] [PubMed] [Google Scholar]
  • 78.Sörensen L. K. A., Bohté S. M., Slagter H. A., Scholte H. S., Arousal state affects perceptual decision-making by modulating hierarchical sensory processing in a large-scale visual system model. PLoS Comput. Biol. 18, e1009976 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Peirce J. W., PsychoPy–Psychophysics software in Python. J. Neurosci. Methods 162, 8–13 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80.Atiani S., Elhilali M., David S. V., Fritz J. B., Shamma S. A., Task difficulty and performance induce diverse adaptive patterns in gain and shape of primary auditory cortical receptive fields. Neuron 61, 467–480 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Thaler L., Schütz A. C., Goodale M. A., Gegenfurtner K. R., What is the best fixation target? The effect of target shape on stability of fixational eye movements Vision Res. 76, 31–42 (2013). [DOI] [PubMed] [Google Scholar]
  • 82.Posner M. I., Orienting of attention. Q. J. Exp. Psychol. 32, 3–25 (1980). [DOI] [PubMed] [Google Scholar]
  • 83.Kaernbach C., Simple adaptive testing with the weighted up-down method. Percept. Psychophys. 49, 227–229 (1991). [DOI] [PubMed] [Google Scholar]
  • 84.Knapen T., et al. , Cognitive and ocular factors jointly determine pupil responses under equiluminance. PLoS One 11, e0155574 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Seabold S., Perktold J., “Statsmodels: Econometric and statistical modeling with python” in Proceedings of the 9th Python in Science Conference (SciPy, 2010), 10.25080/majora-92bf1922-011 (30 June 2023). [DOI] [Google Scholar]
  • 86.de Gee J. W., et al. , Pupil-linked phasic arousal predicts a reduction of choice bias across species and decision domains. Elife 9, e54014 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.Barr D. J., Levy R., Scheepers C., Tily H. J., Random effects structure for confirmatory hypothesis testing: Keep it maximal. J. Mem. Lang. 68, 255–278 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 88.Beerendonk L., et al. , A disinhibitory circuit mechanism explains a general principle of peak performance during mid-level arousal. Open Science Framework. https://osf.io/fcext/. Deposited 6 January 2024. [DOI] [PMC free article] [PubMed]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Appendix 01 (PDF)

Data Availability Statement

The data of all experiments, analysis scripts, as well as the computational modeling scripts can be found at https://osf.io/fcext/ (88).


Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences

RESOURCES