Abstract
It is commonly believed that visual short-term memory (VSTM) consists of a fixed number of “slots” in which items can be stored. An alternative theory in which memory resource is a continuous quantity distributed over all items seems to be refuted by the appearance of guessing in human responses. Here, we introduce a model in which resource is not only continuous but also variable across items and trials, causing random fluctuations in encoding precision. We tested this model against previous models using two VSTM paradigms and two feature dimensions. Our model accurately accounts for all aspects of the data, including apparent guessing, and outperforms slot models in formal model comparison. At the neural level, variability in precision might correspond to variability in neural population gain and doubly stochastic stimulus representation. Our results suggest that VSTM resource is continuous and variable rather than discrete and fixed and might explain why subjective experience of VSTM is not all or none.
Keywords: working memory, Bayesian inference, attention, estimation, change localization
Thomas Chamberlin famously warned scientists against entertaining only a single hypothesis, for such a modus operandi might lead to undue attachment and “a pressing of the facts to make them fit the theory” (ref. 1, p. 840). For half a century, the study of short-term memory limitations has been dominated by a single hypothesis, namely that a fixed number of items can be held in memory and any excess items are discarded (2–5). The alternative notion that short-term memory resource is a continuous quantity distributed over all items, with a lower amount per item translating into lower encoding precision, has enjoyed some success (6–8), but has been unable to account for the finding that humans often seem to make a random guess when asked to report the identity of one of a set of remembered items, especially when many items are present (9). Specifically, if resource were evenly distributed across items (6, 10), observers would never guess. Thus, at present, no viable continuous-resource model exists.
Here, we propose a more sophisticated continuous-resource model, the variable-precision (VP) model, in which the amount of resource an item receives, and thus its encoding precision, varies randomly across items and trials and on average decreases with set size. Resource might correspond to the gain of a neural population pattern of activity encoding a memorized feature. When gain is higher, a stimulus is encoded with higher precision (11, 12). Variability in gain across items and trials is consistent with observations of single-neuron firing rate variability (13–15) and attentional fluctuations (16, 17).
We tested the VP model against three alternative models (Fig. 1). According to the classic item-limit (IL) model (4), a fixed number of items is kept in memory, and memorized items are recalled perfectly. In the equal-precision (EP) model (6, 10), a continuous resource is evenly distributed across all items. The slots-plus-averaging (SA) model (9) acknowledges the presence of noise but combines it with the notion of discrete slots. Resource consists of a few discrete chunks, each of which affords limited precision to the encoding of an item. When there are fewer items than chunks, an item might get encoded using multiple chunks and thus with higher precision. To compare the four models, we used two visual short-term memory (VSTM) paradigms, namely delayed estimation (7) and change localization, each of which we applied to two feature dimensions, color and orientation (Fig. 2). We found that the VP model outperforms the previous models in each of the four experiments and accounts, at each set size, for the frequency with which observers appear to be guessing. Thus, the VP model poses a serious challenge to models in which VSTM resource is assumed to be discrete and fixed.
Fig. 1.
Resource allocation in models of VSTM. Each box represents an item. Set size is 2 (Left) or 5 (Right). In this example, the number of “slots” or “chunks” is 3 in the IL and SA models.
Fig. 2.
Trial procedures. (A) Experiment 1: delayed estimation of color. Subjects scroll through all possible colors to report the remembered color in the marked location. (B) Experiment 2: delayed estimation of orientation. (C) Experiment 3: color change localization. (D) Experiment 4: orientation change localization.
Theory
VSTM Encoding and Variable Precision.
An observer memorizes N simultaneously presented stimuli. The task-relevant feature is orientation or color, both of which are circular variables in our experiments. Each stimulus is encoded with precision J, which is formally defined as Fisher information (18). We assume that the observer’s internal measurement of a stimulus is noisy and follows a Von Mises (circular normal) distribution,
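$$p(x \mid s, J) = \frac{1}{2\pi I_0(\kappa)}\, e^{\kappa \cos(x - s)}, \tag{1}$$
where $I_0$ is the modified Bessel function of the first kind of order 0 and the concentration parameter κ is uniquely determined by J through
$$J = \kappa\, \frac{I_1(\kappa)}{I_0(\kappa)},$$
with $I_1$ the modified Bessel function of the first kind of order 1 (SI Text). For a variable with a Gaussian distribution, J would be equal to inverse variance. A higher J produces a narrower distribution p(x | s, J) (Fig. 3A). In the VP model, J is variable across items and trials and we assume that it is drawn, independently across items and trials, from a gamma distribution with mean $\bar{J}$ and scale parameter τ (Fig. 3A). The measurement is then described by a doubly stochastic process: J is first drawn from the gamma distribution with mean $\bar{J}$ and scale τ, and x is then drawn from $p(x \mid s, J)$. We further assume that $\bar{J}$ depends on set size, N, in power-law fashion, $\bar{J} = \bar{J}_1 N^{-\alpha}$ (Fig. 3B). The free parameters $\bar{J}_1$, α, and τ are fitted to subject data.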
Fig. 3.
Theory. (A) (Upper) In the VP model, precision, J, is variable and assumed to follow a gamma distribution (here with τ = 1). (Lower) Von Mises noise distributions corresponding to three values of precision and s = 0. (B) Example probability distributions over precision at different set sizes in the VP model. Here, mean precision (dashed lines) was taken inversely proportional to set size (α = 1). In the EP model, these distributions would be delta functions. (C) Decision process in the Bayesian model of change localization.
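The encoding stage described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the root-finding approach to inverting the relation between J and κ, and the upper bound `kappa_max` are assumptions made here, and the power law is taken as mean precision equal to `J1_bar * N**(-alpha)`, consistent with α = 1 meaning inverse proportionality to set size.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import i0e, i1e


def kappa_to_J(kappa):
    """Fisher information of a Von Mises distribution: J = kappa * I1(kappa) / I0(kappa)."""
    # Exponentially scaled Bessel functions avoid overflow; the scaling cancels in the ratio.
    return kappa * i1e(kappa) / i0e(kappa)


def J_to_kappa(J, kappa_max=1e4):
    """Invert J(kappa) numerically; J(kappa) increases monotonically from 0."""
    if J <= 0.0:
        return 0.0
    if J >= kappa_to_J(kappa_max):
        return kappa_max  # assumed cap for extremely high precision
    return brentq(lambda k: kappa_to_J(k) - J, 0.0, kappa_max)


def sample_vp_measurements(s, J1_bar, alpha, tau, rng):
    """Draw one trial of noisy measurements x for stimuli s under the VP model."""
    s = np.asarray(s, dtype=float)
    n_items = s.size
    J_bar = J1_bar * n_items ** (-alpha)           # mean precision at this set size
    # Gamma distribution with mean J_bar and scale tau has shape J_bar / tau.
    J = rng.gamma(shape=J_bar / tau, scale=tau, size=n_items)
    kappa = np.array([J_to_kappa(j) for j in J])   # per-item concentration
    x = rng.vonmises(mu=s, kappa=kappa)            # Von Mises noise around each stimulus
    return x, J


rng = np.random.default_rng(0)
stimuli = rng.uniform(-np.pi, np.pi, size=4)
x, J = sample_vp_measurements(stimuli, J1_bar=60.0, alpha=1.0, tau=15.0, rng=rng)
print("precisions:", J)
print("measurement errors:", np.angle(np.exp(1j * (x - stimuli))))
```

The parameter values in the usage lines are arbitrary placeholders, not fitted values from the experiments.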
Models for Delayed Estimation.
In experiments 1 and 2, observers estimated the value of a remembered stimulus (Fig. 2 A and B). The stimulus estimate, denoted $\hat{s}$, is equal to the measurement, x. In the IL model, the measurement of a remembered stimulus is noiseless but only K items (the “capacity”) are remembered (or all N when N ≤ K), producing a guessing rate of 1 − K/N for N > K. In the SA model, K chunks of resource are allocated and the estimate distribution has two components. When the tested item has no chunks, the observer guesses and the estimate distribution is uniform; otherwise, it is a Von Mises distribution with κ determined by the number of chunks. In the EP model, the estimate distribution is Von Mises as in Eq. 1, but with precision J equal across items and across trials with the same N and dependent on N as $J = J_1 N^{-\alpha}$. In the VP model, the estimate distribution is a mixture of many Von Mises distributions, each with a different value of κ:
$$p(\hat{s} \mid s) = \int p(\hat{s} \mid s, J)\, p(J \mid \bar{J}, \tau)\, dJ$$
(Fig. S1A). In all models, we assume that the observer’s response is equal to the estimate $\hat{s}$ plus zero-mean Von Mises response noise with concentration parameter $\kappa_r$. Model details can be found in SI Text.
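To make the contrast between the EP and VP estimate distributions concrete, the following sketch evaluates both densities numerically. It is an illustration under stated assumptions rather than the authors' code: the VP mixture over precision is approximated by an equal-probability discretization of the gamma distribution, the mapping from J to κ is inverted with an interpolation table, response noise is ignored, and all function names and parameter values are hypothetical.

```python
import numpy as np
from scipy.special import i0e, i1e
from scipy.stats import gamma

# Lookup table for inverting J = kappa * I1(kappa) / I0(kappa) by interpolation.
_KAPPA_GRID = np.concatenate([[0.0], np.logspace(-4, 4, 2000)])
_J_GRID = _KAPPA_GRID * i1e(_KAPPA_GRID) / i0e(_KAPPA_GRID)


def J_to_kappa(J):
    """Approximate inverse of J(kappa) via linear interpolation."""
    return np.interp(J, _J_GRID, _KAPPA_GRID)


def von_mises_pdf(err, kappa):
    """Zero-mean Von Mises density, written in an overflow-safe form."""
    return np.exp(kappa * (np.cos(err) - 1.0)) / (2.0 * np.pi * i0e(kappa))


def ep_error_density(err, J):
    """EP model: a single Von Mises with fixed precision J."""
    return von_mises_pdf(err, J_to_kappa(J))


def vp_error_density(err, J_bar, tau, n_bins=500):
    """VP model: average of Von Mises densities over gamma-distributed precision."""
    q = (np.arange(n_bins) + 0.5) / n_bins           # midpoints of equal-probability bins
    J = gamma.ppf(q, a=J_bar / tau, scale=tau)       # gamma with mean J_bar, scale tau
    kappa = J_to_kappa(J)
    return von_mises_pdf(err[:, None], kappa[None, :]).mean(axis=1)


err = np.linspace(-np.pi, np.pi, 2000, endpoint=False)
p_ep = ep_error_density(err, J=10.0)
p_vp = vp_error_density(err, J_bar=10.0, tau=5.0)
print("EP density at +/- pi:", p_ep[0])
print("VP density at +/- pi:", p_vp[0])
```

With the same mean precision, the VP density places far more mass in the tails than the EP density, because some of its mixture components have precision close to zero.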
Models for Change Localization.
In experiments 3 and 4, observers sequentially viewed two displays, which were identical except that one stimulus changed between them. Observers reported where the change occurred (Fig. 2 C and D). The stimuli in the first display and the magnitude of the change were all drawn independently from a uniform distribution. In each model, stimuli are encoded in the same way as in delayed estimation, but the decision-making stage is different (Fig. 3C). We denote the measurements of the stimuli in the first and second displays by vectors x and y, respectively, and the corresponding concentration parameters by a vector κ. In the EP and VP models, the observer has access to all N pairs of measurements, but in the SA model only to K of them (or N when N ≤ K). The statistical structure of the task-relevant variables is shown in Fig. S1C. In all models with noisy encoding, the observer’s decision process is modeled as Bayesian inference. The Bayesian decision rule is to report the location L for which the posterior probability of change occurrence is largest, which is equivalent to reporting the location at which a decision variable computed from the measurements $x_L$ and $y_L$ and the associated concentration parameters is largest (SI Text).
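A Bayesian decision rule of this kind can be sketched as follows. The expression used here is derived for this illustration under the assumptions stated in the text (independent Von Mises noise on both measurements, uniform priors over stimuli, change magnitude, and change location); it is not copied from the source, whose exact expression is given in its SI Text and may differ in notation. Under these assumptions, the posterior probability of a change at location L is monotonic in $d_L = I_0(\kappa_{x,L})\, I_0(\kappa_{y,L}) / I_0\big(\sqrt{\kappa_{x,L}^2 + \kappa_{y,L}^2 + 2\kappa_{x,L}\kappa_{y,L}\cos(y_L - x_L)}\big)$. Allowing separate concentration parameters for the two displays is a generalization assumed here; with a single κ per item, both are set equal.

```python
import numpy as np
from scipy.special import i0e


def log_i0(k):
    """log I0(k), computed stably from the exponentially scaled Bessel function."""
    return np.log(i0e(k)) + np.abs(k)


def log_decision_variable(x, y, kappa_x, kappa_y):
    """Log of d_L for every location, given measurements from both displays."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    kappa_x, kappa_y = np.asarray(kappa_x, float), np.asarray(kappa_y, float)
    # Concentration of the product of the two Von Mises likelihoods over a shared stimulus.
    k_sq = kappa_x**2 + kappa_y**2 + 2.0 * kappa_x * kappa_y * np.cos(y - x)
    k_c = np.sqrt(np.maximum(k_sq, 0.0))  # guard against tiny negative rounding errors
    return log_i0(kappa_x) + log_i0(kappa_y) - log_i0(k_c)


def localize_change(x, y, kappa_x, kappa_y):
    """Report the location with the largest decision variable."""
    return int(np.argmax(log_decision_variable(x, y, kappa_x, kappa_y)))


# Hypothetical trial: three items, the second one changes by about 1.5 rad.
x = np.array([0.10, 1.00, -2.00])
y = np.array([0.12, 2.50, -2.05])
kappa = np.array([10.0, 10.0, 10.0])
print("reported location:", localize_change(x, y, kappa, kappa))
```

Because each item's decision variable depends on its own concentration parameters, a large measured difference at a poorly encoded location is discounted relative to the same difference at a well-encoded location.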
Psychophysics and Model Comparison
Experiment 1: Delayed Estimation of Color.
To compare the models, we first performed a delayed-estimation experiment (7). Observers briefly viewed and memorized the colors of N discs (N = 1, … , 8) and reported the color of a randomly chosen target disc by scrolling through all possible colors (Fig. 2A). Following other authors (9), we fitted to the observer’s estimation errors a mixture of a Von Mises distribution and a uniform distribution (see Fig. S2 for an example). We refer to the mixture proportion of the Von Mises component as w and to its circular SD as CSD. Note that this fitting procedure does not constitute a model, but is simply a way of summarizing the data into two descriptive statistics. It would be premature to interpret w as the probability that an item was encoded and 1 − w as the guessing rate, as suggested in ref. 9, because such an interpretation is meaningful only if the true error distribution is a uniform+Von Mises mixture, which we argue here is not the case. We verified that observers did not report colors of nontarget discs (Fig. S3; a different response modality, namely clicking on a color wheel, did produce nontarget reports). For each model, we generated synthetic datasets of the same size as the subject datasets, using the maximum-likelihood estimates of the parameters obtained from the subject data (Table S1), and then fitted the uniform+Von Mises mixture to these synthetic data. The resulting model predictions, averaged over subjects, are shown in Fig. 4A (for individual-subject fits, see Fig. S4). Consistent with previous results (9), we find a significant main effect of set size on both w [one-way repeated-measures ANOVA; F(7, 84) = 42.1, P < 0.001] and CSD [F(7, 84) = 4.60, P < 0.001]. This result rules out both the EP model, which predicts w close to 1 at each set size (the slight deviation is an artifact of the limited number of trials), and the IL model, which predicts that CSD is constant.
The SA and VP models explain the data better, with the VP model having the lowest root mean-square (RMS) error (Fig. 4A). In the SA model, capacity K equals 4.00 ± 0.34 (mean ± SEM), in line with earlier work (9). In the VP model, the power α equals 1.33 ± 0.14 (Fig. S5A).
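The descriptive statistics w and CSD can be obtained with a maximum-likelihood fit along the following lines. This is a sketch, not the fitting code used in the study: the optimizer, the parameter transformations, the starting points, and the function names are choices made here, and CSD is computed with the standard definition of circular SD, $\sqrt{-2 \ln (I_1(\kappa)/I_0(\kappa))}$, which is assumed to match the source.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, i0e, i1e


def _unpack(params):
    """Map unconstrained parameters to a mixture weight w and concentration kappa."""
    w = expit(np.clip(params[0], -10.0, 10.0))
    kappa = np.exp(np.clip(params[1], -10.0, 10.0))
    return w, kappa


def mixture_neg_log_lik(params, err):
    """Negative log likelihood of a uniform + Von Mises mixture for errors in radians."""
    w, kappa = _unpack(params)
    vm = np.exp(kappa * (np.cos(err) - 1.0)) / (2.0 * np.pi * i0e(kappa))
    return -np.sum(np.log(w * vm + (1.0 - w) / (2.0 * np.pi)))


def fit_mixture(err):
    """Return (w, kappa, CSD) from the best of several Nelder-Mead runs."""
    best = None
    for w0 in (0.5, 0.9):
        for k0 in (1.0, 10.0):
            x0 = [np.log(w0 / (1.0 - w0)), np.log(k0)]
            res = minimize(mixture_neg_log_lik, x0, args=(err,), method="Nelder-Mead")
            if best is None or res.fun < best.fun:
                best = res
    w, kappa = _unpack(best.x)
    csd = np.sqrt(-2.0 * np.log(i1e(kappa) / i0e(kappa)))  # circular SD of the Von Mises part
    return w, kappa, csd


# Synthetic check with hypothetical values: 70% Von Mises (kappa = 8), 30% uniform.
rng = np.random.default_rng(0)
n = 2000
from_vm = rng.random(n) < 0.7
err = np.where(from_vm, rng.vonmises(0.0, 8.0, size=n), rng.uniform(-np.pi, np.pi, size=n))
w_hat, kappa_hat, csd_hat = fit_mixture(err)
print(f"w = {w_hat:.3f}, kappa = {kappa_hat:.2f}, CSD = {csd_hat:.3f} rad")
```

As the text notes, w and CSD obtained this way summarize the error distribution; they are not estimates of an encoding probability unless the true distribution really is such a mixture.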
Fig. 4.
(A and B) Parameters w and CSD obtained from fitting a mixture of a uniform and a Von Mises distribution to the estimation errors in experiment 1 (A) and experiment 2 (B). Here and elsewhere, circles and error bars represent data (mean and SEM) and shaded areas model predictions (SEM). Root mean-square error (RMSE) was computed across all set sizes and all subjects. The lowest RMSE in each comparison is indicated in boldface type.
There is a clear intuition for why the VP model, but not the EP model, accounts for the decrease of w with set size. Because of trial-to-trial variability in precision, the target item sometimes, by chance, receives so little resource that the estimate on that trial is grouped into the uniform distribution, even though it was not a “real” guess. When set size is larger, mean precision is lower, resulting in more probability mass near zero precision (Fig. 3B) and a higher apparent guessing rate. Thus, it is not necessary to assume discrete resources to explain the decrease of w with set size.
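This intuition can be checked numerically by asking how much probability mass the gamma distribution over precision places below some small threshold at each set size. The sketch below assumes the power law for mean precision described in the Theory section; the threshold and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.stats import gamma


def prob_low_precision(N, J1_bar, alpha, tau, J_thresh):
    """P(J < J_thresh) when J follows a gamma distribution with mean J1_bar * N**(-alpha) and scale tau."""
    N = np.asarray(N, dtype=float)
    J_bar = J1_bar * N ** (-alpha)
    return gamma.cdf(J_thresh, a=J_bar / tau, scale=tau)


set_sizes = np.array([1, 2, 4, 8])
p_low = prob_low_precision(set_sizes, J1_bar=60.0, alpha=1.33, tau=15.0, J_thresh=1.0)
for n_items, p in zip(set_sizes, p_low):
    print(f"N = {n_items}: P(J < 1) = {p:.4f}")
```

With the scale parameter held fixed, lowering the mean lowers the shape parameter of the gamma distribution, so the mass near zero precision grows monotonically with set size.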
To further determine which model best describes the data, we performed Bayesian model comparison (19), a principled method that automatically corrects for the number of free parameters (SI Text). We found that the log likelihood of the VP model exceeds those of the IL, SA, and EP models by respectively 15.6 ± 3.1, 12.0 ± 3.1, and 40.3 ± 6.3 points (Fig. 5A). A log-likelihood difference (or log Bayes factor) of 12.0 means that the data are $e^{12.0}$ times more probable under one model than under another. At the level of individual subjects (Fig. S6A), we find that the VP model is most likely for 12 of 13 subjects, whereas SA is slightly better for one. Consistent results were obtained using the Bayesian information criterion (20) (Fig. S6B).
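The arithmetic behind these quantities is simple enough to spell out. The sketch below converts a log-likelihood difference into a ratio and computes the Bayesian information criterion in its common form, BIC = −2 ln L + k ln n; the sign and scaling convention used in the source's SI Text may differ, and the function names and example inputs are hypothetical.

```python
import math


def bayes_factor_from_log_difference(delta_log_lik):
    """Ratio of data probabilities implied by a log-likelihood (log Bayes factor) difference."""
    return math.exp(delta_log_lik)


def bic(max_log_lik, n_params, n_trials):
    """Bayesian information criterion in its common form; lower values indicate a better model."""
    return -2.0 * max_log_lik + n_params * math.log(n_trials)


# A difference of 12.0 log-likelihood points corresponds to a ratio of roughly 1.6e5.
print(f"{bayes_factor_from_log_difference(12.0):.3g}")

# Hypothetical comparison of two models fitted to the same 864 trials.
print(bic(max_log_lik=-1000.0, n_params=3, n_trials=864))
print(bic(max_log_lik=-1012.0, n_params=2, n_trials=864))
```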
Fig. 5.
More delayed-estimation results. (A) Model log likelihoods relative to the VP model in experiment 1 (colors). (B) Model predictions for the residual remaining after fitting a mixture of a uniform and a Von Mises distribution to the predicted error distribution, averaged over set sizes and subjects. (C) Blue: Residual after fitting a mixture of a uniform and a Von Mises distribution to the empirical error distribution. Black: Running average over a 0.28-rad window. (D–F) Same as A–C, but for experiment 2 (orientation).
Residual in Delayed Estimation.
The VP model makes an intuitive prediction distinct from the other models. So far, we have fitted the data with a uniform+Von Mises mixture to obtain two descriptive statistics, w and CSD. The VP model postulates variability in precision, causing its predicted error distribution to be a mixture of a large number of Von Mises distributions, each with a different J. Such a mixture cannot be fitted perfectly with a uniform+Von Mises mixture and will therefore leave a residual. Using the synthetic data described above, we find that the residual predicted by the VP model, but not by other models, has a central peak and negative side lobes (Fig. 5B). The subject data show a residual of exactly this shape (Fig. 5C and Fig. S2). This result constitutes additional evidence for variability in precision.
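The residual analysis can be illustrated on a predicted density rather than on sampled data. In the sketch below, a VP error density is built by averaging Von Mises densities over gamma-distributed precision, a uniform+Von Mises mixture is fitted to it by minimizing cross-entropy (the large-sample limit of a maximum-likelihood fit), and the fitted mixture is subtracted. The discretization, the optimizer, the absence of response noise, and all names and parameter values are assumptions made for this illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, i0e, i1e
from scipy.stats import gamma


def von_mises_pdf(err, kappa):
    """Zero-mean Von Mises density in an overflow-safe form."""
    return np.exp(kappa * (np.cos(err) - 1.0)) / (2.0 * np.pi * i0e(kappa))


def vp_density(err, J_bar, tau, n_bins=400):
    """VP error density: Von Mises densities averaged over gamma-distributed precision."""
    kappa_grid = np.concatenate([[0.0], np.logspace(-4, 4, 2000)])
    J_grid = kappa_grid * i1e(kappa_grid) / i0e(kappa_grid)
    q = (np.arange(n_bins) + 0.5) / n_bins
    J = gamma.ppf(q, a=J_bar / tau, scale=tau)
    kappa = np.interp(J, J_grid, kappa_grid)      # approximate inverse of J(kappa)
    return von_mises_pdf(err[:, None], kappa[None, :]).mean(axis=1)


def mixture_residual(err, target):
    """Fit w * VonMises + (1 - w) * uniform to a target density and return target minus fit."""
    dx = err[1] - err[0]

    def fitted(params):
        w = expit(np.clip(params[0], -10.0, 10.0))
        kappa = np.exp(np.clip(params[1], -10.0, 10.0))
        return w * von_mises_pdf(err, kappa) + (1.0 - w) / (2.0 * np.pi)

    def cross_entropy(params):
        return -np.sum(target * np.log(fitted(params))) * dx

    res = minimize(cross_entropy, x0=[1.0, np.log(5.0)], method="Nelder-Mead")
    return target - fitted(res.x)


err = np.linspace(-np.pi, np.pi, 720, endpoint=False)
residual = mixture_residual(err, vp_density(err, J_bar=5.0, tau=5.0))
print("residual at zero error:", residual[len(err) // 2])
print("most negative residual:", residual.min())
```

Plotting `residual` against `err` shows the shape that remains after the mixture fit, which can be compared with the pattern reported in Fig. 5B.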
Experiment 2: Delayed Estimation of Orientation.
To investigate the generality of these results, we replicated the experiment using orientation (Fig. 2B). The data show a significant main effect of set size on both w [one-way repeated-measures ANOVA, F(7, 35) = 32.4, P < 0.001] and CSD [F(7, 35) = 3.28, P < 0.01] (Fig. 4B and Fig. S7), again ruling out the IL and EP models. The SA and VP models explain the data better, with the VP model having the lowest RMS error (Fig. 4B). In the SA model, capacity K = 3.33 ± 0.56. In the VP model, the power α = 1.41 ± 0.15 (Fig. S5A). Bayesian model comparison shows that the VP model outperforms the IL, SA, and EP models by 103 ± 15, 52 ± 11, and 142 ± 30 log-likelihood points, respectively (Fig. 5D). The VP model is most likely for all six subjects (Fig. S6C). Results were confirmed using the Bayesian information criterion (Fig. S6D). The residual after subtracting the uniform+Von Mises mixture has the shape predicted by the VP model (Fig. 5 E and F).
Experiments 3 and 4: Change Localization.
To examine whether the VP model can account for human behavior in other VSTM tasks, we conducted two experiments in which subjects localized a change in the color or orientation of a stimulus (Fig. 2 C and D). Set size had a significant main effect on accuracy both for color [one-way repeated-measures ANOVA, F(3, 18) = 256.6, P < 0.001] and for orientation [F(3, 30) = 356.5, P < 0.001] (Figs. S8A and S9A). Magnitude of change had a significant effect on accuracy both for color [one-way repeated-measures ANOVA, F(8, 48) = 114.3, P < 0.001] and for orientation [F(8, 80) = 238.5, P < 0.001] (Fig. 6). Judged by RMS error, the VP model provides the best fits to the psychometric curves (Fig. 6). Individual-subject fits are shown in Figs. S8 and S9. In the SA model, capacity K = 2.86 ± 0.14 for color and 4.09 ± 0.39 for orientation. In the VP model, the power α = 0.974 ± 0.090 for color and 0.993 ± 0.075 for orientation (Fig. S5B). In Bayesian model comparison, the VP model outperforms the IL, SA, and EP models both for color (by 143 ± 11, 10.1 ± 2.6, and 15.0 ± 2.8 log-likelihood points) and for orientation (by 145 ± 11, 11.9 ± 2.6, and 17.3 ± 2.8 points) (Fig. 7 A and C). In both experiments, the VP model outperforms all other models for every individual subject (Fig. S10).
Fig. 6.
(A and B) Proportion correct as a function of change magnitude at each set size in experiment 3 (A) and experiment 4 (B).
Fig. 7.
More change localization results. (A) Model log likelihoods relative to the VP model in experiment 3 (colors). (B) Apparent guessing rate as a function of set size in experiment 3. (C and D) Same as A and B, but for experiment 4 (orientation).
Apparent Guessing in Change Localization.
To further distinguish the models, we computed an apparent guessing rate analogous to 1 − w in delayed estimation. We did so by fitting, at each set size separately, a Bayesian-observer model with equal, fixed precision and a guessing rate to both the subject data and the model-generated synthetic data. The EP model predicts an apparent guessing rate of zero. We found that subjects’ apparent guessing rate was significantly higher than zero at all set sizes [t(6) > 4.82, P < 0.002 and t(10) > 4.64, P < 0.001 for experiments 3 and 4, respectively] and increased with set size [F(3, 18) = 85.8, P < 0.001 and F(3, 30) = 26.6, P < 0.001, respectively]. The VP model reproduces the increase of apparent guessing rate with set size more accurately than the SA model (Fig. 7 B and D). As in delayed estimation, the apparent guessing rate predicted by the VP model is nonzero because items are sometimes encoded with very low precision, and this happens more frequently when set size is large.
Discussion
Do Slots Exist?
Our results suggest that VSTM limitations should be conceptualized in terms of quality of encoding rather than number of items. Earlier work proposing continuous-resource models in the study of VSTM (6–8) did not model variability in resource across items and trials. Here, we have shown that when such variability is not modeled, as in the EP model, human responses in delayed estimation and change localization cannot be accounted for. By contrast, the VP model accounts for all presented data, including the existence of apparent guessing and its increase with set size, which have so far been attributed to an item limit. Thus, the VP model poses a serious challenge to the notion of slots in VSTM and might reconcile an apparent capacity of about four items with the subjective sense that we possess some memory of an entire scene: Items are never discarded completely, but their encoding quality could by chance be very low.
Most neuroimaging and EEG studies of VSTM limitations consider only the slots framework (5, 21–24) (but see refs. 25 and 26). Without testing alternative models of VSTM, these studies cannot provide evidence for the existence of slots. The VP model offers a viable alternative, and we expect that quantities in the VP model will also correlate with neural variables.
We do not expect the VP model to end the debate about the nature of VSTM limitations. Variants of both the VP model and previous models can be conceived and should be tested. Possible hybrids between the SA and VP models include SA with trial-to-trial variability in capacity K (27, 28) and VP augmented with an item limit (continuous resource in discrete slots). We expect, however, that any alternative model will have to explicitly model variability in resource across items and trials to account for the data.
Is Resource Discrete?
The SA model asserts not only that VSTM consists of slots, but also that resource comes in discrete chunks. The latter notion is difficult to reconcile with the fact that sensory noise is a graded rather than a discrete quantity. For example, stimulus contrast affects sensory noise and therefore encoding precision in a graded manner. Such continuous modulation is inconsistent with the allocation of “fixed-size, prepackaged boxes” (9) of resource, because those boxes allow for only a small, discrete number of noise levels. The VP model does not have this problem, because precision is a continuous quantity and is modulated by contrast in a continuous manner.
Neural Basis of VSTM Resource.
Previous models have not specified a neural correlate of VSTM resource. Here, we propose to identify VSTM resource with the gain (mean amplitude) of the neural population pattern encoding a stimulus. Several arguments support such an identification. First, for Poisson-like populations, gain is proportional to encoding precision (29). Moreover, the energy cost associated with high gain (30) could explain why working memory is limited: As set size grows larger, the energy cost gradually outweighs the benefit of encoding items with high precision. Finally, gain in visual cortical areas is modulated by attention (31–33), and attentional limitations are closely related to working memory ones (8, 34).
Neural Basis of Variability in Precision.
Although our results point to variability in encoding precision as key in describing VSTM limitations, the VP model does not specify the origin of this variability. Variations in attention and alertness are likely contributors, but stimulus-related precision differences [such as cardinal orientations being encoded with higher precision (35)] might also play a role. There is evidence that microsaccades are predictive of variability in precision during change detection (36). Variability in precision provides a behavioral counterpart to recent physiological findings of trial-to-trial and item-to-item fluctuations in attentional gain (16, 17). A consequence of gain variability is that the neural representation r of a stimulus follows a doubly stochastic process,
$$p(\mathbf{r} \mid s) = \int p(\mathbf{r} \mid s, g)\, p(g)\, dg.$$
The spike count distribution is determined by gain g, which itself is stochastic. Supporting this notion, doubly stochastic processes can well describe spike counts in lateral intraparietal cortex (LIP) (13), visual cortex (15), and other areas (14). Thus, the VP model is broadly consistent with emerging physiological findings.
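A doubly stochastic spike count process of this kind is easy to simulate. In the sketch below, gain is drawn from a gamma distribution on every trial and spike counts are then drawn from a Poisson distribution whose rate is gain times a tuning value. The gamma distribution for gain, the two-neuron tuning vector, and all parameter values are assumptions made for illustration; the source does not specify them.

```python
import numpy as np


def simulate_spike_counts(mean_gain, tau, tuning, n_trials, rng):
    """Poisson spike counts whose rate is scaled by a trial-to-trial random gain."""
    tuning = np.asarray(tuning, dtype=float)
    # Gamma-distributed gain with mean mean_gain and scale tau (an assumption of this sketch).
    g = rng.gamma(shape=mean_gain / tau, scale=tau, size=n_trials)
    rates = g[:, None] * tuning[None, :]
    return rng.poisson(rates)


def fano_factor(counts):
    """Variance-to-mean ratio of spike counts for each neuron."""
    return counts.var(axis=0) / counts.mean(axis=0)


rng = np.random.default_rng(1)
tuning = np.array([1.0, 2.0])
variable_gain = simulate_spike_counts(10.0, 5.0, tuning, 20000, rng)
fixed_gain = rng.poisson(10.0 * tuning, size=(20000, 2))
print("Fano factor, variable gain:", fano_factor(variable_gain))
print("Fano factor, fixed gain:   ", fano_factor(fixed_gain))
```

With fixed gain the variance-to-mean ratio stays near 1, as expected for a Poisson process, whereas gain variability adds variance on top of the Poisson noise and pushes the ratio above 1.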
Decrease of Mean Precision with Set Size.
The VP model predicts that mean precision decreases gradually with increasing set size and, if encoding precision can be identified with neural gain, that gain does as well. Extant physiological evidence is consistent with this prediction. Neuronal responses in LIP, an area associated with spatial attention, are lower to the onset of four than to that of two choice targets (37). In the superior colliculus, an area associated with covert attention, firing rates also decrease with the number of choice targets (38). Similar measurements in areas encoding short-term memories of visual stimuli remain to be made.
In both change localization experiments, we found that the mean precision decreases with set size approximately as 1/N, which would be predicted by models in which the total amount of resource is, on average, independent of set size. However, in both delayed-estimation experiments, we found a steeper decline. This result shows that the decrease of mean precision with set size is task-dependent and that the trial-averaged total amount of resource might depend on set size. Perhaps the precise relation between mean precision and set size is set by a trade-off between energy expenditure and performance. In support of this speculation, a decrease of mean precision with set size is also observed in an attentionally demanding task without a memory component (39).
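The link between the power-law exponent and the total amount of resource can be made explicit with a short worked calculation. It assumes that mean precision follows the power law $\bar{J}(N) = \bar{J}_1 N^{-\alpha}$ and that total resource is the sum of mean precision over the N items; the symbol $J_{\text{total}}$ is introduced here for illustration only.

```latex
\begin{align*}
J_{\text{total}}(N) &= N\,\bar{J}(N) = \bar{J}_1\, N^{1-\alpha} \\
\alpha = 1:\quad & J_{\text{total}}(N) = \bar{J}_1 \quad \text{(independent of set size)} \\
\alpha = 1.33:\quad & \frac{J_{\text{total}}(8)}{J_{\text{total}}(1)} = 8^{-0.33} \approx 0.50 \\
\alpha = 1.41:\quad & \frac{J_{\text{total}}(8)}{J_{\text{total}}(1)} = 8^{-0.41} \approx 0.43
\end{align*}
```

Under these assumptions, the exponents near 1 found in change localization imply a roughly constant total, whereas the exponents found in delayed estimation imply that the trial-averaged total at set size 8 is about half of that at set size 1.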
Neural Decoding.
Nonhuman primate studies have begun to investigate set size effects in VSTM (36, 40–42). Advances in simultaneous recordings from large populations of single neurons, as well as in the decoding of voxel patterns in functional MRI, might soon allow for model comparison more powerful than psychophysics allows. For instance, in delayed estimation, one could conceivably obtain estimates $\mathbf{x} = (x_1, \ldots, x_N)$ of the stimuli $\mathbf{s} = (s_1, \ldots, s_N)$ at all N locations simultaneously. The predictions for $p(\mathbf{x} \mid \mathbf{s})$ made by the SA and VP models can then be compared directly. Altogether, the VP model could help to consolidate the perspectives of cognitive psychology and systems neuroscience on VSTM limitations.
Methods
Detailed experimental methods can be found in SI Text. In experiment 1 (Fig. 2A), observers memorized the colors of N discs (N = 1, … , 8) and reported the color of a randomly chosen target disc. Data of one subject were excluded, because her estimated value of w at set size 1 was extremely low (w = 0.72, compared with w > 0.97 for every other subject). A trial sequence consisted of the presentation of a fixation cross, the stimulus array, a delay period, and a response screen. Subjects responded by scrolling through all possible colors. Colors were drawn independently from a uniform distribution on a color wheel. Fourteen subjects each completed 864 trials in the scrolling condition. Experiment 2 (Fig. 2B) was identical except that stimuli were oriented Gabors. Set size was 2, 4, 6, or 8. Six subjects each completed 2,560 trials. In experiment 3 (Fig. 2C), observers were presented briefly with two displays containing N colored discs each (N = 2, 4, 6, or 8). The trial sequence consisted of the presentation of a fixation cross, the first stimulus array, a delay period, the second stimulus array, in which exactly one stimulus had changed color, and a response screen. Subjects clicked on the location of the stimulus that had changed. Colors in the first array and the magnitude of the change were drawn independently from a uniform distribution on a color wheel. Seven subjects each completed 1,920 trials. Experiment 4 (Fig. 2D) was identical except that stimuli were oriented ellipses. Eleven subjects each completed 1,920 trials.
Data Analysis.
We used maximum-likelihood fitting and Bayesian model comparison. We verified numerical robustness (Fig. S11). All methods are discussed in SI Text.
Acknowledgments
W.J.M. is supported by Award R01EY020958 from the National Eye Institute. R.v.d.B. was supported by the Netherlands Organisation for Scientific Research.
Footnotes
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1117465109/-/DCSupplemental.
References
- 1. Chamberlin TC. The method of multiple working hypotheses. J Geol. 1897;5:837–848.
- 2. Miller GA. The magical number seven plus or minus two: Some limits on our capacity for processing information. Psychol Rev. 1956;63:81–97.
- 3. Cowan N. The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behav Brain Sci. 2001;24:87–114, discussion 114–185. doi: 10.1017/s0140525x01003922.
- 4. Pashler H. Familiarity and visual change detection. Percept Psychophys. 1988;44:369–378. doi: 10.3758/bf03210419.
- 5. Fukuda K, Awh E, Vogel EK. Discrete capacity limits in visual working memory. Curr Opin Neurobiol. 2010;20:177–182. doi: 10.1016/j.conb.2010.03.005.
- 6. Palmer J. Attentional limits on the perception and memory of visual information. J Exp Psychol Hum Percept Perform. 1990;16:332–350. doi: 10.1037//0096-1523.16.2.332.
- 7. Wilken P, Ma WJ. A detection theory account of change detection. J Vis. 2004;4:1120–1135. doi: 10.1167/4.12.11.
- 8. Bays PM, Husain M. Dynamic shifts of limited working memory resources in human vision. Science. 2008;321:851–854. doi: 10.1126/science.1158023.
- 9. Zhang W, Luck SJ. Discrete fixed-resolution representations in visual working memory. Nature. 2008;453:233–235. doi: 10.1038/nature06860.
- 10. Shaw ML. Identifying attentional and decision-making components in information processing. In: Nickerson RS, editor. Attention and Performance. Vol VIII. Hillsdale, NJ: Erlbaum; 1980. pp. 277–296.
- 11. Seung HS, Sompolinsky H. Simple models for reading neuronal population codes. Proc Natl Acad Sci USA. 1993;90:10749–10753. doi: 10.1073/pnas.90.22.10749.
- 12. Knill DC, Pouget A. The Bayesian brain: The role of uncertainty in neural coding and computation. Trends Neurosci. 2004;27:712–719. doi: 10.1016/j.tins.2004.10.007.
- 13. Churchland AK, et al. Variance as a signature of neural computations during decision making. Neuron. 2011;69:818–831. doi: 10.1016/j.neuron.2010.12.037.
- 14. Churchland MM, et al. Stimulus onset quenches neural variability: A widespread cortical phenomenon. Nat Neurosci. 2010;13:369–378. doi: 10.1038/nn.2501.
- 15. Goris RLT, Simoncelli EP, Movshon JA. Using a doubly-stochastic model to analyze neuronal activity in the visual cortex. Cosyne Abstracts (Salt Lake City). 2012.
- 16. Cohen MR, Maunsell JHR. A neuronal population measure of attention predicts behavioral performance on individual trials. J Neurosci. 2010;30:15241–15253. doi: 10.1523/JNEUROSCI.2171-10.2010.
- 17. Nienborg H, Cumming BG. Decision-related activity in sensory neurons reflects more than a neuron’s causal effect. Nature. 2009;459:89–92. doi: 10.1038/nature07821.
- 18. Cover TM, Thomas JA. Elements of Information Theory. New York: John Wiley & Sons; 1991.
- 19. MacKay DJ. Information Theory, Inference, and Learning Algorithms. Cambridge, UK: Cambridge Univ Press; 2003.
- 20. Schwarz GE. Estimating the dimension of a model. Ann Stat. 1978;6:461–464.
- 21. Anderson DE, Vogel EK, Awh E. Precision in visual working memory reaches a stable plateau when individual item limits are exceeded. J Neurosci. 2011;31:1128–1138. doi: 10.1523/JNEUROSCI.4125-10.2011. [Retracted]
- 22. Todd JJ, Marois R. Capacity limit of visual short-term memory in human posterior parietal cortex. Nature. 2004;428:751–754. doi: 10.1038/nature02466.
- 23. Vogel EK, Machizawa MG. Neural activity predicts individual differences in visual working memory capacity. Nature. 2004;428:748–751. doi: 10.1038/nature02447.
- 24. Sauseng P, et al. Brain oscillatory substrates of visual short-term memory capacity. Curr Biol. 2009;19:1846–1852. doi: 10.1016/j.cub.2009.08.062.
- 25. Magen H, Emmanouil T-A, McMains SA, Kastner S, Treisman A. Attentional demands predict short-term memory load response in posterior parietal cortex. Neuropsychologia. 2009;47:1790–1798. doi: 10.1016/j.neuropsychologia.2009.02.015.
- 26. Xu Y, Chun MM. Dissociable neural mechanisms supporting visual short-term memory for objects. Nature. 2006;440:91–95. doi: 10.1038/nature04262.
- 27. Dyrholm M, Kyllingsbaek S, Espeseth T, Bundesen C. Generalizing parametric models by introducing trial-by-trial parameter variability: The case of TVA. J Math Psych. 2011;55:416–429.
- 28. Sims CR, Jacobs RA, Knill DC. An ideal-observer analysis of visual working memory. Psychol Rev. 2012; in press. doi: 10.1037/a0029856.
- 29. Ma WJ, Beck JM, Latham PE, Pouget A. Bayesian inference with probabilistic population codes. Nat Neurosci. 2006;9:1432–1438. doi: 10.1038/nn1790.
- 30. Lennie P. The cost of cortical computation. Curr Biol. 2003;13:493–497. doi: 10.1016/s0960-9822(03)00135-0.
- 31. McAdams CJ, Maunsell JH. Effects of attention on the reliability of individual neurons in monkey visual cortex. Neuron. 1999;23:765–773. doi: 10.1016/s0896-6273(01)80034-9.
- 32. Treue S, Martínez Trujillo JC. Feature-based attention influences motion processing gain in macaque visual cortex. Nature. 1999;399:575–579. doi: 10.1038/21176.
- 33. Salinas E, Sejnowski TJ. Gain modulation in the central nervous system: Where behavior, neurophysiology, and computation meet. Neuroscientist. 2001;7:430–440. doi: 10.1177/107385840100700512.
- 34. Awh E, Jonides J. Overlapping mechanisms of attention and spatial working memory. Trends Cogn Sci. 2001;5:119–126. doi: 10.1016/s1364-6613(00)01593-x.
- 35. Girshick AR, Landy MS, Simoncelli EP. Cardinal rules: Visual orientation perception reflects knowledge of environmental statistics. Nat Neurosci. 2011;14:926–932. doi: 10.1038/nn.2831.
- 36. Lara AH, Wallis JD. Capacity and precision in an animal model of short-term memory. J Vis. 2012;12:1–12. doi: 10.1167/12.3.13.
- 37. Churchland AK, Kiani R, Shadlen MN. Decision-making with multiple alternatives. Nat Neurosci. 2008;11:693–702. doi: 10.1038/nn.2123.
- 38. Basso MA, Wurtz RH. Modulation of neuronal activity in superior colliculus by changes in target probability. J Neurosci. 1998;18:7519–7534. doi: 10.1523/JNEUROSCI.18-18-07519.1998.
- 39. Mazyar H, Van den Berg R, Ma WJ. Does precision decrease with set size? J Vis. 2012; in press.
- 40. Heyselaar E, Johnston K, Pare M. A change detection approach to study visual working memory of the macaque monkey. J Vis. 2011;11(3):11, 1–10. doi: 10.1167/11.3.11.
- 41. Elmore LC, et al. Visual short-term memory compared in rhesus monkeys and humans. Curr Biol. 2011;21:975–979. doi: 10.1016/j.cub.2011.04.031.
- 42. Buschman TJ, Siegel M, Roy JE, Miller EK. Neural substrates of cognitive capacity limitations. Proc Natl Acad Sci USA. 2011;108:11252–11255. doi: 10.1073/pnas.1104666108.