Author manuscript; available in PMC: 2018 Aug 1.
Published in final edited form as: Dev Psychol. 2017 Jun 19;53(8):1474–1493. doi: 10.1037/dev0000358

The development of real-time stability supports visual working memory performance: Young children’s feature binding can be improved through perceptual structure

Vanessa R Simmering 1, Chelsey M Wood 2
PMCID: PMC5578745  NIHMSID: NIHMS880546  PMID: 28627904

Abstract

Working memory is a basic cognitive process that predicts higher-level skills. A central question in theories of working memory development is the generality of the mechanisms proposed to explain improvements in performance. Prior theories have been closely tied to particular tasks and/or age groups, limiting their generalizability. The cognitive dynamics theory of visual working memory development has been proposed to overcome this limitation. From this perspective, developmental improvements arise through the coordination of cognitive processes to meet demands of different behavioral tasks. This notion is described as real-time stability, and can be probed through experiments that assess how changing task demands impact children’s performance. The current studies test this account by probing visual working memory for colors and shapes in a change detection task that compares detection of changes to new features versus swaps in color-shape binding. In Experiment 1, 3- to 4-year-old children showed impairments specific to binding swaps, as predicted by decreased real-time stability early in development; 5- to 6-year-old children showed a slight advantage on binding swaps, but 7- to 8-year-old children and adults showed no difference across trial types. Experiment 2 tested the proposed explanation of young children’s binding impairment through added perceptual structure, which supported the stability and precision of feature localization in memory—a process key to detecting binding swaps. This additional structure improved young children’s binding swap detection, but not new-feature detection or adults’ performance. These results provide further evidence for the cognitive dynamics and real-time stability explanation of visual working memory development.

Keywords: working memory development, visual feature binding, computational model


Working memory is a fundamental cognitive process that underlies performance on a range of tasks, from following a teacher’s multi-step instructions to planning a route through a grocery store. This ability to hold past information in mind to use flexibly in service of behavior correlates with scholastic achievement (e.g., Cowan et al., 2005; Pickering & Gathercole, 2004; Raghubar, Barnes, & Hecht, 2010). Longitudinal studies show that infants’ visual memory predicts their higher-level cognitive skills up to ten years later (e.g., Rose, Feldman, & Jankowski, 2012), suggesting that the foundations of adaptive cognitive functioning may emerge very early in development. However, the processes that are shared among different measures of memory across tasks and age groups remain unknown (Simmering, 2016).

Working memory development has been addressed somewhat independently across domains, with different explanations of improvement on simple tasks (i.e., requiring only maintenance) or complex tasks (i.e., requiring manipulation or task switching). In complex tasks like reading or counting span, which require processing new information while maintaining prior information, developmental improvements likely arise through increases in processing speed, rehearsal, and/or resistance to interference. However, simple tasks that rely less on such processes still show parallel developmental improvements, and relate reliably to higher cognition (e.g., Cowan, 2013). Thus it is critical to understand performance and development even in simple memory tasks.

As a step in this direction, Simmering (2016) proposed the cognitive dynamics theory of visual working memory (VWM) development. Although this theory was developed to address increases in VWM capacity, the mechanism was proposed to be a general account of development that can be applied across domains (see General Discussion). In particular, Simmering posited that increases in real-time stability could explain developmental improvements across tasks. In contrast to long-term notions of stability, in which earlier behavior predicts outcomes later in development, real-time stability characterizes how effectively the memory system functions in response to different demands of behavioral tasks. Through implementation in a computational model, this developmental mechanism has quantitatively captured performance on VWM tasks from infancy (5 to 13 months; Perone, Simmering, & Spencer, 2011; Perone & Spencer, 2013b, 2014), early childhood (3 to 7 years; Simmering, 2016; Simmering, Miller, & Bohache, 2015), and adulthood (e.g., Johnson, Simmering, & Buss, 2014).

Here we test the real-time stability hypothesis further, first by showing that modifying the behavioral task can reveal instability early in development, and second by augmenting stability in the moment of the task. Specifically, we tested children’s and adults’ memory for multi-feature visual stimuli (color-shape conjunctions), with particular attention to how memory is probed. These studies are not intended to contrast theories of multi-feature representation in VWM, but rather leverage this paradigm to test a proposed developmental mechanism. In the sections that follow, we describe theories of VWM development from infancy and middle childhood. We then briefly review accounts of feature binding in VWM, which have focused primarily on adults. Lastly, we combine theoretical frameworks from these domains to generate and test two specific predictions.

Theories of Visual Working Memory Development

Research on VWM development can be divided into two bodies of work: one addressing infancy (see Reznick, 2009, for review) and one focusing on middle childhood and adults (see Cowan, 2016, for review). We briefly review the dominant findings and theories in these two areas in turn. Infant studies use looking and reaching paradigms due to infants’ limited behavioral abilities. Across these methods a general pattern has emerged (see Reznick, 2009; Rose, Feldman, & Jankowski, 2004, for reviews): over development, infants form memory representations more quickly, maintain representations across longer delays, use representations more robustly in recognition or search, and represent more complex information. Theories explaining these changes focus on development of neural circuits supporting VWM (e.g., prefrontal cortex, Diamond, 1990; Ross-Sheehy, Oakes, & Luck, 2003; or parietal regions, Oakes, Messenger, Ross-Sheehy, & Luck, 2009) and related cognitive consequences (e.g., inhibition, Diamond, 1990; or object individuation, Oakes et al., 2009).

Studies of VWM during early to middle childhood typically use tasks adapted from the adult literature, such as the change detection task (e.g., Luck & Vogel, 1997). In this paradigm, a memory array is presented with a small number of simple objects, then following a short delay, a test array is presented in which the items either match the memory array or one has changed. Participants indicate whether the memory and test arrays were “same” or “different”. Most developmental studies have tested memory for colors (e.g., Cowan et al., 2005; Isbell, Fukuda, Neville, & Vogel, 2015) or shapes (e.g., Simmering et al., 2015), and have found gradual increases in performance with age (see Simmering & Perone, 2013, for review). These results are commonly interpreted as evidence for a developmental increase in capacity – the number of items that can be held in memory at once.

This interpretation aligns with a “slot”-like characterization of VWM, in which capacity is conceptualized as a fixed number of discrete representations that can be encoded (see Suchow, Fougnie, Brady, & Alvarez, 2014, for review). Most theories of VWM development explicitly or implicitly endorse this perspective. For example, Cowan has proposed that the number of “chunks” that can be held in the focus of attention increases over development (e.g., Cowan, Saults, & Elliott, 2002). Through a series of manipulations and controls, Cowan and colleagues have shown that other cognitive changes, like strategies or knowledge, cannot account for developmental improvements (see Cowan, 2016, for review). Thus Cowan (2013) proposed that such improvements reflect increases in capacity itself, although he noted that the underlying source of capacity increases was unknown.

One proposed explanation of capacity increases came from studies assessing multi-feature object memory. Children (7–10 years old) and adults showed comparable performance on change detection trials requiring memory for one versus two features per object, which was interpreted as evidence for “integrated object” representations (Riggs, Simpson, & Potts, 2011; Vogel, Woodman, & Luck, 2001; but see below for conflicting evidence). Vogel et al. (2001) and Riggs et al. (2011) cited a neural synchrony model proposed by Raffone and Wolters (2001), in which different neural populations represent different features (e.g., color versus orientation), and features from the same object fire synchronously. Capacity limits come from the temporal resolution of synchronous firing: as the number of objects increases, differentiating the timing of different cell assemblies that represent each object becomes more difficult. Riggs et al. extended this model to account for developmental change, proposing that the temporal resolution of synchronous firing improves over development.

As an alternative to slot-like explanations, Bays and Husain (2008) characterized VWM as a continuous pool of resources that could be divided among an unlimited number of representations. From this view, adults’ change detection performance declines as the number of items increases because representations decrease in precision (see Suchow et al., 2014, for review). Evidence for this view came from a delayed estimation task (Wilken & Ma, 2004) in which participants recalled an item’s feature value (i.e., clicking on a continuous color wheel, or adjusting the orientation of a response bar, to match the remembered color or orientation) rather than judging “same” or “different” as in change detection. Pooling responses across many trials provides an estimate of the fidelity of memory representations, and results indicate that representational precision decreases as the number of items in memory increases (e.g., Bays & Husain, 2008).
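To make the pooling step concrete, the following is a minimal sketch of how recall errors from a delayed estimation task might be summarized. It assumes that precision is taken as the reciprocal of the circular standard deviation of response errors; the function names and the example responses are hypothetical and are not drawn from the studies cited above.

```python
import math

def wrap_error(response, target):
    """Signed angular error in radians, wrapped to the interval [-pi, pi)."""
    return (response - target + math.pi) % (2 * math.pi) - math.pi

def circular_sd(errors):
    """Circular standard deviation computed from the mean resultant length."""
    n = len(errors)
    c = sum(math.cos(e) for e in errors) / n
    s = sum(math.sin(e) for e in errors) / n
    r = math.hypot(c, s)
    return math.sqrt(-2.0 * math.log(r))

def precision(responses, targets):
    """Assumed summary measure: reciprocal of the circular SD of the errors."""
    errors = [wrap_error(r, t) for r, t in zip(responses, targets)]
    return 1.0 / circular_sd(errors)

# Hypothetical target and response angles (radians) pooled across trials.
targets = [0.0, 1.0, 2.0, 3.0]
low_load_responses = [0.1, 0.9, 2.1, 2.9]    # small recall errors
high_load_responses = [0.5, 0.5, 2.5, 2.5]   # larger recall errors

precision_low_load = precision(low_load_responses, targets)
precision_high_load = precision(high_load_responses, targets)
```

With these made-up values, the smaller errors yield the higher precision estimate, which is the qualitative pattern described for lower memory loads.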

Three studies have assessed developmental changes in precision with this paradigm. Burnett Heyes and colleagues found that 7- to 13-year-old children showed the same load-precision trade-off as adults, with better precision in older children (Burnett Heyes, Zokaei, van der Staaij, Bays, & Husain, 2012), and that precision improved with age when testing the same children two years later (Burnett Heyes, Zokaei, & Husain, 2016). Probabilistic modeling of responses suggested that age-related improvements were related to decreased noise in memory representations. In contrast, however, a study comparing younger (7–9 years) versus older (10–12 years) groups of children found no age-related change in precision, but rather a decrease in incorrect-target responses (i.e., recalling a feature from an un-cued item; Sarigiannidis, Crickmore, & Astle, 2016). Together these results indicate developmental improvements in children’s performance, but conflicting analyses make it unclear whether they reflect increasing precision, better selection from memory, or both.

Across these domains of research, behavioral evidence suggests that VWM improves in multiple ways from infancy through middle childhood. During infancy, memory representations are built more quickly, are maintained and used more robustly, and increase in complexity with development. These changes have generally been attributed to brain maturation. During childhood, both capacity and precision of memory increase, but theories have been designed to address only one of these characteristics. As a way to bridge across these tasks and age groups to provide a more comprehensive theory, Simmering (2016) proposed a dynamic systems approach to understanding how VWM functions and develops, which we describe next.

The Cognitive Dynamics Theory of Visual Working Memory Development

The cognitive dynamics theory of VWM development is a dynamic systems approach proposed to bridge from infant development to adulthood (Simmering, 2016). Prior theories have been disconnected largely due to the different types of tasks used to assess VWM in infants, children, and adults. Dynamic systems approaches are well-suited to reconciling such task differences through their foundational concepts. In particular, dynamic systems theories conceptualize cognition and behavior as part of a larger system including endogenous and exogenous factors, with the same contributions driving change across timescales (Fogel & Thelen, 1987; Smith & Thelen, 2003). Importantly, no single component of the system has priority in explaining behavior and development, meaning the structure of the task is just as important to consider as the structure of the cognitive system. Historically, theories of cognition and development sought to devise behavioral tasks that tap into particular cognitive constructs, without considering how the task itself creates the behavior it measures (see Smith, Thelen, Titzer, & McLin, 1999, for an illustration using the A-not-B paradigm).

In the context of VWM development, a dynamic systems approach can unite results from looking paradigms in infancy with tasks designed to assess capacity in children and adults. The cognitive dynamics theory specifically emphasizes the continuity and inter-dependence of processing within these different tasks (Simmering, 2016), as opposed to classic information processing descriptions of memory systems (e.g., Atkinson & Shiffrin, 1968) that posit separable stages of attention, encoding, storage, and retrieval. In the cognitive dynamics theory, rather than considering a separate encoding process that precedes storage, the same temporally-continuous processes form, maintain, and use memory representations in service of behavior (Johnson et al., 2014; Simmering, 2016).

The cognitive dynamics theory has been formalized into a computational model to illustrate the explanation of behavior across tasks and development. Computational modeling is a valuable approach to understanding cognitive processes, but is under-represented in developmental theories (Simmering, Triesch, Deák, & Spencer, 2010). Computational approaches face a number of challenges in reaching a broad developmental audience, as they must be constructed to address the specific characteristics of the task(s) of interest. This specificity can be a double-edged sword: it allows for incremental extensions to predict closely related behaviors (e.g., Simmering & Patterson, 2012), but requires substantial modification to encompass more distant tasks and phenomena (see Simmering, 2016, for discussion). Thus, to be most effective, computational approaches must balance specificity and generality to advance our understanding of cognitive and developmental processes (Simmering & Spencer, 2008).

The current investigation strikes this balance by using models as a theoretical framework, showing general predictions without specific implementation of the tasks. The cognitive dynamics theory of VWM development is instantiated in a dynamic neural field architecture, in which features are represented in continuous neural fields along metrically-specific dimensions (i.e., location, color, orientation). Nodes within these fields are connected such that local excitation supports activation among similarly-tuned nodes, which allows for localized “peaks” of activation to form as representations of features. The architecture to simulate VWM tasks includes two excitatory layers coupled to a shared inhibitory layer (Johnson & Simmering, 2015). The excitatory layers simulate perceptual processing versus working memory representation through different strengths of excitatory and inhibitory connections. In particular, weaker connectivity in the perceptual layer leads to input-driven representations: activation remains above threshold only in the presence of input, reflecting perceptual processing of visual information. Stronger connectivity in the working memory layer produces self-sustaining representations: peaks maintain above-threshold activation after input is removed, reflecting memory for prior visual information. Importantly, activation in these layers interacts continuously throughout the task, with excitation projecting from the perceptual layer to the inhibitory and working memory layers, as well as from working memory to the inhibitory layer, with inhibition projecting back into both excitatory layers (Johnson & Simmering, 2015). Through the balance between excitation and inhibition, items compete for representation, resulting in a limited capacity (cf. Franconeri, Alvarez, & Cavanagh, 2013). This is not a hard limit as in slot-like conceptualizations, but rather varies according to task demands (see Johnson et al., 2014, for illustrative simulations).
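The contrast between input-driven and self-sustaining activation can be illustrated with a deliberately simplified simulation. The sketch below is not the three-layer architecture of Johnson and Simmering (2015): it assumes a single one-dimensional field with Gaussian local excitation, global inhibition, and a step-function output, and every parameter value and name is hypothetical. It is intended only to show that the same field equation yields either behavior depending on connection strength.

```python
import math

def simulate_field(c_exc, c_inh, n=61, h=-5.0, tau=10.0,
                   steps_on=100, steps_off=150):
    """Euler-integrate a 1D field (time step of 1); return the maximum
    activation at the end of stimulation and at the end of the delay."""
    center = n // 2
    sigma = 3.0
    # Gaussian profile shared by the stimulus and the excitatory kernel.
    gauss = [math.exp(-(d * d) / (2 * sigma ** 2)) for d in range(n)]
    stimulus = [10.0 * gauss[abs(i - center)] for i in range(n)]
    u = [h] * n  # every node starts at the resting level
    peak_on = None
    for step in range(steps_on + steps_off):
        inp = stimulus if step < steps_on else [0.0] * n
        # Step-function output: only nodes above zero drive interactions.
        active = [j for j in range(n) if u[j] > 0.0]
        new_u = []
        for i in range(n):
            exc = c_exc * sum(gauss[abs(i - j)] for j in active)  # local excitation
            inh = c_inh * len(active)                             # global inhibition
            du = (-u[i] + h + inp[i] + exc - inh) / tau
            new_u.append(u[i] + du)
        u = new_u
        if step == steps_on - 1:
            peak_on = max(u)
    return peak_on, max(u)

# Hypothetical "strong" versus "weak" connectivity settings.
strong_on, strong_off = simulate_field(c_exc=2.0, c_inh=0.3)
weak_on, weak_off = simulate_field(c_exc=0.4, c_inh=0.06)
```

Under these assumed settings, both fields form a peak while the stimulus is present, but only the strongly connected field keeps activation above zero after the stimulus is removed; the weakly connected field relaxes back toward its resting level.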

These interactions among layers provide an implicit mechanism for comparison between items held in memory and new inputs. Specifically, items (peaks) in the working memory layer are maintained through local excitation (within the layer) and lateral inhibition (from the inhibitory layer). Because the inhibitory layer also projects to the perceptual layer, activation at the corresponding values along the metric dimension (i.e., features similar to those held in working memory) is relatively suppressed. Conceptually, this reflects reduced perceptual processing of familiar items. By contrast, novel items (i.e., feature values not held in working memory) produce a novelty signal through strong activation in the perceptual layer. When integrated with a fixation system, this mechanism of comparison can account for habituation (Perone & Spencer, 2013b) and novelty preferences (e.g., Perone & Spencer, 2014) in looking paradigms. When coupled to a “same”/“different” response system, this same mechanism can explain children’s and adults’ performance in the change detection task (e.g., Simmering, 2016) and a single-item color discrimination task (Simmering & Patterson, 2012). Thus, the dynamic model architecture specifies the real-time processes of encoding, maintenance, and comparison required to recognize familiarity and detect novelty across VWM tasks.

The cognitive dynamics theory incorporates a second key concept from dynamic systems in the integration across timescales (Fogel & Thelen, 1987; Smith & Thelen, 2003). From this perspective, understanding how behavior emerges in the moment of a task can provide insight into how behavior changes over development, as both reflect the interacting components of the system. Specifying the processes that support behavior across tasks in a model can provide a test of potential mechanisms to explain development (see Simmering & Schutte, 2015, for discussion). A single type of developmental change in the dynamic model (strengthening connectivity within and between layers) can account for improvements in habituation (Perone & Spencer, 2013b), visual paired comparison (Perone et al., 2011; Perone & Spencer, 2013a, 2014), change detection (Simmering, 2016; Simmering et al., 2015) and color discrimination (Simmering & Patterson, 2012), plus a range of spatial memory tasks (Simmering, Schutte, & Spencer, 2008). This change in connectivity formalizes another central concept from dynamic systems theory: dynamic stability (which we term “real-time” stability to contrast with long-term stability).

The concept of real-time stability is most easily illustrated through motor development, for example, observing an infant learning to reach or walk. Early forms of behavior are uncoordinated, unpredictable, and unreliable; collectively, they are unstable. Through repetition in the coordination of the processes supporting the behaviors, they become more predictable and reliable over time, more broadly applied (e.g., grasping objects of different sizes, walking in different shoes), and more resistant to disturbance (e.g., not dropping an object if something contacts the hand, not falling on uneven surfaces). Simmering (2016) extended this notion to VWM, conceptualizing real-time stability as the collective improvement in the formation, maintenance, and use of representations in service of behavior. Importantly, the continuous processing instantiated in the dynamic model shows how increases in real-time stability could arise through strengthening neural connectivity (see Simmering, 2016, for details).

The cognitive dynamics theory was proposed to bridge research across tasks and age groups to provide a more comprehensive account of how VWM functions and develops (Simmering, 2016). This theory and model formalization have generated specific predictions across a range of visuospatial tasks over development (see Simmering & Schutte, 2015, for review). The current paper tests the real-time stability hypothesis further in the context of feature binding in VWM. Specifically, we test the prediction that unstable memory early in development leads to poor localization of features, which impairs feature binding in VWM. In the next section, we review the major theories of feature binding to provide further context to our predictions.

Feature Binding in Visual Working Memory

There are three notions of binding in visual cognition. The first addresses how features from different dimensions (e.g., color and shape) may combine in object representations (e.g., Treisman & Gelade, 1980). The second concerns how features are bound to locations, such as the relative locations of two colors on an object (e.g., Dessalegn & Landau, 2008) or multiple colored squares within a change detection display (e.g., Cowan, Naveh-Benjamin, Kilb, & Saults, 2006). Third is contextual binding, in which object representations are linked to the learning context (e.g., Hollingworth, 2007). We investigated the first type of feature binding, and therefore limit our review to related studies, although it is possible that all three rely on similar processes.

The question of how multiple features are represented within an object arose from electrophysiological evidence that different feature dimensions (e.g., color and shape) are represented by different neural populations in visual cortex. This distribution of feature representations leads to the “binding problem” in multi-object representation (see Treisman, 1996, for review), with ambiguous correspondence between features and objects. The dominant explanation to solve this problem is feature integration theory (Treisman & Gelade, 1980), in which visual selective attention is sequentially allocated to objects, creating “object files” that link features through their shared spatial location. The original theory focused on perception but was later extended to memory: Wheeler and Treisman (2002) proposed that individual features were remembered in independent, capacity-limited stores, but that the correspondence between features could be remembered if relevant to the task.

Feature integration theory contrasts with the “integrated objects” hypothesis from Luck & Vogel (1997; see also Vogel et al., 2001) and Raffone and Wolters (2001), in which slot-like representations are unlimited in the number of features they can contain. Luck and Vogel demonstrated that adults’ performance was comparable across change detection conditions with one to four features per object. Object features from different dimensions (e.g., color and shape) could be detected by maintaining independent feature stores, which is consistent with both integrated objects and feature integration theory. However, Luck and Vogel tested objects comprising two feature values from the same dimension (i.e., squares with different internal versus external colors) and again found performance comparable to single-color objects. This was stronger evidence that features were bound into integrated objects, but this effect replicates only under narrow conditions (see Xu, 2006, for review).

As described above, Raffone and Wolters (2001) proposed that synchronous neural firing could account for object-based capacity limits, with different neural ensembles coding each feature, and the timing of oscillations binding features corresponding to the same object. Critically, studies supporting this explanation used test arrays that differed from memory arrays through a feature change (Figure 1A; Luck & Vogel, 1997; Riggs et al., 2011; Vogel et al., 2001). In contrast, Wheeler and Treisman (2002) proposed a more stringent test which probed feature binding, in which the same feature values are present in both memory and test arrays, but changes occur in their correspondence across objects (Figure 1B). Wheeler and Treisman found that adults performed worse on this change type versus detecting new features. The neural synchrony account did not specify how items in memory are compared to the test array, leaving it unclear whether it could account for this finding.

Figure 1. Sample set size 3 change detection trials testing memory for multi-feature objects with (A) changes to new features or (B) binding swaps. Note that stimuli are not drawn to scale.

Further evidence has challenged the notion of object files, with features remembered together or not at all (Wheeler & Treisman, 2002). Cowan, Blume and Saults (2013) tested adults’ memory for colors and shapes across a range of attentional demands and test configurations, and concluded that memory capacity was limited by the number of features, not objects, and that attention modulated which features were stored. Similarly, Fougnie and Alvarez (2011) tested memory for multi-feature objects in a recall paradigm (see also Bays, Wu, & Husain, 2011; Schneegans & Bays, 2017) and a four-alternative forced-choice paradigm. Results suggested independence of the features in memory, with adults sometimes correctly recalling or recognizing only one feature from an object, contrasting with object files.

An alternative conceptualization of feature representation may account for this range of results, with some integration and some independence of features. Franconeri et al. (2013) proposed ‘map architectures’ to account for capacity limits and feature binding, through representations that compete within maps corresponding to features and/or locations. Different features are represented in different maps, but may be linked through common locations. These maps align conceptually with two-dimensional dynamic neural fields proposed by Johnson, Spencer, and Schöner (2008). Through the common spatial correspondence, features belonging to the same object are linked as part of the same representation. Johnson et al. noted that this implementation differs from feature integration theory in the explicit link between features and space (cf. Ashby, Prinzmetal, Ivry, & Maddox, 1996).

Schneegans, Spencer, and Schöner (2015) developed the two-dimensional dynamic neural fields architecture further, as shown in Figure 2. Two-dimensional fields representing features in space are coupled to one-dimensional fields (1D-color and 1D-space to the 2D-color-space field; 1D-orientation and 1D-space to the 2D-orientation-space field). Stimuli are presented as inputs to the one-dimensional fields, which then form localized peaks of activation to represent features and locations in the scene. Above-threshold activation from the one-dimensional fields projects to the corresponding dimensions in the two-dimensional fields, which produces activation peaks localized along both dimensions to specify which features are present at which locations (see “hot spots” in Figure 2).

Figure 2. Representation of objects in space, color, and orientation in the working memory fields of the Dynamic Field Theory architecture, corresponding to example memory (A) and test (B) arrays in the change detection task. Stimuli are shown in gray boxes. White panels show one-dimensional representation and blue panels show two-dimensional representations. Individual features or objects are shown as localized peaks (curves above the zero threshold) in one-dimensional fields and as red “hot spots” in two-dimensional fields. Black circles in B indicate the changes in representation corresponding to the feature swap in the stimuli.

Schneegans et al. (2015) described how this architecture could support the detection of changes in features and locations, not only with respect to new values along these dimensions (Figure 1A), but also to new correspondence between familiar values (Figure 1B). Changes to new features in the one-dimensional architecture are detected by virtue of new feature representations building in the excitatory perceptual layer, as described above. The same mechanism can detect new feature values in multi-feature object representations, in parallel for each dimension. When features swap between objects, however, no new feature values are present in the test array; instead, changes must be detected in the feature-space binding. Comparison of color-space fields between Figure 2A and 2B shows how the two-dimensional representations differ when colors are swapped between objects. Through this mechanism, features appear integrated by virtue of being bound to the same spatial location, and a swap in features between objects is novel (Figure 1B). At the same time, however, the features remain independent through their representation in separate fields, which would allow one feature of an object to be recalled while another is forgotten (as in Bays et al., 2011; Fougnie & Alvarez, 2011). Further theoretical and empirical work is needed to determine whether there are critical differences between conceptual feature maps (Franconeri et al., 2013) and two-dimensional dynamic neural fields (Schneegans et al., 2015), but at present both appear to account for a broader range of effects than integrated objects/neural synchrony or feature integration theory.
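The logic of this distinction can be sketched with discrete sets in place of continuous neural fields. The example below is a simplification under stated assumptions: items are hypothetical (location, color) pairs, the one-dimensional comparison is reduced to a set of colors, and the two-dimensional comparison is reduced to a set of location-color conjunctions. It is not an implementation of the Schneegans et al. (2015) architecture.

```python
def detect_change(memory, test):
    """Compare a test array with a remembered array of (location, color) pairs."""
    mem_colors = {color for _, color in memory}
    test_colors = {color for _, color in test}
    # One-dimensional comparison: is any color absent from memory?
    new_features = test_colors - mem_colors
    # Two-dimensional comparison: is any location-color conjunction absent from memory?
    new_bindings = set(test) - set(memory)
    return {"new_feature": bool(new_features), "new_binding": bool(new_bindings)}

# Hypothetical set size 3 arrays.
memory_array = [(0, "red"), (1, "green"), (2, "blue")]
feature_change = [(0, "red"), (1, "yellow"), (2, "blue")]   # new color appears
binding_swap = [(0, "green"), (1, "red"), (2, "blue")]      # two colors trade places

feature_result = detect_change(memory_array, feature_change)
swap_result = detect_change(memory_array, binding_swap)
same_result = detect_change(memory_array, list(memory_array))
```

In this sketch a new-feature change is visible to the feature-only comparison, whereas a binding swap produces no new feature and is visible only to the comparison that keeps features tied to locations.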

In summary, studies have provided evidence for both integration and independence of features in VWM, with feature integration theory (Wheeler & Treisman, 2002) remaining the dominant account of binding, albeit with some detractors (e.g., Bays et al., 2011; Cowan et al., 2013; Fougnie & Alvarez, 2011; Johnson, Hollingworth, & Luck, 2008). Representations of features and space within two-dimensional dynamic neural fields (Schneegans et al., 2015) provide a potential architecture to account for results showing integration versus independence of features across tasks (cf. Franconeri et al., 2013). In the next section, we consider implications of this two-dimensional representation in the context of the real-time stability hypothesis of VWM development.

Implications of Real-Time Stability for Feature Binding in Visual Working Memory

The cognitive dynamics theory of VWM development posits that increasing real-time stability accounts for multiple changes in VWM from infancy through early childhood and into adulthood (Simmering, 2016). Although this theory and corresponding model have thus far only been applied to single-feature memory (Johnson & Simmering, 2015), recent work by Schneegans et al. (2015) provides the context to consider developmental changes in multi-feature representations. The illustrations of the two-dimensional architecture in Figure 2 were generated using parameters fitting adults’ performance. Weaker excitatory and inhibitory connections to simulate early development produce memory representations that are less stable and consequently less precise (Simmering & Schutte, 2015). Combining this developmental change with evidence that precision decreases when adults remember multiple features together (Fougnie, Asplund, & Marois, 2010), we predict that children’s multi-feature representations will be particularly imprecise. Change detection performance should show two consequences of these imprecise representations. First, errors will be more common in younger children, as items (peaks) held in memory may “drift” or “die out” during the delay; this is the case in single-feature representations (Simmering & Schutte, 2015), and could be slightly exaggerated for multi-feature representations. Second, location-based correspondence of features should be doubly impaired, as imprecise representations along two dimensions (i.e., feature and location) are combined.

This second consequence of imprecise memory in early childhood leads to the behavioral prediction we test here: young children should find it harder to detect feature swaps between objects than new-feature changes. In the dynamic model framework, detecting changes in feature binding requires accurate representation of each feature’s location within the scene (Schneegans et al., 2015), as these novelty signals come from new locations for familiar features (Figure 2). With imprecise feature localization, changes in feature locations (i.e., binding swaps between objects) are more likely to be missed, resulting in increased errors for this change type. Thus, this architecture and developmental mechanism predict that detection of changes in feature binding should be particularly impaired early in development. We test this first prediction in Experiment 1 by comparing different types of changes in younger (3–4 years) versus older (5–6 years in Experiment 1a, 7–8 years in Experiment 1b) children and adults. We chose these age groups based on prior studies showing significant improvements in the specificity of color and location representations during early childhood (Ortmann & Schutte, 2010; Simmering et al., 2015; Simmering & Patterson, 2012; Simmering & Spencer, 2008). This study is the first to test children younger than 7 years with multi-feature objects in change detection, and to specifically compare change types in children.

Experiment 1a

Method

Participants

Data from 117 participants are included for analyses: 57 younger children (range = 3.47–4.76 years, M = 3.96, SD = 0.32; 26 girls) and 60 older children (range = 4.96–6.11 years, M = 5.41, SD = 0.36; 28 girls). An additional 35 children (23% of enrolled sample; see Table 1 for division by age and condition) participated but were excluded from analyses due to: ending early (i.e., no blocks above set size two; 18); failing to understand or comply with instructions (9); poor performance in set size one (described below; 5); equipment failure (2); or experimenter error (1). Note that the higher rate of exclusions for younger children was due to the difficulty of the change detection task. Of the 28 younger children excluded, 27 were excluded for insufficient performance (i.e., ending early, failing to understand or comply with instructions, or failing to meet minimum levels of performance); previous studies reported similar exclusions in this age range (Simmering, 2012, 2016; Simmering et al., 2015). Although this exclusion rate is worrisome, the likely outcome is that the younger children retained in analyses were more cognitively advanced. This possibility works against our hypothesis (that younger children will show specific impairments) and therefore makes these analyses a stronger test than if lower-performing children were included.

Table 1.

Number of Participants from Experiment 1a who were Included versus Excluded from Analyses

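| Age group / condition | Included: All trials (SS1–SS5) | Included: Up to SS4 | Included: Up to SS3 | Included: TOTAL | Excluded: ≤ SS2 | Excluded: Did not comply/understand (a) | Excluded: 3 SD below M in SS1 | Excluded: Equip. failure | Excluded: Exp. error | Excluded: TOTAL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Total younger | 22 | 11 | 24 | 57 (43) | 15 | 8 | 4 | 1 | 0 | 28 (18) |
| New-feature condition | 13 | 4 | 13 | 30 (16) | 10 | 3 | 1 | 0 | 0 | 14 (4) |
| Bind-swap condition | 9 | 7 | 11 | 27 (27) | 5 | 5 | 3 | 1 | 0 | 14 (14) |
| Total older | 50 | 8 | 2 | 60 (28) | 3 | 1 | 1 | 1 | 1 | 7 (2) |
| New-feature condition | 24 | 3 | 2 | 29 (14) | 2 | 1 | 1 | 1 | 0 | 5 (2) |
| Bind-swap condition | 26 | 5 | 0 | 31 (14) | 1 | 0 | 0 | 0 | 1 | 2 (0) |
| TOTAL | 72 | 19 | 26 | 117 (71) | 18 | 9 | 5 | 2 | 1 | 35 (20) |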

Note. SS = set size. For the analyses reported in text, all children listed in the “Included” columns were included in the first two sets of analyses (Set Size One, Set Sizes Two and Three); only children in the “All trials (SS1–SS5)” column were included in the third set of analyses. Numbers in parentheses in the “Total” columns indicate how many children participated at the University of Iowa; all other children participated at the University of Wisconsin–Madison.

(a) See text for descriptions of how children’s understanding and/or compliance was judged.

Children were recruited at two large Midwestern universities (60% at University of Iowa, 40% at University of Wisconsin–Madison; see Table 1 for breakdown by age groups and conditions) through databases of families interested in research participation, and received a small gift following participation. Families were primarily from white middle-class backgrounds. Parents reported normal or corrected-to-normal visual acuity and no history of colorblindness.

Apparatus and Procedure

Children were randomly assigned to one of two conditions with different types of change trials: new feature (Figure 1A) or binding swaps (Figure 1B). Children were tested individually in windowless laboratory rooms with dim incandescent overhead lighting. Parents either remained in an adjacent waiting room, or sat in a chair behind the child if they accompanied the child to the testing room.

The experimenter explained the task to children as a card-matching task using flashcards (7.62 cm x 7.62 cm) that showed set sizes one, two, and three (cf. Simmering, 2012). The first flashcard was placed on the right or left side (alternating across trials) of a large sheet of cardstock in the experimenter’s lap. The card was shown for approximately 2 s and the child was instructed to “Look at the picture and remember what it shows”. The card was removed for a brief delay, then a second card was shown in the same location and the experimenter asked if the card matched the one from before. After the child responded, the experimenter placed both cards side-by-side and praised or corrected as needed. Nine flashcard trials were presented in the same order, with the first three showing set size one, the next three showing set size two, and the final three showing set size three. Within each set size, a color change was presented first, then no-change, then a shape change. Because binding swaps are not possible in set size one, these cards were identical across conditions and showed changes to a new feature. On the flashcards showing changes, the experimenter explicitly pointed out how the change was achieved (e.g., “see this star is different from the square” for a new-feature shape change, “see the triangle is green here but black there” for a new-feature color change, “see how the circle is red and the spiral is blue here, but there the circle is blue and the spiral is red” for a binding swap).

The experimenter judged each child’s understanding based on whether they responded correctly, expressed uncertainty, and/or commented during the task explanation. If the child provided an incorrect answer following correction, the experimenter repeated the task explanation. Once the child understood, the experimenter initiated the computerized task on either an 18″ CRT display connected to a Macintosh G4 computer (n = 69, at University of Iowa), using Matlab 5.2 (Mathworks, Inc.) Psychophysics Toolbox version 2 (Brainard, 1997; Pelli, 1997), or a 15.4″ widescreen Dell Latitude E6500 laptop computer (n = 46, at University of Wisconsin–Madison) using Matlab 13 Psychophysics Toolbox version 3 (Kleiner, Brainard, & Pelli, 2007). Stimulus presentation appeared the same size on both screens with participants seated approximately 60 cm from the display.

Stimuli comprised 64 possible color-shape combinations (Figure 3; cf. Wheeler & Treisman, 2002), presented on a gray background; no color or shape was repeated within an array. Each item subtended approximately 2° × 2° of visual angle. To facilitate the card-matching description (cf. Simmering, 2012), stimuli were presented within a gray rectangular frame (~12° tall by 10° wide) centered in the left or right half of the screen, alternating across trials. Items were positioned on an invisible circle (~6° radius) centered within the frame. The first item’s position was randomly selected from one of five equally-spaced locations along the circle; subsequent items were positioned in neighboring locations along the circle. Item positions remained constant within a given block.
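To make the stimulus sizes concrete, the following minimal sketch converts a visual angle into an on-screen extent using the standard relation size = 2 × distance × tan(angle / 2). It is an illustration only, not the authors' presentation code; the function names and the `pixels_per_cm` value are hypothetical and would depend on the particular display.

```python
import math


def visual_angle_to_cm(angle_deg: float, distance_cm: float) -> float:
    """Return the extent (cm) that subtends angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def cm_to_pixels(size_cm: float, pixels_per_cm: float) -> int:
    """Convert a physical size to whole pixels (pixels_per_cm is display-specific)."""
    return round(size_cm * pixels_per_cm)


# An item of about 2 degrees viewed from about 60 cm, as described in the text.
item_cm = visual_angle_to_cm(2.0, 60.0)

# Hypothetical display density of 40 pixels per cm (an assumption, not from the source).
item_px = cm_to_pixels(item_cm, 40.0)
```

Because the conversion depends on viewing distance, the same angular size requires different pixel dimensions on displays with different densities, which is one way stimuli can be matched across two screens.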

Figure 3.


The combination of eight colors and eight shapes used for stimuli. RGB values for colors were: gray background (150, 150, 150); white (255, 255, 255); yellow (255, 255, 0); green (0, 255, 0); blue (0, 0, 255); violet (238, 130, 238); red (255, 0, 0); cyan (0, 255, 255); black (0, 0, 0).

As in Simmering (2012), each trial began with a 2 s memory array followed by a 900 ms delay, then a test array remained visible until a response was entered. Children responded verbally and the experimenter pressed the corresponding key; a chime played following correct responses. The first block was practice and included four trials (two change, two no-change) each in set sizes two and three, presented in random order. After the practice block, test trials were blocked by set size and presented in the same order for all children (2, 1, 3, 4, 5; following Simmering, 2016; Simmering et al., 2015). Children were offered breaks between blocks to prevent fatigue. Each test block included six change and six no-change trials in random order. On no-change trials, the memory and test arrays were identical. On change trials in the new-feature change condition, one item changed to have either a new color or shape (half of the change trials each). For the binding condition, the same change trials were used in set size one (as binding swaps were impossible). On change trials in remaining set sizes for the binding-swap condition, two items would swap color-shape pairings. On half of the change trials, two colors swapped locations; on the other half, two shapes swapped locations. Participation lasted 15–20 minutes.

Throughout the computerized task, the experimenter would note on a session sheet if the child was not following instructions. Children were excluded as noncompliant for looking away frequently during test trials, refusing to provide responses, providing only one response type on 10 or more consecutive trials, or describing an inappropriate strategy (e.g., “I’m only watching the bottom one”, “I’m only remembering the colors”, “I don’t have to see it to know the answer”). Sessions were video recorded for later viewing by a second experimenter to confirm the child did not comply or understand instructions before excluding their data from analyses.

Method of Analysis

Children’s proportions of correct responses were tabulated separately for change versus no-change trials in each set size. As noted in the Participants section (see Table 1), not all children completed all set sizes, which precludes analysis of performance across all set sizes for all children. We considered two ways to accommodate this limitation: using the highest K estimate (a common metric for estimating capacity; Pashler, 1988) from each child, or conducting separate analyses with all versus a subset of children.

Simmering (2012) used Kmax to accommodate the different numbers of set sizes completed by different children. This method produces a single estimate per individual by estimating K (Pashler, 1988; see Appendix for formula) in each completed set size, then using the largest estimate across set sizes as Kmax (cf. Olsson & Poom, 2005; Todd & Marois, 2005). Simmering found close correspondence between statistical analyses of mean Kmax and overall percent correct in 3- to 7-year-olds. There are, however, notable limitations with this approach, which we list in the Appendix. Alternatives to K include using proportion correct separately for change and no-change trials, or calculating measures to combine performance across trial types (e.g., d′ or A′). This requires considering difficulty according to set size, which can be accomplished by including set size as a repeated measure in analyses if all participants complete the same set sizes. However, that was not the case with our sample, leading us to conduct separate analyses for smaller versus larger set sizes (comparing all versus a subset of children, respectively). Although both of these methods—calculating Kmax versus comparing across set sizes for different sets of participants—have limitations, we felt the latter approach allowed the most data inclusion overall (see Appendix for additional analyses).
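The Kmax procedure described above can be sketched as follows. The formula used in this paper appears in the Appendix, which is not reproduced here, so this sketch assumes the commonly cited form of Pashler's (1988) estimate, K = N × (H − FA) / (1 − FA), where N is set size, H is the hit rate, and FA is the false alarm rate. The function names and the handling of FA = 1 are illustrative assumptions.

```python
def pashler_k(set_size: int, hit_rate: float, fa_rate: float) -> float:
    """Capacity estimate for one set size, assuming the standard Pashler (1988) form."""
    if fa_rate >= 1.0:
        # Assumption: the estimate is undefined when FA = 1; return 0 for illustration.
        return 0.0
    return set_size * (hit_rate - fa_rate) / (1.0 - fa_rate)


def k_max(rates_by_set_size: dict) -> float:
    """Largest K across the set sizes a participant completed.

    rates_by_set_size maps set size -> (hit_rate, fa_rate).
    """
    return max(pashler_k(n, h, fa) for n, (h, fa) in rates_by_set_size.items())


# Hypothetical child who completed only set sizes one through three.
example = {1: (1.0, 0.0), 2: (1.0, 0.0), 3: (0.5, 0.0)}
best = k_max(example)
```

Taking the maximum yields one value per child regardless of how many set sizes were completed, which is what makes the measure usable with incomplete data.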

We selected accuracy (A′) as our dependent measure, using the following equations (Aaronson & Watts, 1987):

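$$
A' =
\begin{cases}
\dfrac{1}{2} + \dfrac{(H - FA)\,(1 + H - FA)}{4\,H\,(1 - FA)}, & \text{if } H \ge FA \\[8pt]
\dfrac{1}{2} - \dfrac{(FA - H)\,(1 + FA - H)}{4\,FA\,(1 - H)}, & \text{if } H < FA
\end{cases}
$$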

This measure follows a range comparable to proportion correct (0 to 1, chance = .50), but incorporates performance on change trials (H = hit rate/correct responses) and no-change trials (FA = false alarms/incorrect responses) into a single value per set size. Performance is typically higher on no-change trials, but this difference is not central to our research question. Our specific question concerns hit rates across conditions, but using A′ rather than only hits can adjust for different levels of correct rejections across groups and allows for comparison of the change types between groups (see Appendix for trial type analyses).
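As an illustration of the A′ definition above, the following minimal sketch computes A′ from a hit rate and a false alarm rate. It is not the authors' analysis code. The function name is hypothetical, and returning .50 when H equals FA is an assumption made here to avoid division by zero at the boundaries (e.g., H = FA = 0); wherever the formula is defined, it gives the same value.

```python
def a_prime(hit_rate: float, fa_rate: float) -> float:
    """Compute A' from the two-branch formula cited in the text."""
    h, fa = hit_rate, fa_rate
    if h == fa:
        # Assumption: equal rates are treated as chance (.50), avoiding 0/0.
        return 0.5
    if h > fa:
        return 0.5 + ((h - fa) * (1 + h - fa)) / (4 * h * (1 - fa))
    return 0.5 - ((fa - h) * (1 + fa - h)) / (4 * fa * (1 - h))


# Hypothetical block with six change and six no-change trials:
# five of six changes detected, one false alarm on no-change trials.
score = a_prime(5 / 6, 1 / 6)
```

In this hypothetical block, proportion correct on change trials is .83, whereas A′ also credits the low false alarm rate and comes out at about .90.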

Before conducting the primary analyses of interest, we used set size one performance to evaluate whether participants understood and were sufficiently engaged in the task. Previous studies using single-feature stimuli have shown small differences on set size one performance between 3 and 5 years, leading us to consider age groups separately. Children’s set size one accuracy was generally high (younger: M = .92, SD = .13; older: M = .97, SD = .05), but five children performed more than three standard deviations below the age group mean, leading to their exclusion (see Table 1).
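The set size one screening rule can be sketched as below. This is a hypothetical illustration, not the authors' procedure: it assumes the mean and the sample standard deviation are computed within one age group with all children included, a detail the text does not specify. The function name and the example scores are invented for the illustration.

```python
from statistics import mean, stdev


def flag_low_performers(scores: list, n_sd: float = 3.0) -> list:
    """Return indices of scores more than n_sd standard deviations below the mean."""
    m = mean(scores)
    sd = stdev(scores)  # sample SD; whether the authors used this form is an assumption
    cutoff = m - n_sd * sd
    return [i for i, s in enumerate(scores) if s < cutoff]


# Hypothetical age group: 19 children at ceiling and one child at chance.
group_scores = [1.0] * 19 + [0.5]
flagged = flag_low_performers(group_scores)
```

With high mean accuracy and small variance, as reported for set size one, a cutoff of this kind removes only children performing far below their age group.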

Results

Figure 4A shows children’s accuracy across set sizes separated by condition and age groups. Performance declined across set sizes, with older children generally performing better than younger children, and a larger separation between conditions for younger children. As shown in Table 1, many younger children and a few older children did not complete set sizes four and five. Thus we conducted two sets of analyses: one with all participants including data from set sizes one through three, and one with a subset of participants including data from set sizes two through five.

Figure 4.


Mean accuracy (A′) across set sizes (SS), conditions, and age groups in (A) Experiment 1a and (B) Experiment 1b. Chance equals .50. Error bars show 95% confidence intervals of the mean. *Set sizes four and five include data from only a subset of participants (see text for details).

As described above, change trials in set size one could not include binding swaps, so these trials were identical across conditions; thus, we analyzed these trials separately from set sizes two through five. An ANOVA with condition (new-feature, binding-swap) and age group (younger, older) as between-subjects factors on mean set size one A′ revealed a significant Condition x Age Group interaction (F(1, 113) = 9.67, p = .002, ηp² = .079). This interaction was driven by older children in the binding-swap condition performing significantly better than those in the new-feature condition (t(58) = 2.63, p = .011, d = 0.57; all reported t-tests are two-tailed), whereas younger children in the new-feature condition performed non-significantly better than those in the binding-swap condition (t(41.4) = 1.90, p = .064, d = 0.39, correcting for unequal variance). This indicates small differences in performance between conditions, despite the identical trials.

Next we compared performance across age groups and conditions in set sizes two and three. Prior studies have shown age-related improvements within these set sizes (e.g., Simmering, 2016), allowing for comparison of our manipulation even with this restricted analysis. We analyzed mean A′ in an ANOVA with set size (two, three) as a within-subjects factor, and condition and age group as between-subjects factors. This revealed a significant main effect of set size (F(1, 113) = 36.71, p < .001, ηp² = .245), which reflects overall higher accuracy in set size two (M = .88) than three (M = .77). The ANOVA also showed a significant main effect of age group (F(1, 113) = 15.70, p < .001, ηp² = .122), which was subsumed by a significant Condition x Age Group interaction (F(1, 113) = 10.00, p = .002, ηp² = .081). We followed up on this interaction with separate one-way ANOVAs testing for condition effect in each age group. Younger children in the binding-swap condition (M = .75) performed significantly worse than those in the new-feature condition (M = .82; F(1, 55) = 5.46, p = .023, η² = .090). By contrast, older children in the binding-swap condition (M = .88) performed significantly better than those in the new-feature condition (M = .84; F(1, 58) = 4.50, p = .038, η² = .072). Thus, the younger children showed the predicted impairment in detecting binding swaps, whereas older children showed the opposite pattern.

Lastly, we analyzed data from the subset of children who completed all five set sizes (see Table 1), which included most of the older children (50 out of 60) but fewer than half of the younger children (22 out of 57). Although this sample imbalance is not ideal, it can indicate the robustness of the set size two and three analyses. The younger children who completed the entire task may differ from children who ended earlier, such that children excluded from this analysis were less cognitively advanced than those included. Consistent with this expectation, mean performance in set sizes two and three was higher in children who completed all set sizes (binding-swap: M = .82; new-feature: M = .86), versus those who ended earlier (binding-swap: M = .72; new-feature: M = .79). This possible difference in subsets of younger children works against our predicted effect, in that the developmental difference between age groups would be minimized in this analysis.

We analyzed mean A′ in an ANOVA with set size (two, three, four, five) as a within-subjects factor, and condition and age group as between-subjects factors. This revealed a significant main effect of set size (F(2.32, 157.98) = 45.92, p < .001, ηp² = .403; Greenhouse-Geisser corrected for sphericity violation). Tukey HSD follow-ups (p < .05) on set size showed significant differences in each pair-wise comparison (SS2: M = .89; SS3: M = .83; SS4: M = .71; SS5: M = .63). The ANOVA also showed a significant main effect of age group (F(1, 68) = 6.10, p = .016, ηp² = .082), reflecting overall higher accuracy in older (M = .78) versus younger children (M = .73). The Condition x Age Group interaction did not reach significance (p = .105), but to compare with our results from set sizes two and three, we conducted planned one-way ANOVAs comparing conditions in each age group. The condition effect was significant for older children (binding-swap: M = .81; new-feature: M = .76; F(1, 48) = 4.26, p = .044, η² = .082) but not for younger children (binding-swap: M = .71; new-feature: M = .74; F(1, 20) = 3.70, p = .069, η² = .156), with both effects in the same direction as in the analyses of set sizes two and three. The non-significant effect for younger children likely reflects the relatively small sample limited to higher-performing young children.

Discussion

Results from Experiment 1a showed a general age difference, with older children out-performing younger children, replicating prior studies (Simmering, 2012, 2016; Simmering et al., 2015). Both the primary analysis including all children (set sizes two and three) and the analysis of only the subset completing all five set sizes showed a condition cross-over between age groups. Younger children in the binding-swap condition performed worse than those in the new-feature condition, although this effect did not reach significance in the smaller sample. In general, however, these results are consistent with the prediction derived from the cognitive dynamics theory, that unstable memory representations would specifically impair younger children’s feature binding due to the need to remember each feature’s location.

The opposite effect in older children’s performance, with higher accuracy in the binding-swap versus new-feature condition, was not predicted. We see at least three possible explanations for this effect. First, it could be a spurious finding that resulted from the variable performance that children show in this task. Note that this could also be the case for the predicted effect in the younger age group, which leads us to test that prediction further in Experiment 2. Second, due to the between-subjects nature of our comparisons, it is possible that children (in one or both age groups) in the two conditions differed systematically in their cognitive abilities, despite random assignment. Results from set size one (in which trials did not differ across conditions) were consistent with this interpretation, although those differences were numerically quite small.

Lastly, of most theoretical interest is the possibility that the nature of our manipulation inadvertently made binding swaps easier to detect than new-feature changes for older children. As Figure 1B shows, binding swaps were accomplished by switching features between two objects, which resulted in both objects differing from the memory array. As such, change trials in the binding-swap condition included two changed items, or two possible change signals per trial (compared to only one in the new-feature condition, Figure 1A). Some studies with adults counter-act this effect by including two new feature values on feature change trials (e.g., Johnson, Hollingworth, & Luck, 2008; Wheeler & Treisman, 2002), but we avoided this approach to remain consistent with previous change detection studies with children. Note that this explanation of older children’s performance assumes they can localize features precisely enough to detect the binding swaps. By contrast, if younger children’s performance is impaired through imprecise localization, as proposed by the cognitive dynamics theory, the two new objects created through binding swaps would not appear novel (leading to the lower rates of detection).

As a preliminary test of this third possibility – that binding swaps were easier for older children to detect – we conducted a follow-up experiment with 7- to 8-year-old children and adults. This allowed us to compare performance through this later developmental period to determine the longer trajectory of this effect.

Experiment 1b

Method

Participants

Data from 30 older children (range = 6.89–8.40 years, M = 7.37, SD = 0.40; 11 girls) and 28 adults (range = 18.46–22.44 years, M = 19.69, SD = 0.99; 9 women) are included for analyses. Seven additional children participated but were excluded due to noncompliance (1), ending early (i.e., not completing all five set sizes; 4), or experimenter errors (2). Child participants were recruited as in Experiment 1a at University of Wisconsin–Madison. Adult participants were recruited through an introductory psychology course at University of Iowa and received research participation credit. Participants were primarily from white, middle-class backgrounds. Adult participants and parents of child participants reported no history of colorblindness and normal or corrected-to-normal visual acuity.

Apparatus, Procedure, and Method of Analysis

The apparatus, procedure, and method of analysis were identical to Experiment 1a. We again used set size one performance as a threshold for inclusion. Adults made no errors in set size one (M = 1.00, SD = .00). Children’s set size one accuracy was quite high (M = .99, SD = .02), and all children performed within three standard deviations of the mean.

Results

Figure 4B shows participants’ accuracy across set sizes separated by condition and age groups. Performance was overall higher than in Experiment 1a, with smaller differences between conditions that varied by set size. Our primary interest in this experiment was whether these older children and adults showed the same pattern as 5- to 6-year-olds in Experiment 1a: better performance in the binding-swap versus new-feature condition. To this end, we compared mean A′ in an ANOVA with set size (two, three, four, five) as a within-subjects factor and age group (7–8 years, adults) and condition (binding-swap, new-feature) as between-subjects factors. This analysis revealed significant main effects of set size (F(2.34, 126.56) = 23.12, p < .001, ηp² = .300; Greenhouse-Geisser corrected) and age group (F(1, 54) = 19.37, p < .001, ηp² = .264), which were subsumed by a significant Set Size x Age Group interaction (F(2.34, 126.56) = 4.65, p = .008, ηp² = .079; Greenhouse-Geisser corrected). As Figure 4B shows, this interaction was driven by more similar performance between age groups in smaller set sizes, with children’s performance dropping at higher set sizes. Most critical to our research question, however, were effects of condition, none of which reached significance (ps > .163).

As Figure 4B shows, adults’ performance was uniformly high, which could mask condition differences. We therefore tested only children’s data, across Experiments 1a and 1b, in parallel to the Experiment 1a analyses, and report only effects of condition as that was of primary theoretical interest. First we compared all children in only set sizes two and three. An ANOVA with set size as a within-subjects factor and age group (3–4 years, 5–6 years, 7–8 years) and condition as between-subjects factors yielded a significant Condition x Age Group interaction (F(2, 141) = 5.94, p = .005, ηp² = .074), comparable to Experiment 1a. We added to our previous follow-up analyses by conducting a one-way ANOVA testing for condition effect in 7- to 8-year-old children, which did not reach significance (p = .181; binding-swap: M = .90; new-feature: M = .93).

Lastly, we compared performance between the children from Experiment 1a who completed all five set sizes and the children from Experiment 1b, again reporting only effects of condition. An ANOVA with set size as a within-subjects factor and age group (3–4 years, 5–6 years, 7–8 years) and condition as between-subjects factors yielded no significant effects of condition (ps > .122). Thus, evidence for a binding-swap advantage was limited to the effects in Experiment 1a.

Discussion

Across Experiments 1a and 1b, results indicated differences between conditions in set sizes two and three: 3- to 4-year-old children in the binding-swap condition performed significantly worse than those in the new-feature condition, whereas 5- to 6-year-old children in the binding-swap condition performed better than those in the new-feature condition; the effect with older children held at higher set sizes as well. This advantage for binding swaps was not found in 7- to 8-year-old children or adults, suggesting that the effect with 5- to 6-year-old children was potentially a small or spurious effect, or limited to a relatively brief period of development. Prior studies that have reported deficits in adults’ binding performance typically use paradigms in which the object positions all changed between the memory and test arrays, and new-feature change trials include two new features (e.g., Johnson, Hollingworth, & Luck, 2008; Wheeler & Treisman, 2002). These aspects of the design likely led to the differences between the current and prior results, which could have implications for theories of feature binding (see Cowan et al., 2013, for related discussion). Further testing of these issues is warranted to understand the limits of feature binding across age groups, but is beyond the scope of this paper. Rather, we return to the predicted finding from Experiment 1a: younger children’s specific deficit in detecting binding swaps. Although the results with younger children were consistent with our prediction, the inconsistent effects with the older age group indicate the need to interpret results with caution. Thus, we conducted an additional experiment to test our central prediction further.

The impaired detection of binding swaps in early childhood was predicted due to poor localization of features. The dynamic model framework posits that binding swaps are detected by virtue of the new feature-location correspondence (Schneegans et al., 2015), which requires relatively precise localization of each feature. Prior studies of spatial working memory show substantial changes in location memory precision during early childhood, as well as a transition in spatial recall biases occurring around 4 to 5 years of age (Simmering & Schutte, 2015)—the same age at which we found the transition in binding performance in Experiment 1a. The correspondence between these transitions in spatial working memory and feature binding underscores our assumption that detecting binding swaps depends critically on the precision of the spatial localization of features.

To test this link between spatial localization and feature binding more directly, our next experiment aimed to support young children’s localization of features through perceptual structure in the task space. Spatial memory studies have shown that adding perceptual structure to otherwise “empty” space modulates the direction and magnitude of errors (Schutte & Spencer, 2010; Simmering & Spencer, 2007) and improves position discrimination (Simmering, Spencer, & Schöner, 2006). Furthermore, Schutte and Spencer (2010) showed that adding perceptual structure in spatial recall modulated young children’s performance to a more mature pattern. Thus, more structure from the environment supported the stability of spatial memory in the moment of the task.

We extend these findings to predict that adding perceptual structure to the change detection task will enhance the precision of young children’s feature localization, which will in turn augment their ability to detect binding swaps. Because young children’s poorer localization was revealed specifically through impaired binding swap detection, we predict that the improvement will be limited to those trials and that age group. We test this prediction in Experiment 2 by adding a grid to the stimulus displays, such that each object appears within a unique box on the screen. If these boxes help stabilize memory for the feature-location correspondence, producing more precise feature localization, then young children should be more sensitive to the changes in feature positions on binding swap trials. For trials on which feature localization is not as critical—new-feature changes and no-change trials—we predict that the grid will not significantly affect performance.

A secondary goal of Experiment 2 was to replicate the difference found in younger children in Experiment 1a using a within-subject design, due to the inconsistent effects found across age groups in Experiment 1. Note that this is not a direct replication, as we changed the comparison from between- to within-subjects, but we believe the comparison in Experiment 2 is preferable due to the potential for group differences to contribute to the between-subjects differences (despite random assignment, as discussed in Experiment 1a). We also tested a group of adult participants in a higher set size to investigate whether a more difficult task might reveal a difference in detecting new-feature changes versus binding swaps (unlike Experiment 1b, in which adults’ performance was near ceiling). The addition of this age group also allows us to test our prediction that any benefit from perceptual structure would be limited to early childhood. Thus we included two age groups—young children and adults—both tested in within-subjects comparisons of binding-swap and new-feature changes with either a standard (no-grid) display or with a grid around the items.

Experiment 2

This experiment had two goals. First, the standard (no-grid) condition was designed to replicate young children’s impairment on binding swaps, and adults’ comparable performance across change types, by testing three types of trials within-subjects: no-change, new-feature changes, and binding swaps. Second, the grid-added condition was designed to test whether augmenting feature localization would specifically improve young children’s detection of binding swaps. We predicted that, in the no-grid condition, young children would perform worse on binding swaps versus new-feature changes (replicating Experiment 1a) but adults would perform comparably (replicating Experiment 1b), and in the grid-added condition, only young children’s detection of binding swaps would improve relative to the no-grid condition. The specificity of our prediction that perceptual structure will influence only young children’s binding swap detection arises from our proposed explanation for the deficit observed in Experiment 1: poor localization of features due to unstable memory representations.

Method

Participants

Data from 48 younger children (range = 3.59–4.97 years, M = 4.42, SD = 0.38; 24 girls) and 34 adults (range = 18.29–21.46 years, M = 19.42, SD = 0.73; 25 women) are included in analyses. An additional 12 children and 2 adults participated but are excluded due to ending early (3 children), experimenter error (1 adult), or failing to understand/comply with instructions (9 children). Children were recruited as in Experiment 1b and adults were recruited through an undergraduate course at University of Wisconsin–Madison and received extra credit for participating. Participants were primarily from white middle-class backgrounds. Adult participants and parents of child participants reported normal or corrected-to-normal visual acuity and no history of colorblindness.

Apparatus and Procedure

The apparatus and procedure were identical to Experiment 1 with the following exceptions. First, stimulus presentation was controlled by Microsoft PowerPoint 2010, and stimuli were presented within the cells of a 3x2 grid (Figure 5), although the black lines were not visible in the no-grid condition. Second, both new-feature changes and binding swaps were included, and during the task explanation participants were explicitly told that both types of changes would occur. Third, because both types of changes were included, the number of change trials was increased from six to eight per trial block (four new-feature changes, four binding swaps) and the number of no-change trials was decreased to four to avoid increasing the total number of trials required. Fourth, we did not provide feedback on accuracy of responses; rather, children were praised uniformly during the task (adults received no feedback). This change was partly to reduce demands on the experimenters evaluating correctness in the moment of the task (as informative feedback could not be automated in PowerPoint), and also motivated by the fact that the higher number of change trials would likely lead to more errors for children overall (cf. Experiment 1 change trials, shown in the Appendix). Fifth, to avoid biasing children’s attention to only new-feature changes, we eliminated set size one (in which binding swaps are impossible). Finally, to test participants on trials that were close to typical capacity estimates for these age ranges, we used set sizes two and three for children, and set size six for adults.

Figure 5. Sample stimulus presentations for Experiment 2. Note that stimuli are drawn to scale; the full grid measured 21 cm wide by 14.25 cm tall (visual angle of approximately 19.9° x 13.5°).
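The reported visual angles can be checked with the standard formula θ = 2·arctan(s / 2d), where s is the extent of the display and d is the viewing distance. The sketch below is illustrative only: the 60 cm viewing distance is an assumption (it is not stated in this section, and is used here because it reproduces the reported angles), and the function name is hypothetical.

```python
import math


def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Visual angle in degrees subtended by an extent of size_cm seen from distance_cm."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


# ASSUMPTION: viewing distance of 60 cm (not given in this section).
VIEWING_DISTANCE_CM = 60.0

grid_width_deg = visual_angle_deg(21.0, VIEWING_DISTANCE_CM)    # full grid width
grid_height_deg = visual_angle_deg(14.25, VIEWING_DISTANCE_CM)  # full grid height
print(f"{grid_width_deg:.1f} x {grid_height_deg:.1f} degrees")
```

Under this assumed distance, both values round to the angles given in the caption.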

With these changes, sessions proceeded with the following structure. For child participants, the experimenter presented six flashcard trials to explain the task, in the following order: set size two no-change, new-feature change, binding swap; set size three no-change, binding swap, new-feature change. For adult participants, the experimenter described the task verbally. On the laptop, trials were presented in one of four pre-determined orders, each of which included 8 practice trials (4 each in set sizes two and three, in random order) followed by 24 test trials for children (the first 12 in set size two, the last 12 in set size three) or 48 test trials for adults (all in set size six). Within each block, trials were pseudo-randomly ordered such that no more than three consecutive trials had the same correct answer. The timing of stimulus presentation was controlled through PowerPoint animation, with each trial initiated by pressing the spacebar; the experimenter initiated trials for children, and adult participants advanced trials themselves. During practice trials, the experimenter could reverse the PowerPoint presentation to review a trial and ensure the child understood the task and was following instructions. For test trials, the experimenter did not give corrective feedback and did not re-present stimuli. If needed, children were reminded of the task instructions (“tell me if the pictures matched”) during the test trials.
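The ordering constraint described above (no more than three consecutive trials with the same correct answer) can be illustrated with a short sketch. How the four pre-determined orders were actually constructed is not described; the rejection-sampling approach, function names, and trial labels below are assumptions for illustration, using the children's block composition of four no-change, four new-feature, and four binding-swap trials.

```python
import random


def correct_answer(trial_type):
    """Both change types share the correct answer 'different'; no-change trials are 'same'."""
    return "same" if trial_type == "no-change" else "different"


def longest_answer_run(trials):
    """Length of the longest streak of consecutive trials with the same correct answer."""
    longest = run = 0
    previous = None
    for trial_type in trials:
        answer = correct_answer(trial_type)
        run = run + 1 if answer == previous else 1
        previous = answer
        longest = max(longest, run)
    return longest


def make_block(n_no_change=4, n_new_feature=4, n_binding_swap=4, max_run=3, rng=None):
    """Shuffle one block until the run-length constraint is met (rejection sampling)."""
    rng = rng or random.Random()
    trials = (["no-change"] * n_no_change
              + ["new-feature"] * n_new_feature
              + ["binding-swap"] * n_binding_swap)
    while True:
        rng.shuffle(trials)
        if longest_answer_run(trials) <= max_run:
            return list(trials)


block = make_block(rng=random.Random(1))
print(block)
```

Because eight of the twelve trials in a block are change trials, many random shuffles violate the constraint; rejection sampling simply redraws until a valid order appears.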

Method of Analysis

Children’s responses were recorded by hand by the experimenter during sessions; adults recorded their own responses on a session sheet. Responses were entered into a master spreadsheet with a formula that compared them to correct responses. For each participant, we calculated three proportion correct scores per set size: no-change trials (i.e., correct rejection rate), change trials with new features (i.e., new-feature hit rate), and change trials with binding swaps (i.e., binding-swap hit rate). Due to the within-subjects comparison of change types, the analytical methods discussed in Experiment 1 (calculating Kmax or A′) were not appropriate, as both combine performance across change and no-change trials. Using only one false alarm rate (from no-change trials) with two different hit rates would conflate these measures. As such, our primary analysis compared hit rates separately in each condition. To allow comparison across conditions and with Experiment 1a, however, we also calculated two A′ scores (one using new-feature hit rate, the other using binding-swap hit rate) in each set size to compare separately.
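As a rough illustration of these scores, the sketch below computes the three proportion correct values for one participant in one set size, plus two A′ scores that share a single false alarm rate. It assumes Grier's formula for A′ with the below-chance extension of Aaronson and Watts (1987); the exact computation used in the study is defined in Experiment 1 rather than here. The data layout, function names, and example responses are hypothetical.

```python
def a_prime(hit_rate, false_alarm_rate):
    """A' from Grier's formula, mirrored for below-chance performance (H < F)."""
    h, f = hit_rate, false_alarm_rate
    if h == f:
        return 0.5  # chance performance; also avoids division by zero
    if h > f:
        return 0.5 + ((h - f) * (1 + h - f)) / (4 * h * (1 - f))
    return 0.5 - ((f - h) * (1 + f - h)) / (4 * f * (1 - h))


def score_participant(trials):
    """trials: list of (trial_type, said_change) pairs from a single set size."""
    def rate(trial_type, correct_response):
        outcomes = [said == correct_response for t, said in trials if t == trial_type]
        return sum(outcomes) / len(outcomes)

    correct_rejection = rate("no-change", False)
    new_feature_hit = rate("new-feature", True)
    binding_swap_hit = rate("binding-swap", True)
    false_alarm = 1 - correct_rejection  # shared by both A' scores
    return {
        "correct_rejection": correct_rejection,
        "new_feature_hit": new_feature_hit,
        "binding_swap_hit": binding_swap_hit,
        "new_feature_a_prime": a_prime(new_feature_hit, false_alarm),
        "binding_swap_a_prime": a_prime(binding_swap_hit, false_alarm),
    }


# Hypothetical responses for one block: 4 no-change, 4 new-feature, 4 binding-swap trials.
example_trials = (
    [("no-change", False)] * 3 + [("no-change", True)]
    + [("new-feature", True)] * 3 + [("new-feature", False)]
    + [("binding-swap", True)] * 2 + [("binding-swap", False)] * 2
)
scores = score_participant(example_trials)
print(scores)
```

Because both A′ scores reuse the same false alarm rate, they are not independent of one another, which is why the primary analysis relied on hit rates.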

Results

Figure 6 shows mean hit rates for new-feature changes and binding swaps and mean correct rejection rates across set sizes and age groups. As this figure shows, performance in the no-grid condition was similar to Experiment 1, with young children showing poorer detection of binding swaps than new-feature changes, and adults performing comparably across change types. To address our central research question, we conducted three sets of analyses. First, we analyzed hit rates across change types within the no-grid condition (separately for children and adults). Second, we analyzed hit rates within the grid-added condition to test the predicted effect of perceptual structure. Third, we compared performance between no-grid and grid-added conditions to test whether the effect of the grid was specific to detecting binding swaps, as predicted.

Figure 6. Mean proportion correct for each trial type across set sizes (SS) and age groups in Experiment 2. Error bars show 95% confidence intervals of the mean.

First, we analyzed adults’ mean hit rates in a repeated-measures t-test, which showed no difference across change types (p = .827), in parallel to Experiment 1b. We did not compare adults’ performance directly between Experiments 1b and 2 due to the different set sizes tested, but both the between- and within-subjects comparisons suggest that adults detected binding swaps and new-feature changes at similar rates.

Next, we analyzed children’s performance in the no-grid condition by comparing mean hit rates in an ANOVA with change type (new-feature, binding-swap) and set size (two, three) as within-subjects factors. This analysis revealed significant main effects of set size (F1, 23 = 34.77, p < .001, η2p = .602) and change type (F1, 23 = 5.68, p = .026, η2p = .198) with no significant interaction (p = .699). The set size effect was driven by higher hit rates in set size two (M = .74) versus three (M = .48). The change type effect reflects better detection of new-feature changes (M = .69) than binding swaps (M = .53), in parallel to Experiment 1a.
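The partial eta squared values reported above can be recovered from each F statistic and its degrees of freedom using the standard identity η²p = (F × df_effect) / (F × df_effect + df_error). The following sketch is only a consistency check on the reported values; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported for the no-grid condition.
reported = {"set size": 34.77, "change type": 5.68}
for effect, f_value in reported.items():
    print(effect, round(partial_eta_squared(f_value, 1, 23), 3))
```

Both values round to the effect sizes reported in the text (.602 and .198).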

To compare effects between experiments directly, we compared A′ from each change type in the no-grid condition (analyzed separately, due to the common correct rejection rates) with performance in the two conditions of Experiment 1a. We conducted two-way ANOVAs with set size (2, 3) as a within-subjects factor and experiment (1a, 2) as a between-subjects factor, and report only effects of experiment. The analysis on binding-swap A′ yielded no main effect or interaction (ps > .559), indicating similar performance in the between-subjects version (M = .75) and the within-subjects version (M = .78). The analysis of children’s new-feature A′ also yielded no main effect or interaction (ps > .515), again indicating similar performance in the between-subjects version (M = .82) and the within-subjects version (M = .84). Thus, although the no-grid condition was not a direct replication of Experiment 1a, these analyses suggest that children performed comparably in the within- versus between-subjects tests of each change type.

For the grid-added condition, we first analyzed adults’ mean hit rates in a repeated-measures t-test, which again showed no difference across change types (p = .299). We then analyzed children’s performance by comparing mean hit rates in an ANOVA with change type and set size as within-subjects factors. This analysis revealed a significant main effect of set size (F1, 23 = 7.30, p = .013, η2p = .241), but no effect of change type (p = .917) and no interaction (p = .222). As in prior analyses, the set size effect was driven by higher hit rates in set size two (M = .71) versus three (M = .55). The absence of change type effects follows our prediction that the added perceptual structure would eliminate the impairment in young children’s detection of binding swaps. To test the specificity of this effect further, we compared performance across conditions for each change type separately using A′. We analyzed children’s binding-swap A′ in an ANOVA with set size as a within-subjects factor and condition (no-grid, grid-added) as a between-subjects factor, and report only effects of condition. The main effect of condition was significant (F1, 46 = 5.24, p = .027, η2p = .102), with impaired performance in the no-grid (M = .78) versus grid-added condition (M = .85). The ANOVA on children’s new-feature A′ showed no condition effects (ps > .630; Mno-grid = .84, Mgrid-added = .84). For adults’ performance, independent-samples t-tests comparing conditions showed no significant difference in binding-swap A′ (p = .137; Mno-grid = .76, Mgrid-added = .81) or new-feature A′ (p = .858; Mno-grid = .77, Mgrid-added = .78).

Discussion

This experiment had two goals: first, to replicate the Experiment 1 findings with young children and adults in a within-subjects design; second, to test whether adding perceptual structure would support young children’s detection of binding swaps. Results from the no-grid condition replicated Experiment 1a, showing that young children were significantly worse at detecting binding swaps versus new-feature changes, but adults performed comparably. We explained this effect as arising from imprecise feature localization early in development due to unstable VWM representations. To test our explanation, we provided support for feature localization through added perceptual structure (presenting objects within a grid) in the grid-added condition. Previous studies showed that adding perceptual structure improved spatial memory (Simmering & Spencer, 2007; Simmering et al., 2006), with younger children showing a more mature pattern of performance with added structure (Schutte & Spencer, 2010). Our results showed that adding perceptual structure to the change detection task specifically improved young children’s detection of binding swaps, supporting our explanation that the binding impairment resulted from unstable, poorly localized features in VWM.

General Discussion

The goal of these studies was to test increasing real-time stability as a potential developmental mechanism underlying VWM improvements. In particular, we tested two predictions derived from the cognitive dynamics theory proposed by Simmering (2016) and related computational implementations (Schneegans et al., 2015; Simmering & Schutte, 2015). First, we found that young children’s detection of binding swaps was impaired relative to detection of new-feature changes (Experiments 1a and 2), whereas older children and adults showed comparable or better (only 5- to 6-year-olds in Experiment 1a) performance on binding swaps. Second, we showed that young children’s relative impairment could be eliminated through added perceptual structure, through increased stability and precision of children’s memory for feature locations within the displays, connecting our results to demonstrated developmental changes in spatial memory (cf. Schutte & Spencer, 2010). These results support the explanation that developmental improvements in VWM during early childhood reflect increases in real-time stability and have implications for theories of feature binding, VWM development, and cognitive development more generally; we consider these areas in turn below.

Our studies were not designed to differentiate competing theories of feature binding in VWM, but should be considered when evaluating such theories. As described above, Wheeler and Treisman (2002) extended feature integration theory to account for memory performance by proposing that attention binds features into “object files”, and must be maintained to support the comparison of items to detect changes in binding (but see Johnson, Hollingworth, & Luck, 2008). Our results using the standard change detection task might suggest that young children lack sufficient visual attention to form such bindings, which could explain the difference in detecting binding swaps versus new-feature changes. However, we know of no evidence that would predict a transition between 4 and 5 years in attention, which would be needed to explain the age differences from Experiment 1. Furthermore, it is not clear whether and how the addition of perceptual structure as in Experiment 2 would modulate attention, making the predicted effect unlikely to arise out of feature integration theory. This does not preclude the possibility that adding perceptual structure affected children’s attention, but it is unclear whether such an explanation could account for the specificity of the effect in Experiment 2. Thus, our results could potentially be incorporated into, but would not have been predicted by, feature integration theory.

Our results add to the literature on representation of multi-feature objects and VWM development, as this is the first study to test young children’s memory for color-shape combinations, directly comparing two types of changes. The primary goal was to contrast proposed explanations of VWM development, which have generally not addressed how representations are used in service of behavior. The dominant accounts of improvements in VWM during early childhood include the addition of one or more “slots” (e.g., Cowan, 2013; Riggs et al., 2011) or increased precision (Burnett Heyes et al., 2012, 2016) with age. Such accounts have not specified the processes by which items in memory are compared to the test array in change detection, precluding them from predicting our results that were specific to detecting binding swaps. Indeed, the particular descriptions of neural synchrony to explain both capacity limits and integrated object representations (Luck & Vogel, 1997; Raffone & Wolters, 2001; Riggs et al., 2011; Vogel et al., 2001) are either directly inconsistent with our results (if comparison is assumed to always be accurate), or at least under-specified to address how the type of change would affect performance. Resource-based accounts are also unclear, as they have been used to argue for independence of features (Bays et al., 2011) but again do not specify how representations are used to detect changes in feature values and/or bindings.

The influence of perceptual structure on VWM has also not been discussed in slot and resource perspectives. It is unclear whether these theories could accommodate our results through modification, but in their current states they are not sufficiently specified to generate predictions about the manipulation in Experiment 2. These theories are not necessarily incompatible with the cognitive dynamics theory; indeed, increases in real-time stability in the model produced improvements in VWM capacity and resolution over development (cf. Simmering et al., 2015; Simmering & Patterson, 2012), which could be viewed as an implementation of the characterizations from slot and resource theories. Thus, strengthening excitatory and inhibitory connectivity, and the resulting increase in real-time stability, could provide a mechanistic explanation for the cognitive improvements posited by other theories (see Simmering, 2016, for further discussion). This potential overlap between theories highlights the value of computational implementations, which can test whether a single set of processes could encompass characteristics of multiple competing theories (i.e., changes in both capacity and resolution, Simmering & Miller, 2016; see Johnson et al., 2014, for further details in the context of the slots versus resources debate in VWM).

The cognitive dynamics theory connects this same developmental mechanism from early childhood to infant VWM development. The model implementation allows for a straightforward connection across tasks and age groups by specifying the processes that support the formation, maintenance, and use of memory representations across tasks. Simmering (2016) showed how the same processes of recognizing familiarity and detecting novelty led to correlations between looking in a preferential looking paradigm and performance in the change detection task. Perone et al. (2011) used the same developmental mechanism to account for capacity increases during infancy, with related changes in habituation (Perone & Spencer, 2013b), discrimination, and processing speed (Perone & Spencer, 2014). Together these findings suggest that real-time stability is a central characteristic of VWM that can explain performance across a range of tasks and developmental periods.

This leads to the larger question of the potential generality of real-time stability as an explanation of development across domains. Conceptually, increasing stability of cognitive processes would allow for representations of all types to be formed more quickly and accurately, endure longer, resist interference better, be retrieved more efficiently and effectively, and support behavior more reliably across task contexts (see Simmering, 2016, for computational and empirical demonstrations). For example, studies of word learning have shown effects of task context that can reveal instabilities early in development. In particular, the breadth of novel noun generalization depends on whether it is probed through “yes”/“no” questions or forced choice responses (Samuelson, Schutte, & Horst, 2009), and toddlers can correctly map a novel name to an unfamiliar object but fail to retain mappings robustly over a 5-minute delay (Horst & Samuelson, 2008). These subtle effects of task details reflect differing demands on real-time stability: less-stable representations can support simpler performance (e.g., fast-mapping) but not a more demanding test (e.g., retention). By re-conceptualizing these differences as reflecting real-time stability, a clearer picture may emerge for how additional task changes influence performance over development, such as the stabilizing influence of a familiar context on toddlers’ label generalization (Perry, Samuelson, & Burdinie, 2014) or how the spatial distribution of objects affects word learning (Axelsson, Perry, Scott, & Horst, 2016). Such a general improvement in the effectiveness of cognition could be applied across domains, but more empirical and theoretical work will be needed to test this explanation rigorously.

To develop more comprehensive theories of how working memory—or any other cognitive skill—functions to support behavior across tasks over development, it is critical to consider behavior beyond a single task type or age group. Only by extending our theories to bring together seemingly disparate effects from a range of behavioral tasks from infancy through later childhood can we begin to map out the central processes that support mature performance outside of the lab. Furthermore, we must seek out and test the potential breadth of application of general mechanisms proposed to account for developmental change in order to find the commonality across domains of study. The identification and understanding of such general real-time and developmental processes will provide a critical gateway into designing effective interventions to maximize advantageous outcomes.

Acknowledgments

Thanks to the families who participated in this research and the research assistants who aided in data collection. The work at the University of Iowa was made possible by the National Science Foundation (HSD 0527698). A subset of the data in Experiment 1a were part of the second author’s senior honors thesis, and were presented at the 68th Biennial Meeting of the Society for Research in Child Development, Denver, CO, and the 31st Annual Meeting of the Cognitive Science Society, Amsterdam, Netherlands; data from Experiment 2 were presented at the 45th Annual Meeting of the Jean Piaget Society in Toronto, Ontario. Participant recruitment and programming at the University of Wisconsin–Madison was funded by the Eunice Kennedy Shriver National Institute of Child Health & Human Development of the National Institutes of Health (R03-HD067481) and the Waisman Intellectual and Developmental Disabilities Research Center grant (P30HD03352). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or the National Science Foundation. Thanks to Rob Olson for programming assistance, and to Jeffrey S. Johnson and John P. Spencer for input during the conceptualization of the project.

Appendix

This appendix includes additional analyses of data from Experiment 1a for a more complete characterization of performance and easier comparison to other studies in the literature that have reported different metrics of performance.

Analyses of K Estimates

As described in the Experiment 1a Method of Analysis, K is a common metric for estimating capacity from change detection performance, but we chose not to use it as our primary analysis due to a number of limitations. First, K cannot exceed set size, leading to a potential under-estimation of capacity if set sizes higher than a participant’s K estimate were not completed. Second, adults’ K estimates show poor split-half reliability (Pailian & Halberda, 2015), undermining its appropriateness with smaller numbers of trials. Third, the K formula is influenced more strongly by change versus no-change trials, thus different rates of performance on no-change trials may produce similar K estimates. Lastly, the K formula implies a fixed individual difference in the number of items encoded, and does not consider errors in comparison and decision processes (see Johnson et al., 2014, for discussion). Because we expect impaired detection of binding swaps to arise through the comparison process, interpreting the results as reflecting capacity differences seems disingenuous.

Despite our choice not to use K in our primary analyses, we include it here for ease of comparison with the literature. We calculated capacity estimates for each set size using Pashler’s (1988) formula, K = SS × (H − (1 − CR)) / CR, in which SS is the set size, H is hit rate (correct responses on change trials), and CR is correct rejection rate (correct responses on no-change trials). We took the highest estimate across set sizes as each participant’s Kmax capacity estimate, which provides a single index per participant (rather than separate values per set size) and allows for easier comparison across children who completed different numbers of set sizes (for further discussion, see Simmering, 2012; see also Olsson & Poom, 2005; Simmering, 2016; Simmering et al., 2015; Todd & Marois, 2005). As described in the Experiment 1a Method of Analysis, we felt the assumptions underlying the application of this formula did not align with our theoretical interpretation of performance. Specifically, the calculation of K would seem to imply differences in the number of items children in each condition were holding in memory. By contrast, the cognitive dynamics theory has shown that errors can arise through comparison and decision processes, not just the encoding and maintenance of items (Simmering, 2016; see also Johnson et al., 2014). Our particular manipulation across conditions was expected to impact comparison, as only the type of change differed, but we include these analyses here for completeness and to allow comparison with prior studies using this metric.
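A minimal sketch of this calculation follows, using the formula exactly as given above. The function names, the dictionary layout, and the example rates are hypothetical and are not taken from the study's data.

```python
def pashler_k(set_size, hit_rate, correct_rejection_rate):
    """Pashler's (1988) capacity estimate: K = SS * (H - (1 - CR)) / CR."""
    if correct_rejection_rate == 0:
        raise ValueError("K is undefined when the correct rejection rate is zero")
    false_alarm_rate = 1 - correct_rejection_rate
    return set_size * (hit_rate - false_alarm_rate) / correct_rejection_rate


def k_max(rates_by_set_size):
    """Highest K across the set sizes a participant completed.

    rates_by_set_size maps set size -> (hit rate, correct rejection rate).
    """
    return max(pashler_k(ss, h, cr) for ss, (h, cr) in rates_by_set_size.items())


# Hypothetical participant who completed set sizes two and three.
participant = {2: (0.75, 1.0), 3: (0.5, 0.75)}
print(k_max(participant))

# Perfect performance yields K equal to the set size, the ceiling noted above.
print(pashler_k(2, 1.0, 1.0))
```

The second call illustrates the first limitation listed earlier: a participant at ceiling in the highest set size completed receives a K equal to that set size, which may under-estimate capacity.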

This method of analysis leads to different criteria for exclusion and inclusion relative to Experiment 1. In particular, the four children from Experiment 1b who were excluded for ending early can be included in this analysis. Additionally, three children (two 3- to 4-year-olds, one 5- to 6-year-old) and three adults had Kmax estimates equal to the highest set size they completed. This metric may have under-estimated their performance, leading us to exclude them from analyses. We calculated mean Kmax estimates separately for each age group and condition for the included participants. Similar to the analyses of A′, younger children performed worse on binding swaps (M = 1.72, SD = .74, 95% CI [1.44, 2.00]; new-feature changes: M = 2.08, SD = .78, 95% CI [1.80, 2.37]). Estimates for 5- to 6-year-old children were similar across conditions (Mbind-swap = 2.28, SD = .59, 95% CI [2.07, 2.49]; Mnew-feat = 2.06, SD = .58, 95% CI [1.85, 2.27]), as were 7- to 8-year-old children’s (Mbind-swap = 2.60, SD = .82, 95% CI [2.21, 2.99]; Mnew-feat = 2.60, SD = .63, 95% CI [2.26, 2.94]) and adults’ (Mbind-swap = 3.36, SD = .67, 95% CI [2.96, 3.75]; Mnew-feat = 3.31, SD = .65, 95% CI [2.97, 3.65]). In general, Kmax estimates showed the typical increases with age (Simmering & Perone, 2013).

We analyzed performance with a two-way ANOVA with age group (3–4 years, 5–6 years, 7–8 years, adults) and condition (new-feature, binding-swap) as between-participant factors. This analysis revealed only a significant main effect of age group (F1, 162 = 27.56, p < .001, ηp2 = .338). Planned independent-sample t-tests for each age group showed a marginal difference across conditions for 3- to 4-year-olds (t53 = 1.76, p = .084, d = 0.47) and no significant differences for 5- to 6-year-olds (p = .153), 7- to 8-year-olds (p = .992), or adults (p = .866).

Proportion Correct across Trial Types

Figure A1 shows proportion correct from Experiment 1 across set sizes, age groups, and conditions, separately for change and no-change trials. As this figure shows, performance was fairly comparable across age groups and conditions for no-change trials. On change trials, younger children in the binding-swap condition performed notably worse than those in the new-feature condition, whereas older children and adults performed more similarly across conditions.

Here we report analyses in parallel to the A′ analyses reported in Experiment 1 (with the same participant exclusions described therein). First, we analyzed children’s set size one proportion correct from Experiment 1a in an ANOVA with trial type (change, no change) as a within-subjects factor, and condition (new-feature, binding-swap) and age group (younger, older) as between-subjects factors. This analysis revealed only a significant Condition x Age Group interaction (F1, 113 = 8.94, p = .003, η2p = .073), comparable to the effect reported on A′. This interaction was driven by older children in the binding-swap condition (M = .97) performing significantly better than those in the new-feature condition (M = .93; t58 = 2.50, p = .015, d = 0.41), whereas younger children in the new-feature condition (M = .95) performed marginally better than those in the binding-swap condition (M = .91; t55 = 1.76, p = .084, d = 1.19).

For set sizes two and three (including all participants from Experiment 1a), we analyzed proportion correct in an ANOVA with set size (two, three) and trial type as within-subjects factors, and age group and condition as between-subjects factors. Of most relevance was a significant Age Group x Condition interaction (F1, 113 = 13.86, p < .001, η2p = .109), which we followed up with analyses testing condition effects separately for each age group. As in the analyses of A′, these analyses showed a significant effect of condition for younger children (F1, 55 = 10.62, p = .002, η2 = .162) with impaired performance in the binding swap (M = .67) versus new-feature condition (M = .74). By contrast, the effect for older children approached significance (F1, 58 = 3.96, p = .051, η2 = .064) reflecting the opposite pattern from younger children (Mbind-swap = .80, Mnew-feat =.76).

The overall analysis on set size two and three proportion correct also revealed significant main effects of set size (F1, 113 = 55.69, p < .001, η2p = .330), trial type (F1, 113 = 92.93, p < .001, η2p = .451) and age group (F1, 113 = 20.70, p < .001, η2p = .115), as well as significant two-way interactions of Trial Type x Condition (F1, 113 = 6.95, p = .010, η2p = .058) and Set Size x Trial Type (F1, 113 = 10.56, p = .002, η2p = .085), which were subsumed by a significant three-way interaction of Set Size x Trial Type x Condition (F1, 113 = 4.81, p = .001, η2p = .041). To follow up on the three-way interaction, we conducted separate analyses by trial type with set size as a within-subjects factor and condition as a between-subjects factor (collapsing across age groups). The analysis on no-change trials showed only a significant main effect of set size (F1, 115 = 8.83, p = .004, η2p = .071), which was driven by better performance in set size two (M = .89) than three (M = .83). The analysis of change trials revealed significant main effects of set size (F1, 115 = 41.17, p < .001, η2p = .264) and condition (F1, 115 = 4.87, p = .029, η2 = .041). As Figure A1 shows, the set size main effect again reflected better performance in set size two (M = .72) than set size three (M = .54). The condition effect was driven by better detection of new-feature changes (M = .67) than binding swaps (M = .60).

For the subset of children who completed all five set sizes from Experiment 1a, we conducted an ANOVA on proportion correct with set size (two, three, four, five) and trial type as within-subjects factors, and age group and condition as between-subjects factors. This analysis revealed significant main effects of set size (F3, 204 = 59.19, p < .001, η2p = .465), trial type (F1, 68 = 88.17, p < .001, η2p = .565), and age group (F1, 68 = 4.16, p = .045, η2p = .058), as well as a significant Trial Type x Condition interaction (F1, 68 = 4.12, p = .046, η2p = .057). The interaction was driven by no difference across age groups on no-change trials, but significantly higher performance by older children than younger children on change trials. Tukey HSD follow-up tests (p < .05) on the set size effect showed that children’s performance differed significantly for all pair-wise comparisons except set sizes four and five (MSS2 = .80, MSS3 = .69, MSS4 = .63, MSS5= .58). Similar to the analyses reported on A′ in Experiment 1a, the Age Group x Condition interaction did not reach significance (p = .082), nor did the Trial Type x Age Group x Condition interaction (p = .077).

Lastly, we analyzed Experiment 1b participants’ proportion correct in an ANOVA with set size (two, three, four, five) and trial type as within-subjects factors, and condition and age group (7–8 years, adults) as between-subjects factors. This analysis revealed significant main effects of set size (F3, 162 = 40.82, p < .001, η2p = .431), trial type (F1, 54 = 158.45, p < .001, η2p = .746), and age group (F1, 54 = 25.01, p < .001, η2p = .317). These effects were subsumed by significant two-way interactions of Set Size x Age Group (F3, 162 = 4.92, p = .003, η2p = .084), Set Size x Condition (F3, 162 = 3.49, p = .017, η2p = .061), and Trial Type x Age Group (F1, 54 = 11.14, p = .002, η2p = .171). We followed this last interaction by comparing age groups in separate independent-samples t-tests for each trial type. These analyses revealed that adults (M = .77) performed significantly better than 7- to 8-year-old children (M = .62) on change trials (t53 = 4.98, p < .001, d = 1.25), but comparably on no-change trials (Madults = .95, M7–8y = .93, p = .276).

We followed up on the set size interactions by conducting separate independent-samples t-tests within each set size, first comparing age groups, then comparing conditions. For the age group comparisons, proportion correct did not differ in set size two (Madults = .92, M7–8y = .91, p = .671), but adults performed better than children in set sizes three (Madults = .90, M7–8y = .79, t(45.96) = 3.68, p < .001, d = 0.984, corrected for unequal variance), four (Madults = .84, M7–8y = .72, t(53) = 3.95, p < .001, d = 1.042), and five (Madults = .78, M7–8y = .67, t(53) = 3.12, p = .003, d = 0.836). For the condition comparisons, proportion correct differed only in set size two, with higher performance in the new-feature condition (M = .95) than the binding-swap condition (M = .88; t(53) = 3.04, p = .004, d = 0.825; all other ps > .146). These analyses indicate that both interactions arose because the pattern of performance in set size two differed from that in set sizes three through five.
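The fractional degrees of freedom reported for the set size three comparison (45.96) result from the correction for unequal variance. The sketch below shows one common way such a comparison could be computed, assuming Welch's t-test with Welch-Satterthwaite degrees of freedom and Cohen's d based on the pooled standard deviation. The source does not state which formulas or software were used, the function names are illustrative, and the scores are hypothetical rather than data from the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of the mean, group a
    vb = variance(b) / len(b)  # squared standard error of the mean, group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def cohens_d(a, b):
    """Return Cohen's d using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled = sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled


# Hypothetical proportion-correct scores, NOT data from the study.
adults = [0.95, 0.90, 0.92, 0.88, 0.91, 0.89]
children = [0.85, 0.70, 0.95, 0.60, 0.80, 0.75]

t, df = welch_t(adults, children)
d = cohens_d(adults, children)
print(f"t({df:.2f}) = {t:.2f}, d = {d:.2f}")
```

When the two groups have equal variances and equal sizes, the corrected degrees of freedom reduce to the usual n1 + n2 - 2. The more the variances differ, the further the corrected value falls below that.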

Figure A1.

Mean proportion correct across set sizes (SS), trial types, conditions, and age groups in Experiment 1. Error bars show 95% confidence intervals of the mean.

