Category learning is shaped by the multifaceted development of selective attention

Layla Unger; Vladimir M Sloutsky

doi:10.1016/j.jecp.2022.105549

Author manuscript; available in PMC: 2023 Jun 17.

Published in final edited form as: J Exp Child Psychol. 2022 Sep 15;226:105549. doi: 10.1016/j.jecp.2022.105549

PMCID: PMC10276342 NIHMSID: NIHMS1893963 PMID: 36116317

Abstract

Categories are a fundamental building block of cognition that simplify the multitude of entities we encounter into equivalence classes. By simplifying this barrage of inputs, categories support reasoning about and interacting with their members. For example, despite differences in size, color, and other features, we can treat members of the category of dogs as equivalent, and thus generalize information about any given dog to other dogs. Simplifying entities into categories in adulthood is supported by selective attention, in which people focus on category-relevant attributes, while filtering out category-irrelevant attributes. However, much category learning takes place in infancy and early childhood, when selective attention undergoes substantial development. We designed two experiments to disentangle the contributions of the focusing and filtering aspects of selective attention to category learning over development. Experiment 1 provided evidence that learning simple categories was accompanied by selective attention in both 4- and 5-year-old children and adults. Experiment 2 provided evidence that only focusing contributed to selective attention in 4-year-olds, whereas both focusing and filtering contributed to selective attention in 5-year-olds and adults. Thus, category learning may recruit different aspects of selective attention across development.

Keywords: category learning, cognitive development, selective attention, focusing, filtering, inhibition

One of the fundamental roles of cognition is to simplify and distill the barrage of inputs available to our senses. For example, if we encounter something that is furry, four-legged, brown with white patches, about two feet tall, and has some twigs caught in its fur, we can more readily determine how to interact with it if we simplify this input by categorizing it as a dog. Adults simplify entities in the world into categories by learning to selectively attend to category-relevant attributes that members of the same category share (e.g., shape), rather than irrelevant attributes that vary between them (e.g., color and pattern) (Deng & Sloutsky, 2016; Goldstone & Steyvers, 2001; Rehder & Hoffman, 2005). However, many of the categories that populate adult knowledge were originally learned during infancy and early childhood, when selective attention undergoes substantial development (Plude, Enns, & Brodeur, 1994). How is category learning over the course of development shaped by the development of selective attention?

Understanding how selective attention shapes the development of category learning involves tackling the challenging question of what selective attention is. Does learning to selectively attend to relevant input consist of focusing on relevant input (i.e., enhancing its processing), or filtering out (i.e., suppressing/inhibiting) irrelevant input? Alternatively, is selective attention zero-sum, so that focusing on some input collaterally filters out other input? Although researchers have grappled with these questions for decades, neuroimaging research has provided mounting evidence that focusing and filtering may be separate processes (Andersen & Müller, 2010; Bridwell & Srinivasan, 2012; Gazzaley, Cooney, McEvoy, Knight, & D’Esposito, 2005; Gulbinaite, Johnson, de Jong, Morey, & van Rijn, 2014; Polk, Drake, Jonides, Smith, & Smith, 2008). Moreover, as discussed below, these processes may follow different developmental trajectories. However, little research to date has investigated the development of focusing and filtering when people must learn what to attend to, as they must when they learn categories. In what follows, we review this research, and present a set of studies designed to illuminate how the development of focusing and filtering shapes the development of category learning.

Selective Attention and its Development

A growing body of neuroimaging evidence suggests that by adulthood, selective attention is supported by dissociable focusing and filtering processes. For example, an fMRI study conducted by Gazzaley et al. (2005) found that relative to neutral (passive) viewing, the activity evoked by a stimulus (such as a face) was both enhanced when participants were instructed to attend to the stimulus (evidence of focusing), and suppressed when participants were instructed to ignore it (evidence of filtering; see also Andersen & Müller, 2010; Bridwell & Srinivasan, 2012; Polk et al., 2008). Moreover, focusing may occur without filtering, and vice versa. For example, using an innovative EEG paradigm, Gulbinaite et al. (2014) found that responding to a target in the presence of distractors was associated with focusing on the target in individuals with low working memory, and with filtering the distractors in individuals with high working memory.

Extensive evidence attests that selective attention in adults emerges following protracted development over infancy, childhood and adolescence (Lane & Pearson, 1982; Plude et al., 1994). However, only some of this research has attempted to disentangle focusing and filtering. Research that has explicitly investigated the development of focusing has yielded evidence that it develops early, starting in infancy. Much of this evidence consists of findings that stimulus processing is enhanced in infants when the stimulus appears in an attentionally cued location. For example, when a stimulus appears in a location that was recently cued, infants look at it more quickly (Hood, Willen, & Driver, 1998; Johnson, Posner, & Rothbart, 1994; Richards, 2000), show enhanced neural responses to it (Richards, 2000), and show stronger subsequent evidence of recognizing it as familiar (Reid & Striano, 2005; Reid, Striano, Kaufman, & Johnson, 2004). Similarly, stimulus processing is enhanced when the infant is in an attentive state (Kopp & Lindenberger, 2011; Richards, 2003).

In contrast, research on filtering has yielded conflicting results. Conflicting findings are particularly notable in research spanning early childhood, from approximately 3 to 7 years of age. For example, conflicting results have emerged from the negative priming paradigm, in which filtering is inferred when participants respond worse to targets that were recently distractors (and that could thus have been filtered) versus those that were not. This research has yielded evidence for both the early emergence of filtering (from at least age 3; Chevalier & Blaye, 2008; Pritchard & Neumann, 2004), and substantial increases in filtering with age (from age 7 to adulthood; Pritchard & Neumann, 2011; Tipper, Bourque, Anderson, & Brehaut, 1989). Studies of the development of filtering have also measured filtering using the retrieval-induced forgetting (RIF) paradigm. In this paradigm, participants first study category-item pairs such as FRUIT-banana and FRUIT-apple, then retrieve only a subset of the pairs for a category. Filtering is then assessed based on worse memory for the non-retrieved pairs. Evidence from this paradigm is also equivocal: Although some RIF studies suggest that filtering changes little with development from at least age 7 to adulthood (Zellner & Bäuml, 2005), other studies have provided evidence that filtering increases across an earlier span of development, from age 4 to 7 (Aslan & Bäuml, 2010).

Taken together, research to date suggests that focusing develops early. In contrast, prior research provides conflicting evidence for the development of filtering. Critically, much of this research does not tackle the development of selective attention when people must learn what to attend to. Illuminating this development is important because people often must learn what input is most relevant to their goals through experience, as is the case when people learn categories. Specifically, attentional cueing, negative priming, and other paradigms often used to study selective attention typically involve some explicit cue or instruction that directs participants’ attention to some characteristic of their input, such as an instruction to make responses based on color. In contrast, when a child encounters a dog, a parent or other knowledgeable agent may label it as a “dog” while the child attends to it, but may rarely highlight the characteristics of dogs that are important for making them dogs and not something else (Bergey, Morris, & Yurovsky, 2020; Ran, Kirby, Naigles, & Rowe, 2022). Instead, characteristics relevant to membership in the category of dogs must be learned from experience, such as from consistently encountering dogs while hearing the label “dog” (Smith, Jones, Landau, Gershkoff-Stowe, & Samuelson, 2002; Smith & Yu, 2008). In the next section, we review research that has investigated the development of selective attention in category learning, and highlight how it does not disentangle the contributions of focusing and filtering.

Role of Selective Attention in Category Learning

A handful of findings suggest that the contribution of selective attention to category learning increases over development, particularly over the course of early childhood from approximately 4 to 7 years of age (Best, Yim, & Sloutsky, 2013; Deng & Sloutsky, 2015, 2016). Most of these studies have used a paradigm in which children and adults learn categories with dimensions that vary in relevance to category membership, such that one dimension is deterministically associated with (i.e., perfectly predictive of) category membership, multiple dimensions are probabilistically associated with (i.e., imperfectly predictive of) category membership, and the remaining dimensions are irrelevant. The results of these studies suggest that attention is distributed across deterministic and probabilistic dimensions in young children (e.g., age 4), and becomes increasingly selectively oriented to just the deterministic dimension with age (e.g., from age 6–7 to adulthood). These findings suggest that, with age, people tend to increasingly orient selective attention primarily towards dimensions that they learn to be most diagnostic of category membership. However, this research did not investigate attention to completely irrelevant dimensions, and thus leaves open the possibility that even young children may attend more to dimensions that are at least somewhat relevant (i.e., to both deterministic and probabilistic dimensions) than to dimensions that are entirely irrelevant.

Critically, the research that has investigated the development of selective attention during category learning has not disentangled whether this developmental trajectory is driven by changing contributions of focusing, filtering, or both. Disentangling these contributions is challenging: When people learn a specific category structure, any difference in attention to relevant versus irrelevant dimensions can be due to focusing attention on relevant dimensions, filtering irrelevant dimensions, or both. One way to overcome this challenge is to introduce a “switch” in relevance at some point during category learning, in which a dimension that was previously irrelevant becomes relevant. If category learning involves focusing, learners should always struggle to shift their focus to any newly relevant dimension after the switch. Critically, if category learning also involves filtering, learners should find it more difficult to focus on a newly relevant dimension that had been irrelevant (and thus filtered) prior to the switch, versus a newly relevant dimension that was not present (and thus could not be filtered) prior to the switch (Goldstone & Steyvers, 2001; Hoffman & Rehder, 2010). Research that has used this approach with adults has found evidence of both focusing and filtering (Goldstone & Steyvers, 2001). As described below, the present research used this approach to disentangle the contributions of focusing and filtering in the development of category learning.

Present Research

The present research sought to disentangle the contributions of focusing and filtering to selective attention in the development of category learning. The research consisted of two experiments designed to accomplish this goal. Both experiments were designed to target the period of development that has yielded conflicting evidence about the development of filtering (Aslan & Bäuml, 2010; Chevalier & Blaye, 2008; Pritchard & Neumann, 2004; Tipper & McLaren, 1990). Thus, both experiments investigated focusing and filtering in two age groups within this period (i.e., ages 4 and 5), and contrasted them with adults.

Experiment 1 investigated whether even young children deploy some degree of selective attention to relevant versus entirely irrelevant dimensions during category learning. Adults and children aged 4 and 5 learned simple categories of creatures that possessed one relevant and one irrelevant dimension. To target selective attention when relevant dimensions must be learned rather than externally cued, in contrast with some prior studies of category learning in development (Deng & Sloutsky, 2015, 2016), participants were not provided with any initial cues about the relevance of different dimensions. For example, in Deng and Sloutsky (2015, 2016), participants were given instructions identifying the relevant dimensions prior to category learning, such as that “[all/most] [members of this category] have this kind of head”. In contrast, participants in the present studies were not told which dimensions were relevant. Thus, participants could only learn by making categorization decisions and receiving feedback. Selective attention to the relevant dimension was measured based on better subsequent recognition memory for familiar versus novel values of the relevant versus irrelevant dimension (Deng & Sloutsky, 2016). To anticipate the results, category learning occurred in all age groups. Importantly, for these simple categories, category learning was accompanied by greater learned selective attention to the relevant versus the irrelevant dimension even in the youngest children.

Experiment 2 used the “switch” approach to investigate whether the contributions of focusing versus filtering to selective attention change with age. Here, participants initially learned categories like those used in Experiment 1, but experienced a switch half-way through category learning. In this switch, the initially relevant dimension became irrelevant, and the initially irrelevant dimension became relevant.

We anticipated that all participants would find it challenging to learn the new category structure following the switch. To isolate the contributions of filtering to this challenge, we assigned participants to one of two conditions. In the “Visible” condition, the initially irrelevant dimension that became relevant post-switch was visible prior to the switch. Thus, in the Visible condition, the initially irrelevant dimension could be filtered prior to the switch. In the “Hidden” condition, the initially irrelevant dimension was occluded prior to the switch. Thus, in the Hidden condition, the initially irrelevant dimension could not be filtered prior to the switch. Both conditions present learners with a challenge once the switch takes place: shifting their focus away from the dimension that was relevant prior to the switch, and towards the newly relevant dimension. However, the newly relevant dimension should be more challenging to attend to if it was filtered (i.e., suppressed) prior to the switch. Given that such filtering is only possible in the Visible condition, selective attention that involves filtering should lead to worse performance after the switch in the Visible than in the Hidden condition. In contrast, selective attention that does not involve filtering should lead to similar performance after the switch in both conditions. Using this approach, we investigated two alternative hypotheses: (1) Selective attention involves both focusing and filtering even early in development, and (2) Selective attention involves focusing early in development, and the role of filtering increases with age.

Experiment 1

Methods

Participants

Participants were recruited in three age groups: 4-year-olds (N = 33; M = 4 years, 5 months), 5-year-olds (N = 35; M = 5 years, 4 months), and adults (N = 53). The child age groups were selected because they capture a period of development when the contributions of filtering to selective attention are disputed (Aslan & Bäuml, 2010; Chevalier & Blaye, 2008; Pritchard & Neumann, 2004; Tipper & McLaren, 1990).

Materials

The category stimuli were created based on extensive piloting designed to identify category structures that even 4-year-old children could learn. This piloting had two key constraints. First, values on only a single dimension could be relevant to category membership, and values on all other dimensions had to be irrelevant. This constraint was necessary for testing selective attention to relevant versus irrelevant dimensions. Second, to target selective attention to dimensions that is evoked by their learned relevance, it was of critical importance to use categories that children could learn purely from categorizing exemplars and receiving feedback. This constraint contrasts with some prior studies of category learning in children (Deng & Sloutsky, 2015, 2016) that preceded category learning with explicit cues to the dimensions relevant to category membership. These cues were used in these prior studies because extensive piloting showed that they were necessary for young children to learn categories with several (e.g., seven) dimensions (personal communication with authors). The results of piloting for the present studies indicated that in the absence of explicit relevance cues, successful category learning only occurred in the majority of 4-year-old children with a simple category structure with two dimensions: One relevant, and one irrelevant. Moreover, the values of the relevant dimension that were each associated with a different category needed to be highly visually discriminable.

Stimuli consisted of novel creatures with two dimensions: flippers and tails. Each dimension was created by morphing between two anchor images that were different in shape (e.g., two different-shaped flippers). Values of these dimensions used in the stimuli were only taken from near one of the two extremes of the morph dimension (shown in Figure 1A). Eight category exemplars were created from combinations of 4 possible values on each dimension (2 values from near one extreme of the dimension, and 2 values from near the other extreme). In the experiment (see Procedure), these stimuli were divided into two categories, such that the values of one (“relevant”) dimension perfectly predicted category membership, and the values of the other (“irrelevant”) dimension occurred equally often in both categories. When a dimension was relevant, values from near one extreme of the dimension determined that the creature belonged to one category, and values from near the other extreme determined that the creature belonged to the other category. For a recognition memory task that followed category learning (see Procedure), we additionally created eight novel creatures: Four in which the flipper was replaced with a novel value, and four in which the tail was replaced with a novel value (Figure 1B).

Figure 1. — Panel A depicts creatures that illustrate values on the two extremes of each dimension. Panel B depicts the novel values for each dimension used in the recognition memory task. Panel C depicts the creatures with dimensions hidden by bubbles, as used in Experiment 2.

Design and Procedure

The experiment consisted of two phases: Category learning, and recognition memory. Participants were randomly assigned to complete one of two versions of the experiment that counterbalanced the relevance of the two dimensions to category membership. Thus, for a given participant, the values of one “Relevant” dimension perfectly predicted category membership, and the values of the “Irrelevant” dimension occurred equally often in both categories.

The procedure was similar for adults and children, with the exception that adults followed instructions presented on the computer screen and used the keyboard to make responses, whereas an experimenter read instructions aloud and recorded children’s verbal responses. At the start of category learning, participants were instructed that they would see two kinds of creatures: Zibbies and Tomas. They were then shown two example creatures (Figure 1A). As in the creatures that participants subsequently learned to categorize, each example creature had a flipper and tail value taken from one of the extremes of the flipper and tail dimensions. While these creatures were shown, participants were told that all creatures had two body parts that come in different shapes: A flipper, and a tail. Each body part was highlighted with a yellow outline while the instructions or experimenter pointed out its two different values. Then, participants were told that the shape of one of the body parts is important for figuring out whether a creature is a Zibbie or a Toma. These instructions were designed based on pilot testing to increase the likelihood that even four-year-olds could learn the categories, while still requiring participants to learn the dimension relevant for category membership for themselves.

Participants then completed four category learning blocks that each contained one trial for each of the 8 category exemplars described above. On each trial, participants were asked to identify the creature as a Zibbie or a Toma. Following the categorization decision, participants received corrective feedback: the creature remained on the screen, and participants were told “[Correct!/Oops!] It’s a [Zibbie!/Toma!]”.

After the final category learning block, participants began the recognition memory task. Participants were told that they would see some more creatures, some of which would be the same as ones they saw during the first part of the experiment, and some that would be new. They were further told that in new creatures, one of the body parts would be a new shape, and that their job was to figure out which creatures were ones they saw before, and which ones were new. Participants then completed 32 recognition memory trials in which they saw a creature, and indicated whether it was “old” or “new”. These trials consisted of: (1) 16 “Old” trials, across which each of the 8 category exemplars was presented twice, and (2) 16 “New” trials, across which each of the 8 novel stimuli was presented twice. As described in the Materials section above, half of the New trials presented a creature in which the relevant dimension was replaced with its novel value, and half presented a creature in which the irrelevant dimension was novel.
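The composition of the recognition memory task can be laid out as a short Python sketch. The stimulus labels, the function name, and the shuffled presentation order are illustrative assumptions; the text above specifies only the trial counts.

```python
import random
from collections import Counter

def build_recognition_trials(seed=0):
    """Build the 32 recognition trials: 16 Old and 16 New."""
    # Hypothetical labels standing in for the 8 learned category exemplars.
    old_items = [f"exemplar_{i}" for i in range(1, 9)]
    # Hypothetical labels for the 8 novel creatures: 4 with a novel
    # flipper value and 4 with a novel tail value.
    new_items = [(f"novel_flipper_{i}", "flipper") for i in range(1, 5)]
    new_items += [(f"novel_tail_{i}", "tail") for i in range(1, 5)]

    trials = []
    for _ in range(2):  # every stimulus is presented twice
        for item in old_items:
            trials.append({"stimulus": item, "status": "old",
                           "novel_dimension": None})
        for item, dim in new_items:
            trials.append({"stimulus": item, "status": "new",
                           "novel_dimension": dim})
    # Assumption: trial order is randomized; the order is not stated in the text.
    random.Random(seed).shuffle(trials)
    return trials

trials = build_recognition_trials()
counts = Counter((t["status"], t["novel_dimension"]) for t in trials)
print(len(trials), dict(counts))
```

Because dimension relevance was counterbalanced across participants, the 8 New trials with a novel flipper probe the relevant dimension for some participants and the irrelevant dimension for others, and likewise for the tail.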

Results

All analyses were conducted using Bayesian models constructed using the rstan package (Stan Development Team, 2020) in the R environment for statistical computing (R Core Team, 2021). We used Bayesian analyses because both differences and similarities between conditions (e.g., between memory for relevant versus irrelevant dimensions) were of interest. All data and scripts have been made available on OSF (https://osf.io/8db62/?viewonly=9b54963ade304397bed4ca7f1883be84).

Category Learning

Figure 2 depicts category learning in each of the three age groups. Analyses assessed whether participants in each age group learned the categories. The outcome variable for these analyses was trial-by-trial categorization accuracy. Trial-by-trial accuracy was analyzed using a Bayesian hierarchical model in which it was predicted as the outcome of a logistic regression, with an intercept and slope for change in accuracy across trials for each participant. Intercepts and slopes for participants were each drawn from one of three higher-level distributions (all given the same weak priors), based on the participant’s age group. Fitting this model to the data thus estimates the trajectory of category learning both for each participant, and for each age group. Rather than a single estimate of the category learning trajectory for a given participant or age group, the Bayesian approach yields a posterior distribution of probable category learning trajectories.
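One way to write down the model just described is sketched here. The notation is introduced for illustration only, and the normal form of the age-group distributions is an assumption: the description above states only that participant-level intercepts and slopes were drawn from one of three higher-level distributions that shared the same weak priors.

```latex
\begin{aligned}
y_{p,t} &\sim \mathrm{Bernoulli}(\theta_{p,t}) \\
\operatorname{logit}(\theta_{p,t}) &= \alpha_p + \beta_p \, t \\
\alpha_p &\sim \mathrm{Normal}\!\left(\mu^{\alpha}_{g(p)},\ \sigma^{\alpha}_{g(p)}\right) \\
\beta_p &\sim \mathrm{Normal}\!\left(\mu^{\beta}_{g(p)},\ \sigma^{\beta}_{g(p)}\right)
\end{aligned}
```

Here $y_{p,t}$ is 1 when participant $p$ categorizes correctly on trial $t$ and 0 otherwise, $\alpha_p$ and $\beta_p$ are that participant's intercept and slope, and $g(p)$ indexes the participant's age group (4-year-olds, 5-year-olds, or adults).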

We assessed category learning in each age group using the model’s posterior distributions of categorization accuracy in the final two blocks in each age group. Specifically, for each sample from the posterior distribution for each age group, we calculated the model’s estimate of the mean accuracy on the final two blocks of category learning. This calculation produced a posterior distribution of probable accuracy on the final two blocks for each age group. We then calculated the range of most probable final accuracies for each age group using Highest Density Intervals (HDIs). This interval is the range of a distribution that contains some specified percentage of probable values. Such an interval has a simple interpretation: the “true” value falls within the range with the specified probability. In all age groups, categorization accuracy was predicted to be above chance (i.e., .5) in the final two blocks (4-year-olds: Median = 0.62, 90% HDI = [0.60, 0.66]; 5-year-olds: Median = 0.69, 90% HDI = [0.66, 0.71]; Adults: Median = 0.96, 90% HDI = [0.94, 0.96]). Comparing the mean accuracy on the final two blocks between age groups for each posterior sample indicated that accuracy was slightly greater in 5- than in 4-year-olds (Median = 0.06, 90% HDI = [0.02, 0.10]), and substantially greater in adults than in 5-year-olds (Median = 0.26, 90% HDI = [0.23, 0.29]).
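For readers unfamiliar with HDIs, the following Python sketch shows one common sample-based way to compute such an interval: sort the posterior samples and take the narrowest window that contains the requested share of them. The procedure used for the reported intervals is not described in the text, so this approach, the function name `hdi`, and the toy samples are all assumptions for illustration.

```python
import math

def hdi(samples, mass=0.90):
    """Return the narrowest interval containing `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    # Number of samples that must fall inside the interval.
    k = max(1, math.ceil(mass * n))
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]

# Toy samples (not data from the study): nine tightly clustered values
# and one outlier, so the 90% HDI excludes the outlier.
toy_samples = [0, 1, 2, 3, 4, 5, 6, 7, 8, 100]
low, high = hdi(toy_samples, mass=0.90)
print(low, high)
```

A between-group comparison of the kind reported above would apply the same function to the per-sample differences between two groups' posterior accuracies.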

Selective Attention

We assessed whether category learning was accompanied by selective attention to the relevant dimension based on whether participants had better recognition memory for values of the relevant versus the irrelevant dimension. We measured recognition memory for each dimension using d-prime scores, calculated from hits for correctly identifying creatures with an old value on the dimension as “old”, and false alarms for incorrectly identifying new creatures as “old”. For each participant, we then calculated a “Relevant Attention” score as their d-prime for the relevant dimension minus their d-prime for the irrelevant dimension.
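The scoring described above can be sketched in Python as follows. d-prime is the difference between the z-transformed hit rate and false alarm rate. The log-linear correction (adding 0.5 to counts and 1 to totals), which keeps rates of 0 or 1 from producing infinite z-scores, is an assumption; the text does not say how extreme rates were handled. The function names and example counts are hypothetical.

```python
from statistics import NormalDist

def d_prime(hits, n_old, false_alarms, n_new):
    """d' = z(hit rate) - z(false alarm rate)."""
    # Assumption: log-linear correction to avoid rates of exactly 0 or 1.
    hit_rate = (hits + 0.5) / (n_old + 1)
    fa_rate = (false_alarms + 0.5) / (n_new + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)

def relevant_attention(d_relevant, d_irrelevant):
    """Relevant Attention score: d' (relevant) minus d' (irrelevant)."""
    return d_relevant - d_irrelevant

# Hypothetical counts for one participant (not data from the study):
# fewer false alarms to novel values on the relevant dimension.
d_rel = d_prime(hits=12, n_old=16, false_alarms=2, n_new=8)
d_irr = d_prime(hits=12, n_old=16, false_alarms=6, n_new=8)
score = relevant_attention(d_rel, d_irr)
print(score > 0)
```

A positive score indicates better discrimination of old from novel values on the relevant dimension than on the irrelevant one.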

Figure 3 shows d-prime and Relevant Attention scores by age, which suggest that all age groups tended to attend more to the relevant dimension. We analyzed these data to test whether Relevant Attention scores tended to be above 0 in each age group. For this analysis, we constructed a Bayesian hierarchical model in which each participant’s Relevant Attention score was drawn from a normal distribution. For each participant, the mean of this distribution was drawn from one of three higher-level normal distributions (all given the same weak priors), based on the participant’s age. Thus, we assessed whether Relevant Attention was greater than 0 based on whether the mean of the distribution for each age group tended to be greater than 0. All age groups had a high probability that Relevant Attention scores were greater than 0: 89% for four-year-olds, 90% for five-year-olds, and 90% for adults. Thus, category learning was accompanied by similarly greater attention to relevant versus irrelevant dimensions from age four to adulthood.
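A probability of this kind is typically read off the posterior samples as the share of sampled group means that exceed 0. The Python sketch below illustrates that calculation; the function name and the sample values are made up for illustration and are not the study's posterior draws.

```python
def prob_above_zero(posterior_means):
    """Share of posterior samples of a group mean that exceed 0."""
    return sum(m > 0 for m in posterior_means) / len(posterior_means)

# Made-up posterior draws of one age group's mean Relevant Attention score.
toy_draws = [0.2, -0.1, 0.4, 0.3]
p = prob_above_zero(toy_draws)
print(p)
```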

Discussion

Experiment 1 revealed that successful category learning was accompanied by greater attention to a category-relevant dimension than to a category-irrelevant dimension. However, this result does not disentangle whether selective attention only involves focusing on the relevant dimension, or additionally involves filtering the irrelevant dimension.

Experiment 2 was designed to disentangle these possibilities. To achieve this goal, Experiment 2 introduced a “switch” in dimension relevance halfway through category learning. Following the switch, the dimension that was initially relevant to category membership became irrelevant, and a different dimension instead became relevant. Critically, we manipulated whether participants had the opportunity to filter the dimension that became relevant after the switch before it became relevant. Specifically, participants in the Visible condition saw the dimension that only became relevant after the switch, so had the opportunity to learn to filter it while it was initially irrelevant. In contrast, participants in the Hidden condition did not have the opportunity to learn to filter this dimension because it was covered by bubbles before the switch. Thus, it should be more difficult to learn to categorize on the basis of the newly relevant dimension following the switch in the Visible versus Hidden condition only if selective attention involves filtering.