Author manuscript; available in PMC: 2023 Jun 1.
Published in final edited form as: J Exp Psychol Learn Mem Cogn. 2021 Feb 1;48(6):813–828. doi: 10.1037/xlm0000856

Conceptual Anchoring Dissociates Implicit and Explicit Category Learning

J David Smith 1, Brooke N Jackson 1, Markie N Adamczyk 2, Barbara A Church 1,2,a
PMCID: PMC8325699  NIHMSID: NIHMS1671738  PMID: 33523691

Abstract

Categorization researchers have long debated the possibility of multiple category-learning systems. The need persists for paradigms that dissociate explicit-declarative category-learning processes (featuring verbalizeable category rules) from implicit-procedural processes (featuring stimulus-response associations lying beneath declarative cognition). The authors contribute a new paradigm, using perfectly matched exclusive-or (XOR) category tasks differing only in the availability or absence of easily verbalizeable conceptual content. This manipulation transformed learning. The conceptual task alone was learned suddenly, by insightful rule discovery, producing explicit-declarative XOR knowledge. The perceptual task was learned more gradually, consistent with associative-learning processes, producing impoverished declarative knowledge. We also tested participants under regimens of immediate and deferred reinforcement. The conceptual task alone was learned through processes that survive the loss of trial-by-trial reinforcement. All results support the idea that humans have perceptual-associative processes for implicit learning, but also an overlain conceptual system that under the right circumstances constitutes a parallel explicit-declarative category-learning system.

Keywords: category learning, implicit cognition, explicit cognition, associative learning, category rules, procedural learning

Introduction

Categorization is an influential area in cognitive science because it is a crucial ability (e.g., Ashby & Maddox, 2011; Knowlton & Squire, 1993; Murphy, 2002; Nosofsky, 1986; Smith et al., 2008; Smith et al., 2016). Humans could have multiple systems or processes to manage the diverse demands of category tasks involving family-resemblance, rule-based, conjunctive, disjunctive, random, or ad hoc categories, just as they have multiple memory systems or processes. Indeed, memory and categorization systems might be meaningfully interdependent (e.g., explicit categorization and declarative memory; implicit categorization and procedural memory).

However, a theoretical countercurrent holds that assuming multiple categorization systems is unparsimonious and unjustified (e.g., Le Pelley et al., 2019). It suggests that a single system can predict the relevant categorization phenomena (e.g., Newell et al., 2010; Nosofsky et al., 2005) if one assumes that tasks vary in difficulty-complexity and that this variation produces empirical phenomena that are mistaken for the operation of separate systems. It is a lasting idea in cognitive science that singleness is elegant and scientifically preferable.

Thus, a sharp theoretical debate has persisted, especially focused on the possibility of dissociating explicit-declarative categorization processes (featuring verbal category rules) from implicit-procedural processes (featuring associations strengthened by processes akin to operant conditioning). Accordingly, the need persists for additional dissociative paradigms to help resolve this debate. Here, we contribute a new paradigm dissociating implicit and explicit categorization systems as described now.

Implicit-Procedural and Explicit-Declarative Categorization

Implicit category learning.

Our approach draws from the neuroscience of categorization (Ashby et al., 1998; Maddox & Ashby, 2004; Seger & Miller, 2010) that distinguishes different neural systems of learning. A hypothesized implicit system is energized by one of the brain’s primary reinforcement mechanisms. It likely grounds skill and habit learning (Mishkin et al., 1984) and learning during instrumental conditioning as well as some forms of discrimination learning and perceptual categorization (Ashby & Ennis, 2006; Barnes et al., 2005; Knowlton et al., 1996; O'Doherty et al., 2004; Seger & Cincotta, 2005). It is allied to various forms of associative learning, though it does not encompass all forms of associative learning (e.g., Pavlovian conditioning). This form of implicit learning occurs gradually and associatively, relying on trial repetition and immediate reinforcement (Maddox et al., 2003; Maddox & Ing, 2005). Participants learning implicitly may not be aware of their category knowledge or able to verbalize it (Ashby et al., 1998; Ashby & Ell, 2001). This implicit system is linked to particular parts of the basal ganglia. For example, extrastriate visual cortex projects to the tail of the caudate nucleus that then projects on to premotor cortex (Alexander et al., 1986). The caudate nucleus is well situated to associate percepts to actions, perhaps its primary role (Rolls, 1994; Wickens, 1993).

Thinking beyond neuroscience, readers will see that this implicit process has a long history within the literature on categorization. It was central to Shepard et al.’s (1961) founding exploration of the six logical classification tasks. They asked whether a unitary, associative mechanism could account for the tasks’ relative difficulties. It was central to Lee Brooks’ (e.g., 1978) seminal research exploring nonanalytic classification, by which participants associate category labels to specific remembered stimuli. It was the basis of exemplar theory and exemplar models (e.g., Nosofsky, 1986; Nosofsky et al., 1994). It dominated the comparative literature on categorization (because nonverbal animals may have only the associative process—Smith et al., 2004). This implicit process creates behavioral equivalence classes, allowing organisms to behave equivalently toward perceptually similar things. One may think of these equivalence classes as “categories” or not, and readers may differ on this point. Nonetheless, the learning of these classes has been central to the broader categorization literature, and therefore this associative process plays a central role in the multiple-systems debate.

Explicit category learning.

An explicit system of learning could be supported by declarative memory. Explicit learning occurs through hypothesis testing reliant on working memory (Fuster, 1989; Goldman-Rakic, 1987) and executive attention (Posner & Petersen, 1990). These cognitive utilities are known to support hypothesis testing and rule formation (e.g., Brown & Marsden, 1988; Robinson et al., 1980). This explicit system should learn quickly, perhaps suddenly through insightful discovery. Participants would construe the task for themselves and develop their own rule to guide performance. Their explicit category knowledge would be held in working consciousness and should generally be verbalizeable. This explicit neural system is likely grounded in the prefrontal cortex, the anterior cingulate gyrus, the head of the caudate nucleus, and the hippocampus (Ashby et al., 1998; Ashby & Ell, 2001).

Various dissociations have been demonstrated between implicit and explicit category learning (Ashby & Valentin, 2017). For example, Smith et al. (2014) used a regimen of deferred reinforcement that is one aspect of the present method. Participants completed a block of trials with no feedback. Then they received their positive outcomes grouped together and following that their negative outcomes grouped together. Reinforcement was displaced in time from trial performance. Knowing which trials were completed correctly was difficult and therefore associatively crediting reinforcement to particular stimulus-response pairs was impossible.

Given this feedback regimen, participants learned matched category structures thought to elicit implicit-procedural and explicit-declarative learning. Smith et al. (2014) observed a striking dissociation. Implicit learning was devastated under deferred reinforcement, because of the difficulty in assigning reinforcement credit already described. But explicit learning remained intact, because participants held in active mind the rule applied during the block, evaluated the rule’s success at block’s end, and kept or replaced it. Smith et al. (2018) found converging dissociative results using a different reinforcement regimen that is also incorporated here.

Empirical Goals

Though these dissociations support the dissociative framework described, that framework is not universally accepted. Therefore, we pursued three empirical goals to develop a distinctive dissociative paradigm that could advance the debate.

One empirical goal was to move this area beyond its dependence on the rule-based (RB) and information-integration (II) tasks that are often used to distinguish explicit and implicit learning (e.g., Ashby & Valentin, 2017). The RB-II dissociative framework is illuminating. We use it (e.g., Smith et al., 2012). However, it causes interpretative problems. One problem is that the nonidentical RB and II tasks are rotations of one another in perceptual space, so that the RB and II category solutions can be construed to have different dimensionalities. This raises questions about the difficulty-complexity of the tasks and can seem to give comfort to a single-system description. So, we sought a pair of implicit/explicit tasks that were identically structured in perceptual space and equated for stimulus dimensionality and other stimulus-to-stimulus relationships. We thought this might give us equivalent complexity and allow a stronger dissociative interpretation.

A second major empirical goal was to combine in our study several of the components that have created dissociative demonstrations in this literature. We hoped in this way to create within one paradigm a particularly strong and meaningful implicit-explicit dissociation. To this end, we studied the backward learning curves that can catch humans having explicit and instantaneous categorization insights. We studied category verbalization to distinguish explicit from implicit learning within a categorization task. We studied alternative reinforcement regimens that were theoretically predicted to differentially affect explicit and implicit learning.

A third major empirical goal was to produce our dissociation in a new way, by varying the kind and level of category knowledge that the implicit and explicit tasks foster. It has been a tacit assumption within the perceptual categorization literature, and within the RB-II area of study, that explicit category solutions take the form of conceptual rules that are often verbalizeable. This raised the possibility that we might produce a dissociation by varying the cognitive affordances of our tasks, so that one task fostered abstract-conceptual cognition more than the other. Then, by a range of converging measures, we could catch the explicit mind in the act of seizing these conceptual affordances. In this respect, the present approach represented constructive outreach from research on perceptual categorization to research on conceptual categorization and higher-level concepts (e.g., Medin & Wattenmaker, 1987; Murphy & Medin, 1985; Wattenmaker et al., 1986). It seems to us that there are possibilities for constructive cooperation and cross talk across the perceptual and conceptual areas of categorization research, though these have largely remained separate in the past. One crucial binding principle might be, for example, that explicit categorization is after all a privileged locus for conceptual and theory-based categorization.

Empirical Approach

We built matched pairs of XOR category tasks. XOR tasks have had a distinguished career in psychology (e.g., Feldman, 2000; Nosofsky et al., 1994; Shepard et al., 1961; Smith et al., 2004). For example, Shepard et al. (1961) discovered that the XOR learning trajectory and error pattern disconfirmed the unitary associative-learning theory they were exploring. Rather, participants seemed to be using rules and dimensional hypotheses, which Shepard et al. thought were likely carried by the explicit symbolic vehicle of language. It is remarkable that the literature on implicit and explicit category learning is still trying, after 60 years, to find its way back to Shepard et al. That the systems debate remains one of the central debates in categorization after six decades underscores the theoretical importance of resolving it.

Our task pairs presented a crucial contrast. Though the tasks were logically identical, with the tasks’ stimuli placed and spaced apart identically in the same two-dimensional space, one task was configured so that it might make conceptual content discoverable as the task unfolded trial by trial. This task presented an abstract-conceptual affordance that could support correct classification. The other task simply had its stimulus values shifted globally (a simple coordinate translation through stimulus space), so that it did not present such an affordance that we (or, as it turned out, participants) could discern. In this latter task, we thought that perceptual appearance and associative learning might dominate. We will refer to the conceptual and perceptual tasks, respectively, to communicate this distinction.

Participants completed tasks under the contrasting reinforcement regimens of deferred and immediate reinforcement (e.g., Smith et al., 2014, 2018). Deferred reinforcement appears to suppress an important kind of reinforcement-based associative learning, while leaving relatively intact participants’ processes of hypothesis testing and explicit rule learning. Thus, this contrast seemed apt to dissociate in an additional way implicit and explicit category-learning processes.

We predicted that 1) the conceptual and perceptual tasks would be learned rapidly and slowly, respectively; 2) the tasks would be learned suddenly and gradually, respectively; 3) only the conceptual task would be robustly learnable under deferred reinforcement; 4) only the conceptual task would elicit clear verbalizations of the category principle; and 5) only the perceptual task (in our view, requiring associative learning) would become essentially unlearnable under deferred reinforcement (in our view, disabling the necessary type of associative learning).

We believe that no single-system description can make these predictions. A single-system description cannot predict both sudden insight learning and slow gradual associative learning (e.g., Smith & Ell, 2015). It cannot explain why the two kinds of category knowledge—perceptual and conceptual—would obey different processing principles. There are not perceptual and conceptual levels in a single system. It cannot explain why deferred reinforcement disrupts one kind of learning but not the other if attentional difficulty is held constant. From the single-system perspective, there must not be two forms of learning. Finally, the single-system viewpoint cannot explain why one form of category knowledge would be conscious and verbalizeable and one not. We acknowledge that specifying the single-system’s predictions is fraught, because its proponents have not clearly defined the single system they endorse. For example, Le Pelley et al. (2019, p. 1408) explicitly acknowledged that they could not provide “rigorous definitions of the interrelated constructs of cognitive complexity, memory demands, and task difficulty” on which the single-system idea depends heavily.

Nonetheless, confirming our five converging positive predictions would strongly suggest that humans have perceptual-associative processes for implicit learning, but also an overlain conceptual system that under the right circumstances constitutes a parallel explicit-declarative system for category learning. In fact, because the explicit and implicit processes considered in this article stand so diametrically opposed across many dimensions of human cognition, they could turn out to represent the clearest possible dissociation between two learning systems.

Experiment 1

Method

Participants.

One hundred and eighty-two Georgia State undergraduates¹ with normal or corrected vision participated for partial course credit. Sessions lasted for 52 minutes or 480 trials. Participants were assigned randomly to a task and reinforcement condition using their sequential participant number. Because of the need to supply different verbal instructions, immediate reinforcement participants (receiving immediate feedback after every trial) and deferred reinforcement participants (receiving deferred feedback only after an ensuing trial had been completed) were tested separately in groups of up to four at a time. Participants’ data were excluded if they completed fewer than 480 trials or did not complete a final questionnaire². The final data set included 40, 31, 39, and 28 participants divided, respectively, among these conditions: conceptual task/immediate reinforcement (CI), conceptual task/deferred reinforcement (CD), perceptual task/immediate reinforcement (PI), and perceptual task/deferred reinforcement (PD). Twelve participants were excluded for completing too few trials (0, 2, 2, and 8, respectively, in the four conditions). Thirty-two participants were excluded for not completing a final questionnaire (4, 15, 3, and 10 participants, respectively). These exclusions were principled and necessary: they let us equate learning experience, analyze completed protocols, and study performance levels in relation to participants’ introspections and verbalized rules.

Stimuli.

The stimuli were red rectangles, varying in width and height, presented on a black background in the computer screen’s top center (Figure 1). There were four stimuli in each task, two contrasting but memorizeable stimuli in each category. Category A stimuli occupied the lower-left and upper-right quadrants of the stimulus space. Category B stimuli occupied the upper-left and lower-right quadrants. These placements honored the tasks’ XOR structure.

Figure 1. Stimuli used in Experiment 1.

Note: Boxes in the right upper and left lower quadrants were category A. The left upper and right lower stimuli were category B. A.) Stimuli presented in the conceptual task. B.) Stimuli in the perceptual task.

The widths and heights of stimuli varied between tasks. In the conceptual task, the Category A stimuli were 65 × 44 and 130 × 88 screen pixels (width × height). These dimensions produced squares on our running screens, given that screen pixels are taller than they are wide. This abstract property of Category A stimuli provided a conceptual entry into this task beyond just memorizing four shapes and their correct category responses. The B stimuli were 65 × 88 and 130 × 44 pixels: respectively, nondescript standing-up and lying-down rectangles.

In the perceptual task, we simply added 100 pixels to the widths of all four stimuli. The A stimuli were 165 × 44 and 230 × 88 pixels. The B stimuli were 165 × 88 and 230 × 44 pixels. Now all stimuli were lying-down rectangles, highly discriminable and memorizeable as shapes but presumably not conceptually codeable in some additional way that could support correct categorization. Of course, participants might have discovered some conceptual construal of the task that we could not discern. For this reason, the experiment’s results will offer their own comment on this presumption. Even absent such a conceptual cue, the four stimuli were mutually discriminable as shown in Figure 1. They varied by a factor of roughly 3 in area, and in shape as well. The stimuli were also easily perceptible, brightly colored, and large.
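The relationship between the two stimulus sets can be made concrete with a short sketch. It takes the pixel values given above, applies the 100-pixel width translation, and checks two things: that both sets follow the same XOR assignment, and that only the conceptual Category A stimuli share a single width-to-height ratio (the property that made them look like squares on screen). The function names, the midpoint cutoffs, and the ratio check are illustrative assumptions, not part of the original method.

```python
from fractions import Fraction

# Widths and heights in screen pixels, as reported in the text.
CONCEPTUAL = {
    "A": [(65, 44), (130, 88)],
    "B": [(65, 88), (130, 44)],
}

SHIFT = 100  # pixels added to every width to create the perceptual task


def translate(task, dx):
    """Shift every stimulus width by dx pixels, leaving heights unchanged."""
    return {cat: [(w + dx, h) for (w, h) in stims] for cat, stims in task.items()}


def xor_label(width, height, width_cut, height_cut):
    """Hypothetical XOR rule: 'A' when width and height fall on the same
    side of their cutoffs (both low or both high), otherwise 'B'."""
    wide = width > width_cut
    tall = height > height_cut
    return "A" if wide == tall else "B"


def follows_xor(task):
    """Check the listed category labels against the XOR rule, using the
    midpoint of each dimension as the cutoff (an assumption)."""
    widths = [w for stims in task.values() for (w, _) in stims]
    heights = [h for stims in task.values() for (_, h) in stims]
    w_cut = (min(widths) + max(widths)) / 2
    h_cut = (min(heights) + max(heights)) / 2
    return all(
        xor_label(w, h, w_cut, h_cut) == cat
        for cat, stims in task.items()
        for (w, h) in stims
    )


def aspect_ratios(task, category):
    """Exact width-to-height ratios (in pixels) for one category."""
    return [Fraction(w, h) for (w, h) in task[category]]


PERCEPTUAL = translate(CONCEPTUAL, SHIFT)
```

Under these assumptions, both tasks satisfy `follows_xor`, but the translation breaks the shared ratio: the conceptual A stimuli have one common width-to-height ratio, whereas the perceptual A stimuli have two different ratios, so no single shape description covers them.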

Categorization trials.

On each trial, the rectangle that was to be categorized appeared at the screen’s far right. Leftward were the large letters “A” and “B”, with a participant-controlled cursor between them. Participants responded by pressing the S and L keys, which were labelled A and B and corresponded spatially to the two screen icons. Top and Bottom trials were displayed at the screen’s top and bottom, for reasons to be explained.

Reinforcement regimens.

Our crucial manipulation was to disrupt the normal cycle of immediate reinforcement following response. In the immediate reinforcement conditions, this cycle was sustained. Participants saw a stimulus, categorized it, and received immediate reinforcement. After correct responses, they saw Correct +1 Points Total Points N+1. After incorrect responses, they saw Incorrect −1 Points Total Points N-1. In the latter case they received a brief penalty timeout.

In the deferred reinforcement conditions, feedback was deferred as follows. Following a Bottom trial, participants received deferred feedback regarding the previous Top trial. For example, they might see (given a correct response) presented at the top of the screen in the position for Top trial feedback, Last Trial Correct +1 Points Total Points N+1. Or, following a Top trial, they might see (given an incorrect response) presented at the bottom of the screen in the position for Bottom trial feedback, Last Trial Incorrect −1 Points Total Points N-1. In the latter case they received a brief penalty timeout. The next trial then followed.

The deferred feedback was positioned spatially to show to which trial the feedback pertained—this was the purpose of the alternating Top and Bottom trials (see Figure 2 for an example). This reinforcement did not concern a presently available stimulus, or the most recent stimulus presentation, or the most recent behavioral response. Thus, associative learning was disrupted representationally, temporally, and behaviorally.
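The one-trial lag can be sketched as a small scheduling function. This is a minimal illustration of the regimen as described above, not the experiment's actual software. The name `feedback_events`, the tuple layout, and the choice that odd-numbered trials are Top trials are assumptions, and the sketch does not model the feedback for a block's final trial.

```python
def feedback_events(responses, correct_answers, deferred):
    """List the feedback events in the order a participant would see them.

    Each event is (shown_after_trial, refers_to_trial, screen_position, outcome).
    Trials are numbered from 1. Odd trials are treated as Top trials and even
    trials as Bottom trials (an assumption about which position comes first).
    """
    events = []
    lag = 1 if deferred else 0  # deferred feedback trails by one trial
    for trial in range(1, len(responses) + 1):
        target = trial - lag  # the trial that this feedback describes
        if target < 1:
            continue  # nothing to report before the first trial is scored
        position = "Top" if target % 2 == 1 else "Bottom"
        is_correct = responses[target - 1] == correct_answers[target - 1]
        outcome = "Correct" if is_correct else "Incorrect"
        events.append((trial, target, position, outcome))
    return events


# Invented responses for illustration only.
responses = ["A", "B", "B", "A"]
answers = ["A", "A", "B", "A"]
immediate = feedback_events(responses, answers, deferred=False)
lagged = feedback_events(responses, answers, deferred=True)
```

In the deferred case, every event refers to the trial before the one just completed and appears in that earlier trial's screen position, which is the spatial cue the alternating Top and Bottom trials were meant to provide.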

Figure 2. Trial and feedback structure in the deferred condition in Experiment 1.

Instructions: immediate reinforcement condition.

Participants were told they would categorize boxes as Category A or B, that A and B boxes would occur equally often, and that they would have to guess at first but could learn to respond correctly. They were told that even though the boxes alternated top and bottom on the screen, this had nothing to do with their Category A or B status. They knew they would gain or lose points for correct and incorrect responses, respectively, and that they would receive a timeout for errors. They were told that errors would cost them points, and time to earn points, and possibly make their session longer. They were told they would receive immediate feedback after each trial.

Instructions: deferred reinforcement condition.

The instructions were similar, excepting the description of reinforcement. Participants were told that after Trial 2 they would receive feedback from Trial 1, and so forth, with feedback always lagging one trial behind. They were told that even though the boxes alternated top and bottom on the screen, this had nothing to do with their Category A or B status, but was done to help them keep track of whether the feedback they received applied to a top or bottom trial.

Debriefing phase.

An additional aspect of our method was to collect participants’ explicit reports of their task construals, especially in the conceptual tasks that allowed a conceptual construal. Therefore, we gave them a debriefing questionnaire containing these items:

Why did you use Response A in this task? That is, what WAS a Type A Stimulus?

Why did you use Response B in this task? That is, what WAS a Type B Stimulus?

These let us analyze participants’ declarative understanding of the categories they had learned.

Analytic methods: questionnaires.

The questionnaire items were independently blind coded by two research assistants in our laboratory. They were instructed to give a questionnaire a 1 or 2 if the participant indicated that they simply used either the height or the length of the box to categorize. They were to give a questionnaire a 3 if the participant said “A’s were squares and B’s rectangles” (the correct verbalization of the conceptual task), or a 4 if they said the equivalent of “A’s are shorter and narrower or taller and wider; B’s are taller and narrower or shorter and wider” (the correct verbalization of the perceptual task). In essence, the latter verbalization simply described the four stimuli in an item-specific manner, because in this case there is no constructive conceptual reframing. Codes 5 and 6 were used for explanations like those for codes 3 and 4 but flipping which were A’s and B’s. Raters used code 7 for anything else. A third rater provided back-up ratings that were used only to resolve the rare disagreements between the two students, who were blind to each other’s ratings and also blind to the level of performance achieved by the participant who completed the questionnaire. Interrater reliability between the first two raters was 85.5% agreement, Cohen’s κ = .777. This approach to coding participant reports, using a third rater to break ties, was somewhat conservative regarding testing our hypotheses, because it ensured that we retained for analysis all of the difficult and uncertain reports that caused our original raters difficulty. Finally, a fourth blind rater coded approximately 25% of the data, and showed a concordance of 91.43% agreement, Cohen’s κ = .863, with the final coding.
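For readers unfamiliar with the agreement statistics, the following sketch shows how percent agreement, Cohen's κ, and a third-rater tie-break could be computed from two raters' codes. It uses the standard κ formula, (observed agreement minus chance agreement) divided by (1 minus chance agreement). The function names and the example codes are invented for illustration; they are not the study's data or the authors' scripts.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Proportion of questionnaires given the same code by both raters."""
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two raters assigning nominal codes to the same items."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement: product of the raters' marginal proportions per code.
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


def resolve_code(code_a, code_b, code_c):
    """Keep the shared code when raters agree; otherwise use the third rater."""
    return code_a if code_a == code_b else code_c


# Invented codes (1 to 7) for ten hypothetical questionnaires.
rater_1 = [3, 3, 4, 7, 1, 3, 4, 7, 3, 4]
rater_2 = [3, 3, 4, 7, 1, 3, 7, 7, 3, 3]
rater_3 = [3, 3, 4, 7, 1, 3, 4, 7, 3, 3]

agreement = percent_agreement(rater_1, rater_2)
kappa = cohens_kappa(rater_1, rater_2)
final_codes = [resolve_code(a, b, c) for a, b, c in zip(rater_1, rater_2, rater_3)]
```

Because the tie-break only applies to disagreements, every questionnaire receives a final code and none is discarded, which is the sense in which the procedure retains the difficult reports.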

Results

Accuracy analyses.

Figure 3 presents proportions of correct responses for each condition by 20-trial block. Table 1 presents the means and standard deviations of accuracy overall and in the last 100 trials. The data from Figure 3 were entered into a three-way general linear model (GLM) with task (conceptual, perceptual) and reinforcement (immediate, deferred) as between-participant factors and trial block as a within-participant factor. The significant main effects of task, F (1, 134) = 46.265, p < .001, ηp2 = .257, and reinforcement, F (1, 134) = 36.109, p < .001, ηp2 = .328, showed that participants performed better in the conceptual task and the immediate condition. The significant main effect of trial block, F (23, 3082) = 22.770, p < .001, ηp2 = .145, confirmed that learning occurred. The significant interactions between task and reinforcement, F (1, 134) = 3.930, p = .049, ηp2 = .028, block and reinforcement, F (23, 3082) = 4.064, p < .001, ηp2 = .033, and the three-way interaction, F (23, 3082) = 3.930, p < .001, ηp2 = .029, suggested that reinforcement differentially affected performance across tasks. The nonsignificant interaction between task and block, F < 2, suggested that the learning trajectory was similar in the two tasks. Parallel GLM analyses were also conducted adding in the participants dropped for missing questionnaires, as requested by an interested reviewer. These found the same pattern of results except that the task by block interaction became significant, F (23, 3818) = 2.874, p < .001, ηp2 = .017, with larger N’s in each group (CI = 46, CD = 44, PI = 42, PD = 38).
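As a reading aid, partial eta squared can be recovered from a reported F ratio and its degrees of freedom through the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to two of the effects above. It is a generic conversion offered under the assumption that no correction was applied to the degrees of freedom; the text does not state how the reported values were computed, and the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F ratio and its degrees of freedom."""
    explained = f_value * df_effect
    return explained / (explained + df_error)


# Main effect of task: F (1, 134) = 46.265
task_effect = round(partial_eta_squared(46.265, 1, 134), 3)

# Main effect of trial block: F (23, 3082) = 22.770
block_effect = round(partial_eta_squared(22.770, 23, 3082), 3)
```

Under that assumption, these two conversions give .257 and .145, matching the values reported for the task and trial-block main effects.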

Figure 3. Proportion correct across 20-trial blocks in the four conditions.

Note: CI = Conceptual Immediate, CD = Conceptual Deferred, PI = Perceptual Immediate, PD = Perceptual Deferred.

Table 1.

Means and standard deviations (SD) of proportion correct across all trials and in the last 100 trials in each condition.

Condition | N | Mean (all trials) | SD (all trials) | Mean (last 100 trials) | SD (last 100 trials)
Conceptual Immediate | 40 | .954 | .090 | .979 | .079
Conceptual Deferred | 31 | .795 | .209 | .838 | .235
Perceptual Immediate | 39 | .828 | .163 | .907 | .157
Perceptual Deferred | 28 | .566 | .127 | .530 | .158

To clarify the three-way interaction, we conducted three two-factor GLMs with task and reinforcement as between-participant factors. One used the difference score between the first and last block as the dependent variable to encompass the whole learning trajectory. The other two used the first and last blocks separately as the dependent measure to explore the early or late focus of the trajectory differences. When examining the learning scores, there was a significant main effect of reinforcement, F (1, 134) = 18.200, p < .001, ηp2 = .120, reflecting more learning with immediate reinforcement. The crucial significant interaction, F (1, 134) = 14.416, p < .001, ηp2 = .097, reflected sharply reduced learning in the PD condition (all other Fs < 1). Planned comparisons found that immediate and deferred learning levels were not statistically distinguishable in the conceptual task, t (69) = .306, p = .760, d = .071, but were significantly different in the perceptual task, t (65) = 6.385, p < .001, d = 1.662. The very high performance (.907 correct) shown by participants in the last 100 trials of the PI condition confirms the full discriminability of the stimuli in that task and their consequent individual memorizeability.

Separate analyses of the first and last blocks found significant main effects of task, F (1, 134) = 30.528, p < .001, ηp2 = .186, F (1, 134) = 28.577, p < .001, ηp2 = .176 (first and last block, respectively), and reinforcement, F (1, 134) = 4.713, p = .032, ηp2 = .034, F (1, 134) = 63.145, p < .001, ηp2 = .320, indicating that the conceptual task and immediate reinforcement produced faster learning and more total final learning. (Indeed, this speed advantage was expressed even in the elevated correct proportions for participants in Block 1, some of whom were already discovering the conceptual rule.) There was a significant interaction between task and reinforcement only in the last block (first block: F < 2; last block: F (1, 134) = 13.443, p < .001, ηp2 = .091), reflecting bigger differences in final learning between the reinforcement conditions in the perceptual task, but less differential effect in initial learning. Planned comparisons of immediate versus deferred reinforcement in each task (conceptual and perceptual, respectively) during the first block, t (69) = 2.027, p = .047, d = .476, t (65) = .914, p = .364, d = .229, and the last block, t (69) = 3.162, p = .002, d = .723, t (65) = 7.862, p < .001, d = 1.907, supported this interpretation.

Backward learning curves.

Following Smith et al. (2014) and Smith and Ell (2015), we examined the suddenness of learning in the different conditions. We defined learners as those who at any point completed three consecutive 20-trial blocks with 0.95 accuracy. In the four conditions (CI, CD, PI, PD), there were 39, 23, 31, and 3 learners, respectively. Chi-square analyses found that the proportions of learners in the four conditions differed significantly from those expected if there were no relationship between condition and learning, χ2 (3, N = 138) = 61.443, p < .001; w = .667. Parallel chi-square analyses including the participants dropped for missing questionnaires found the same significant result with a greater number of learners (CI = 43, CD = 34, PI = 34, PD = 4).

Next, we studied the trajectory by which participants arrived at criterion. To do so, we created backward learning curves (BLCs). That is, we aligned the trial blocks at which learners reached criterion, and we examined performance levels backward and forward from that point. Readers should note the following idiosyncrasy of BLCs. Participants who reached criterion very quickly in the task will have fewer pre-criterion data points but more post-criterion data points. For late learners, the pattern is reversed. Thus, different participants and different numbers of participants are captured by different data points in a BLC graph. The alignment of all participants’ performance at the start of the criterion run lets one see the consensual pathway toward criterion taken by learners. Without alignment, the arrival at criterion would fall in many different blocks for many different participants, and any consensual pathway would be muddied away. Figure 3 presented the data in this unaligned way; this is the idiosyncrasy of forward learning curves.
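The alignment procedure can be sketched in a few lines. The code below finds each participant's criterion block (the first block of the first run of three consecutive blocks at or above .95) and averages accuracy by block position relative to that point. Treating the criterion as "at least .95", and the function names themselves, are assumptions made for illustration; this is not the authors' analysis code, and the example accuracies are invented.

```python
def criterion_block(block_accuracy, threshold=0.95, run_length=3):
    """Index of the first block of the first qualifying run, or None.

    A qualifying run is run_length consecutive blocks with accuracy at or
    above threshold (the 'at or above' reading is an assumption).
    """
    for start in range(len(block_accuracy) - run_length + 1):
        window = block_accuracy[start:start + run_length]
        if all(acc >= threshold for acc in window):
            return start
    return None


def backward_learning_curve(participants):
    """Mean accuracy and number of learners at each block position relative
    to criterion (Block 0). Participants who never reach criterion are skipped."""
    sums, counts = {}, {}
    for blocks in participants:
        start = criterion_block(blocks)
        if start is None:
            continue  # non-learners do not contribute to the curve
        for index, acc in enumerate(blocks):
            rel = index - start  # negative before criterion, 0 at criterion
            sums[rel] = sums.get(rel, 0.0) + acc
            counts[rel] = counts.get(rel, 0) + 1
    return {rel: (sums[rel] / counts[rel], counts[rel]) for rel in sorted(sums)}


# Invented block accuracies: a late learner, an early learner, a non-learner.
late = [0.50, 0.55, 1.00, 1.00, 0.95, 1.00]
early = [1.00, 0.95, 1.00, 0.90]
never = [0.50, 0.60, 0.50, 0.55]
curve = backward_learning_curve([late, early, never])
```

In this toy example the pre-criterion points rest on one learner while the post-criterion points rest on two, which illustrates why different data points in a BLC graph reflect different numbers of participants.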

Illustrating the BLC analysis, Figure 4 shows BLCs for each condition. The solid black circles show the performance of 39 learners in the CI condition. Performance transformed at the point of the criterion run. In the 5 blocks before that point, participants averaged .550 correct, essentially chance performance. In the 5 blocks after criterion, they averaged .992 correct (a sudden improvement of .442, or about 9 more correct responses per 20-trial block) and sustained that level going forward. We believe that this pattern is consistent with, and only consistent with, the sudden discovery of a conceptual category rule (e.g., Smith & Ell, 2015). No operant associative-learning mechanism produces a learning curve of this character. This pattern of instantaneous, insightful rule discovery diagnoses uniquely the explicit category-learning processes that for some reason have remained so controversial within the categorization literature. Shortly we will add verbalizations (declarative category knowledge) into the mix.

Figure 4. Backwards learning curves for learners in Experiment 1’s four conditions.

Note: A. CI and CD conditions. B. PI and PD conditions. We aligned the trial blocks at which participants reached criterion (Block 0), to show the path by which they solved their task. C = Conceptual, P = Perceptual, I = Immediate Reinforcement, D = Deferred Reinforcement.

From this conclusion follow predictions for the CD condition. First, because testing hypotheses is more complicated under deferred reinforcement, we expected fewer criterial learners. There were 23 learners, not 39 as before. Nonetheless, we expected to see the same saltatory leap of rule discovery among the successful learners. And we observed just that (.552 correct, 5 blocks before criterion, .986 correct, 5 blocks after criterion, a transition of .434; Figure 4A, grey triangles).

In contrast, in the PI condition, we expected rule discovery to occur less strongly because the conceptual content was less discoverable given the nature of the rectangle stimuli. However, under immediate reinforcement, participants could still learn associatively, gradually building up response strengths binding correct responses to the four specific stimuli. Or, they could use exemplar memorization to connect the four discriminable stimuli to correct responses. It is good to remember that implicit learners may well not be learning the formal XOR concept, or even a differentiation between two coherent categories, but rather learning associative connections from specific stimuli to correct response choices. In fact, we did have 31 learners in this condition. But learning obeyed a different dynamic (black circles in Figure 4B). Now there was a more gradual improvement in performance approaching the criterion run, as befits the gradual strengthening of correct stimulus-response mappings (.726 correct, 5 blocks before criterion, .976 correct, 5 blocks after criterion, a transition of .251). More precisely, performance increased from .777 in the last block before criterion to .953 at the criterion block itself. This was a performance change of .176, a minimal transition (i.e., less than 4 more correct responses per block) compared to the conceptual conditions. We also confirmed these differences statistically through comparisons of the corrected3 criterion scores (Block 0 minus Block −1) for learners in each task. We found that when compared to the perceptual immediate condition both the conceptual immediate, t(68) = 5.201, p < .001, d = 1.242, and the conceptual deferred conditions, t(52) = 4.772, p < .001, d = 1.318, showed significantly greater jumps in learning.

Moreover, some of the small transition in the PI condition is artifactual. The criterial blocks must contain high performance levels. They cannot contain any block of 16/20 (80%) or lower, for then criterion is unreachable. Such a block would be relegated to the phase before criterion, creating an artifactual pre-post separation. Similarly, the block just before criterion essentially cannot contain any block of 17/20 (85%) or higher, or this block would just serve to define criterion earlier. The artifactual separation is accentuated. Smith and Ell (2015) discussed this selective sampling aspect of BLCs in detail.

Simulations.

We used formal modeling to examine our results from this perspective. We placed simulated participants into a 100-block “task”, with performance governed by a pre-criterion and post-criterion underlying competence that we could let vary systematically from 0.5 to 1.0. In our taskless simulation, correct performance was simply determined by the throw of a 100-sided die weighted by the operative competence (e.g., 80 successes in 100 for a 0.80 competence). Given this framework, we could isolate the simulants that 1) could reach the defined criterion used in the actual experiment; 2) could produce the performance we observed before criterion; and 3) could produce the performance we observed after criterion. In this way, we could ask what underlying competence levels our real participants likely had at different points in the task.
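The logic of such a taskless simulation can be sketched as follows. This is only an illustration under stated assumptions, not the authors' model: the criterion used here (three consecutive blocks at or above .85), the randomly placed change point, and the choice to count Block 0 in the post-criterion window are all placeholders, and every function name is hypothetical.

```python
import random

TRIALS_PER_BLOCK = 20
N_BLOCKS = 100

def simulate_blocks(pre, post, change_block, rng):
    """Per-block proportions correct for one simulated participant.

    Competence is `pre` before `change_block` and `post` from it onward.
    Each trial succeeds independently with the operative probability.
    """
    blocks = []
    for b in range(N_BLOCKS):
        p = pre if b < change_block else post
        correct = sum(rng.random() < p for _ in range(TRIALS_PER_BLOCK))
        blocks.append(correct / TRIALS_PER_BLOCK)
    return blocks

def criterion_start(blocks, run_length=3, threshold=0.85):
    """Index of the first block that starts a criterion run, or None.

    ASSUMPTION: run_length and threshold are placeholders, not the
    criterion actually used in the experiments.
    """
    for start in range(len(blocks) - run_length + 1):
        if all(x >= threshold for x in blocks[start:start + run_length]):
            return start
    return None

def simulate_condition(pre, post, n=200, seed=0, window=5):
    """Mean performance before and after criterion across simulants.

    ASSUMPTION: the post-criterion window starts at Block 0. Simulants
    that never reach criterion, or that lack a full window on either
    side of it, are dropped.
    """
    rng = random.Random(seed)
    pre_means, post_means = [], []
    for _ in range(n):
        change = rng.randrange(window, N_BLOCKS - window)
        blocks = simulate_blocks(pre, post, change, rng)
        start = criterion_start(blocks)
        if start is None or start < window or start + window > N_BLOCKS:
            continue
        pre_means.append(sum(blocks[start - window:start]) / window)
        post_means.append(sum(blocks[start:start + window]) / window)
    if not pre_means:
        return None
    return sum(pre_means) / len(pre_means), sum(post_means) / len(post_means)

# A large leap in competence versus no change in competence at all.
print(simulate_condition(0.55, 0.99))
print(simulate_condition(0.85, 0.85))
```

Running the second call with a constant competence is a way to see the selective-sampling artifact: some pre-post separation appears even though underlying competence never changes. Sweeping `pre` and `post` over a grid and keeping the pairs whose output matches the observed means would approximate the search described in the text.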

Figure 5A shows the result of our simulation for the CD condition we have already discussed. This graph illustrates all the levels of competence before and after criterion that can produce the data pattern we observed (i.e., .552 and .986 before and after criterion). Competence before criterion must be extremely low (maximum 0.69). Competence after criterion must be extremely high (minimum 0.90). There is a minimum gap of 0.21 between these. This inherent gap represents the pure leap of insight or competence that we are considering in this article. The formal simulation confirms that this gap must exist, just as did the extensive simulations in Smith and Ell (2015).

Figure 5. A formal model isolated all the values of underlying pre- and post-criterion competence (black dots) that can reproduce the observed levels of pre- and post-criterion performance in Experiment 1.

Note: A. CD condition. B. PI condition. For example (A), these participants had a maximum underlying competence of .69 before criterion and a minimum .90 competence after reaching criterion. There was an instantaneous increase of at least .21 in this transition.

Figure 5B shows in the same way the result of our simulation for the PI condition. This graph illustrates all the levels of competence before and after criterion that can produce the data pattern we observed (i.e., .726 and .976 before and after criterion). Competence after criterion can sink down into the mid 0.80’s and still produce what we observed in that phase. But competence before criterion in the mid 0.80’s can also predict what we observed in that phase. Thus, this simulation is consistent with the possibility that in the PI condition there was not any intrinsic leap at all. Only in the conceptual conditions must one conclude that there is a sudden leap in competence. Smith and Ell (2015) showed that gradualistic associative models cannot accommodate these sudden performance improvements.

Finally, we point out that the results from the PD condition also fit our predictions. Our theoretical perspective suggested that the perceptual task would deny participants the explicit learning process that sometimes survives deferred reinforcement. And the deferred reinforcement would deny participants the complementary implicit-associative process. Learning should have collapsed, and it did. There were 3 criterial learners, about 10% of the sample. In the aggregate there was almost no learning at all.

Verbalizations.

Fifty-nine of 71 participants (83%) were coded to have correctly described the conceptual condition. In the CI condition, with 40 total participants, with 39 strong learners, 38 participants (all of them learners) stated the task’s conceptual grounding. There was essentially a perfect concordance between successful learning and the task’s conceptual construal. In the CD condition, with 31 total participants, with 23 strong learners, 21 participants (all of them learners) reported the task’s conceptual grounding. All non-learners failed to report this construal. There was a nearly perfect concordance between learning and a conceptual declaration again. In both conceptual conditions, participants’ explicit-declarative category knowledge accounted extremely sensitively for their performance and for the suddenness of their rule discovery. Thus, the verbalizations were also consistent with the theoretical interpretation that explicit-declarative learning characterized the conceptual conditions. These participants discovered the alternative conceptual reframing afforded by the task.

On the other hand, only 12 of the 67 participants in the perceptual tasks (18%) were coded to have correctly described the perceptual condition. This percentage is more than four times smaller than we observed in the conceptual conditions. Moreover, of those 12, five showed their knowledge by drawing rather than verbalizing it. The others simply described more or less fully all four specific stimuli that appeared within the task. This confirms our sense as we prepared the experiment that the perceptual task would not provide to participants an alternative abstract-conceptual route to reframing the task, so that they would need to fall back on memorizing the four stimuli. These participants apparently did not discover an alternative conceptual reframing of their task. In the immediate reinforcement condition, with 39 total participants and 31 strong learners, 10 were coded as correctly describing the task (i.e., correctly describing the four stimuli presented under the correct category label). Of those, eight were learners (two were nonlearners) and four drew rather than verbalized their description. In the deferred reinforcement condition, with 28 total participants and 3 strong learners, 2 of those strong learners were coded as correctly describing the XOR rule. One drew the stimuli, and one verbalized their correct description. Learning in the perceptual task was thus more visual and more about specific-item associations to the four stimuli. The availability of these specific-item associations confirms in another way the full individual discriminability of the stimuli in the perceptual task.

Experiment 2

Experiment 2 explored these phenomena in a procedure that differed in three respects. First, we used a different stimulus domain, involving multiple circle stimuli to be judged relationally, instead of one rectangular stimulus. Second, we adopted a different reinforcement regimen, one involving deferred reinforcement delivered only after the completion of each six-trial block. Third, the available conceptual construal was now about the abstract conceptual relation (Same or Different) between two stimuli, rather than a shape label given one stimulus. In many other respects, the methods were like those in Experiment 1 and so we only note the points of contrast.

Method

Participants.

One hundred and eighty Georgia State undergraduates4—with normal or corrected vision—participated for partial course credit. Sessions lasted for 52 minutes or 300 trials. Participants’ data were excluded if they completed fewer than 300 trials or did not complete a final questionnaire. Participants were assigned randomly to a task and reinforcement condition using their sequential participant number. The final data set after principled exclusions included 171 GSU participants divided among the four conditions as follows: CI condition (46), CD condition (43), PI condition (42), and PD condition (40). Five participants were excluded for not enough trials (0, 2, 1, and 2, respectively, in the mentioned conditions). Four participants were excluded for not completing a final questionnaire (3 in the CD condition and 1 in the PI condition).

Stimuli.

On each trial, participants saw a pair of circle stimuli, in red (left) and green (right), that always just touched at the 3 o’clock tangent point. The four XOR stimuli were placed in perceptual space as already described. In the conceptual task, the A stimuli had left and right radii of 25-25 and 50-50. Category A stimuli instantiated the abstract relation of SAME. Category B stimuli had left and right radii of 25-50 and 50-25. They instantiated the abstract relation of DIFFERENT (Figure 6A). In the perceptual task, we simply added 45 to the radius of all left circles. Once again, this was a simple coordinate translation through perceptual space. Now all trials presented differently sized circles, so all trials were DIFFERENT (Figure 6B). The translation through perceptual space only had the effect of changing the task’s conceptual affordances.
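As a small illustration of this translation, the following sketch builds the four stimuli of each task from the radii given above and checks which relation each pair instantiates. The variable and function names are hypothetical; only the radii and the shift of 45 come from the text.

```python
# Radii (left, right) for the four XOR stimuli in the conceptual task.
CONCEPTUAL = {
    "A": [(25, 25), (50, 50)],  # SAME
    "B": [(25, 50), (50, 25)],  # DIFFERENT
}
LEFT_SHIFT = 45  # added to every left radius in the perceptual task

def translate(stimuli, shift=LEFT_SHIFT):
    """Shift every left radius by a constant, leaving right radii unchanged."""
    return {
        category: [(left + shift, right) for left, right in pairs]
        for category, pairs in stimuli.items()
    }

def relation(pair):
    """Abstract relation instantiated by a pair of circles."""
    left, right = pair
    return "SAME" if left == right else "DIFFERENT"

PERCEPTUAL = translate(CONCEPTUAL)

for name, task in (("conceptual", CONCEPTUAL), ("perceptual", PERCEPTUAL)):
    for category, pairs in task.items():
        print(name, category, [(pair, relation(pair)) for pair in pairs])
```

The category assignments and the geometry of the XOR layout are unchanged by the shift; only the SAME/DIFFERENT relation stops separating the categories.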

Figure 6. Stimuli used in Experiment 2.

Note: Circles in the right upper and left lower quadrants were category A. The left upper and right lower stimuli were category B. A. Stimuli presented in the conceptual task. B. Stimuli in the perceptual task.

Here, too, the individual stimuli (circle pairs) were obviously discriminable from one another and individually memorizeable for that reason. They varied strongly in overall size and in the circles’ differential areas. The performances achieved by learners after criterion will strongly confirm this claim of discriminability.

Categorization trials.

In Experiment 2, there was no need for alternating trials between Top and Bottom to help participants understand that feedback applied to a former trial. Therefore, all trials were presented at screen center. The collection of responses from participants was exactly as already described. Trials continued for 15 blocks of 20 trials.

Reinforcement regimens.

In the CI and PI conditions, immediate reinforcement was delivered as already described. Now, though, we chose a different approach to defer reinforcement and disrupt the time-locked stimulus-response-reinforcement cycle. Participants saw and responded to six stimuli in succession without receiving any feedback. Then they received summary feedback as follows. In successive screen messages, they were told about all of their correct responses in the block as their Total Points tally was also incremented. Then, they were told about all of their incorrect responses in the block as their Total Points tally was also decremented (Figure 7). Each error message was accompanied by a brief penalty timeout. Under these conditions, stimuli and responses were separated in time and by intervening trials. Outcomes were scrambled away from the original order of presentation. There was no way to know which precise trials had been answered correctly or not. There was no way to strengthen the associative bonds between particular stimuli and responses. We expected operant associative-learning mechanisms to be disabled. However, explicit-conceptual learning processes might not be disabled. Participants could hold their working hypothesis in memory and wait for the summary feedback to come, judging then appropriately whether that hypothesis or rule was working out for them.
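The way this regimen breaks the link between individual trials and their outcomes can be sketched in a few lines. The function below is a hypothetical illustration: the point values (one point gained or lost per trial) and the message labels are assumptions, not details taken from the experiment.

```python
def summary_feedback(outcomes, start_points=0):
    """Build the end-of-block feedback sequence for one deferred block.

    outcomes: list of booleans in presentation order (True = correct).
    Returns (messages, total_points). All correct outcomes are reported
    first and all errors afterward, so the report order no longer
    matches the order in which the trials were presented.
    ASSUMPTION: one point is gained or lost per trial; labels are made up.
    """
    n_correct = sum(outcomes)
    n_errors = len(outcomes) - n_correct
    points = start_points
    messages = []
    for _ in range(n_correct):
        points += 1
        messages.append(("Correct", points))
    for _ in range(n_errors):
        points -= 1
        messages.append(("Incorrect (timeout)", points))
    return messages, points

block = [True, False, True, True, False, True]
messages, total = summary_feedback(block)
print(messages)
print(total)
```

Any two blocks with the same number of correct responses produce identical feedback, which is why the summary carries no information about which particular trials were answered correctly.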

Figure 7. Trial and feedback structure in the deferred condition in Experiment 2.

Instructions: immediate conditions.

These instructions were as already described, except that participants were told they would categorize pairs of circles as Category A or B.

Instructions: deferred conditions.

These instructions were the same, except that their feedback instructions suited their deferred reinforcement regimen. They were told they would be given summary feedback every six trials.

Analytic methods: questionnaires.

At the end of the experiment participants filled out the questionnaire described in Experiment 1. The questionnaire items were independently blind coded by two research assistants. They were instructed to give a questionnaire a 1 if the participant said “A’s were the same and B’s different” (the correct verbalization of the conceptual task), or a 2 if they said the equivalent of “A’s were smallest with medium large or medium small with largest, B’s were smallest with largest or the two mediums” (the correct verbalization of the perceptual task). Again, this was simply a description of the task’s four stimuli. Codes 3 and 4 were used for explanations like those for codes 1 and 2 but with only one type of stimulus correctly described. Code 5 was used if the person said they memorized the A’s and B’s and code 6 for anything else. A third rater provided backup ratings that were used only to resolve the rare disagreements between the two students, who were blind to each other’s ratings and also blind to the level of performance achieved by the participant who completed the questionnaire. Interrater reliability between the first two raters was 87.7% agreement, Cohen’s κ = .801. This approach to coding reports, using a third rater to break ties, was somewhat conservative regarding testing our hypotheses for the reason already described. Finally, a fourth blind rater coded approximately 25% of the data, and showed a concordance of 90.70% agreement, Cohen’s κ = .845, with the final outcome of the coding process.
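For readers unfamiliar with the two reliability measures, the following sketch shows how percent agreement and Cohen's κ are computed from two raters' codes. The ratings below are invented for illustration and are not the study's data; the function names are hypothetical.

```python
from collections import Counter

def percent_agreement(rater1, rater2):
    """Proportion of items given the same code by both raters."""
    return sum(a == b for a, b in zip(rater1, rater2)) / len(rater1)

def cohens_kappa(rater1, rater2):
    """Agreement corrected for the agreement expected by chance."""
    n = len(rater1)
    p_observed = percent_agreement(rater1, rater2)
    counts1, counts2 = Counter(rater1), Counter(rater2)
    # Chance agreement: product of the raters' marginal proportions per code.
    p_expected = sum(counts1[code] * counts2[code] for code in counts1) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical codes (1 to 6) assigned to ten questionnaires.
rater1 = [1, 1, 2, 5, 6, 1, 3, 6, 1, 2]
rater2 = [1, 1, 2, 5, 6, 1, 4, 6, 1, 6]

print(percent_agreement(rater1, rater2))
print(round(cohens_kappa(rater1, rater2), 3))
```

Because κ discounts agreement that would arise from the raters' base rates alone, it is always lower than raw percent agreement unless agreement is perfect.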

Results

Figure 8 presents performance for the four conditions by 20-trial block. Table 2 presents the means and standard deviations of overall accuracy and terminal performance (last 100 trials). The former data were entered into a three-way GLM with task (conceptual, perceptual) and reinforcement (immediate, deferred) as between-participant factors and block as a within-participant factor. The significant main effects of task, F (1, 167) = 92.483, p < .001, ηp2 = .356, and reinforcement, F (1, 167) = 131.320, p < .001, ηp2 = .440, showed that participants performed better in the conceptual task and with immediate reinforcement. The significant main effect of trial block, F (14, 2338) = 20.280, p < .001, ηp2 = .108, confirmed that learning occurred. The interaction between task and reinforcement, F (1, 167) = 3.459, p = .065, ηp2 = .020, was not significant, but the significant interactions between block and reinforcement, F (14, 2338) = 3.330, p < .001, ηp2 = .020, and the three-way interaction, F (14, 2338) = 4.232, p < .001, ηp2 = .025, suggested that reinforcement condition differentially affected learning across tasks. The nonsignificant interaction between task and block, F < 2, suggested that the learning trajectory was similar in the two tasks.
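The reported effect sizes can be checked against the F ratios using the standard identity ηp2 = (F × df effect) / (F × df effect + df error). The short sketch below applies it to three of the effects reported above; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

eta_task = partial_eta_squared(92.483, 1, 167)            # main effect of task
eta_reinforcement = partial_eta_squared(131.320, 1, 167)  # main effect of reinforcement
eta_block = partial_eta_squared(20.280, 14, 2338)         # main effect of block

for label, value in (("task", eta_task),
                     ("reinforcement", eta_reinforcement),
                     ("block", eta_block)):
    print(label, round(value, 3))
```

The same check can be applied to any of the F tests in this section as a quick consistency test on the degrees of freedom.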

Figure 8. Proportion correct across 20 trial blocks in the four conditions.

Note: CI = Conceptual Immediate, CD = Conceptual Deferred, PI = Perceptual Immediate, PD = Perceptual Deferred.

Table 2.

Means and standard deviations (SD) of proportion correct across all trials and in the last 100 trials in each condition

Condition               N    Mean (all trials)   SD (all trials)   Mean (last 100)   SD (last 100)
Conceptual Immediate    46   .955                .048              .976              .032
Conceptual Deferred     43   .683                .199              .737              .228
Perceptual Immediate    42   .721                .158              .787              .186
Perceptual Deferred     40   .525                .066              .543              .105

Again, to clarify the three-way interaction, we conducted three two-factor GLMs with task and reinforcement as between-participant factors. One used the difference score between the first and last block as the dependent variable to encompass the whole learning trajectory. The other two used the first and last blocks as the dependent measure to explore the early or late locus of these differences. Examining learning scores, there was a significant main effect of reinforcement, F (1, 167) = 7.498, p < .01, ηp2 = .043, reflecting greater learning with immediate reinforcement. The crucial significant interaction, F (1, 167) = 8.202, p < .01, ηp2 = .047, reflected sharply reduced learning in the PD condition (all other Fs < 1). Planned comparisons found that learning in the immediate and deferred reinforcement conditions was not statistically distinguishable in the conceptual task, t (87) = .085, p = .933, d = .018, but was in the perceptual task, t (80) = 4.247, p < .001, d = .941. Separate analyses of the first and last blocks found significant main effects of task, F (1, 167) = 44.147, p < .001, ηp2 = .209, F (1, 167) = 41.445, p < .001, ηp2 = .199 (first, last block respectively) and reinforcement, F (1, 167) = 35.049, p < .001, ηp2 = .173, F (1, 167) = 71.603, p < .001, ηp2 = .300, indicating that the conceptual task and immediate reinforcement produced better performance both early and late in learning. There was a significant interaction between task and reinforcement only in the first block, F (1, 167) = 13.054, p < .001, ηp2 = .073 (last block: F < 2), reflecting bigger differences in initial learning between the reinforcement conditions in the conceptual condition, but similar differences between types of reinforcement in final learning. Planned comparisons of immediate versus deferred reinforcement in each task (Conceptual and Perceptual respectively) during the first, t (87) = 5.968, p < .001, d = 1.263, t (80) = 1.997, p = .049, d = .442, and the last blocks, t (87) = 6.062, p < .001, d = 1.265, t (80) = 5.902, p < .001, d = 1.308, supported this interpretation.

Using the same learning criterion as in Experiment 1, we found that in the four conditions there were 45, 18, 16, and 0 learners. Chi square analyses found that the proportion of learners in the four conditions is significantly different from expected learning levels if there were no relationship between condition and learning, χ2 (3, N = 171) = 85.111, p < .001; w = .705.
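This test can be reconstructed from the learner counts and the group sizes reported in the Method. The sketch below assumes a standard Pearson chi-square test of independence on the 4 × 2 table of learners and nonlearners, with w computed as the square root of χ2 / N; the function name is hypothetical.

```python
from math import sqrt

# Counts taken from the text, in the order CI, CD, PI, PD.
learners = [45, 18, 16, 0]
group_n = [46, 43, 42, 40]

def chi_square_independence(successes, totals):
    """Pearson chi-square for a (groups x 2) table of successes and failures.

    Returns (chi2, degrees of freedom, effect size w).
    """
    n = sum(totals)
    p_success = sum(successes) / n  # pooled proportion under independence
    chi2 = 0.0
    for s, t in zip(successes, totals):
        cells = ((s, t * p_success), (t - s, t * (1 - p_success)))
        for observed, expected in cells:
            chi2 += (observed - expected) ** 2 / expected
    df = len(totals) - 1
    w = sqrt(chi2 / n)
    return chi2, df, w

chi2, df, w = chi_square_independence(learners, group_n)
print(round(chi2, 2), df, round(w, 3))
```

Under these assumptions the computation agrees with the reported statistic and effect size to rounding.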

These learners then let us study the path by which participants reached criterion (Figure 9). The performance of 45 learners in the CI condition transformed again at Block 0, with a .383 increase in performance (.598 correct, 5 blocks before criterion to .980, 5 blocks after criterion, Figure 9A—black circles). Rule discovery, not gradually strengthening associative connections, is consistent with this suddenness.

Figure 9. Backwards learning curves for learners in Experiment 2’s three conditions that produced learners reaching criterion.

Note: A. CI and CD conditions. B. PI condition. Trial blocks are aligned at the point participants reached criterion (Block 0). C = Conceptual, P = Perceptual, I = Immediate Reinforcement, D = Deferred Reinforcement.

As in Experiment 1, deferred reinforcement made hypothesis testing more difficult. There were 18 learners in the CD condition, not 45. It is expectable for two reasons that deferred reinforcement had a greater effect on learning in E2 (41.86% learners) compared to E1 (74.19% learners). First, the working memory demands required to hold on to choices while awaiting deferred feedback probably interfere more with the discovery of abstract relational concepts like same/different than with the discovery of basic perceptual concepts like square/rectangle. This would be consistent with recent research showing that concurrent working memory load interferes with participants’ transition from perceptually based matching to conceptual same/different matching in a relational matching-to-sample task (Smith et al., 2019). Second, in E2, hypothesis testing had to operate over a deferral of 6 trials in a block, not just one lagged trial as in E1.

Despite these effects, we expected to see the same saltatory leap of rule discovery among the successful learners. And we observed just that (a transition of .467: .519 correct in the five blocks before criterion, .986 correct in the five blocks after criterion, Figure 9A, grey triangles). The transition at criterion was especially strong here, possibly because deferred reinforcement undercuts associative-learning processes, leaving the participant with only conceptual discovery.

We expected less learning in the PI condition because, even though the stimuli were placed identically in stimulus/perceptual space, the placement did not provide an intuitive conceptual route to performance. But we expected associative-perceptual learning to still occur, by developing stimulus-response linkages or by exemplar memorization (but likely not by learning the formal XOR task structure or two coherent, differentiable categories). In fact, we did have 16 learners in this condition. Their learning clearly obeyed a different dynamic (Figure 9B, black circles). There was a markedly gradual improvement in performance, as befits some gradual learning process. At criterion, there was a performance change of .125 between the pre-criterion block, .816, and criterion, .941 (.786 correct in the five blocks before criterion, .958 correct in the five blocks after criterion, a transition of .172). This change was very small and perhaps the result of the sampling constraints already described in Experiment 1. Again, this pattern was consistent with gradual, associative learning. The difference in the suddenness of learning can be seen clearly in the very different backwards curves producing similar overall performance in the CD and PI conditions (compare Figures 9A and 9B). Also, statistical comparisons of the criterion scores (Block 0 minus Block −1)5 for learners found that when compared to the PI condition both the CI condition, t(61) = 3.740, p < .001, d = 1.254, and the CD condition, t(32) = 6.292, p < .001, d = 1.3, showed significantly greater jumps in learning.

There were no learners in the PD condition. The deferred reinforcement disabled associative learning processes. The translation of the task through stimulus space undermined the task’s conceptual availability. Learning should have been doubly hamstrung in this condition, and it was.

We examined participants’ verbalizations as before. In the conceptual conditions, 61 of 89 participants reported the task’s conceptual grounding (68%). In the CI condition, with 46 total participants, with 45 strong learners, 42 participants (all learners) reported the task’s conceptual grounding. In the CD condition, with 43 total participants, with 18 learners, 19 participants (18 learners and one participant whose last block was perfect but who still did not meet criterion) reported that grounding. Here, too, there was a strong concordance between learning and a conceptual declaration. In both conditions, participants’ declarative category knowledge accounted for their performance and for the suddenness of their learning. On the other hand, only 6 of the 82 participants in the perceptual tasks (7%, a percentage nine times less) were coded to have correctly described the perceptual condition—that is, they correctly described the four stimuli presented in the task. Of those 6, 2 showed their knowledge through drawing, not verbalizing. All came from the group of 16 learners in the immediate reinforcement condition. Overall, just as in Experiment 1, the results are only consistent with the explanation that explicit-declarative learning uniquely characterized and transformed the conceptual conditions.

General Discussion

The Difficulty Hypothesis

Through a 20-year theoretical debate, the single-system idea has depended on an amorphous construct of task difficulty. It proposes that seemingly multiple-system results can be explained by assuming a unitary learning system, close to the associative-memory system described here, with some tasks just harder than others. Unfortunately, the difficulty hypothesis in categorization is untenable. The descriptor “difficult” has no psychological meaning unless one understands the processes brought to a task. And single-system proponents have not defined the construct of difficulty in any principled processing manner that transcends case-by-case convenience (Ashby et al., 2020).

The present results represent additional failures of the unitary hypothesis. First, single-system models cannot accommodate the performance leaps that participants showed in our conceptual tasks (Smith & Ell, 2015). Participants adopted instead insightful rule discovery. Second, single-system accounts cannot explain the qualitative shift of task knowledge from the tacit and behavioral to the conscious and aware. A second learning process resident in working consciousness can explain this shift. In fact, Smith et al. (2019) showed they could disrupt this learning process using working-memory interference in a relational task like that used in Experiment 2. Third, a single system cannot explain the transition from concrete-perceptual to abstract-conceptual information. It has no second register of abstract nodes overlaying perceptual nodes. Fourth, single-system accounts cannot explain the task-selective use of declarative language shown here. Fifth, single-system accounts cannot explain why conceptual learning selectively survives deferred reinforcement. Deferred reinforcement is a powerful block to many kinds of associative learning (stimulus-response, stimulus-stimulus, stimulus-reinforcement, etc.). Clearly, participants have swapped in some qualitatively different learning process that can reflect on feedback dislocated in time and rearranged out of trial-by-trial order.

To force these qualitative results to fit a single-system model, one would have to collapse together the implicit and the explicit, the conscious and the automatic, the reflexive and the reflective, the verbal and the behavioral, probably the striatal and the pre-frontal, and the reinforcement dependent and independent. By doing so, one would crush together the principal diametrical oppositions in human cognition into one vague mass, slowing theory development and empirical progress. Therefore, we believe it is time for the field to adopt a disciplined multiple-systems understanding of humans’ performance in discrimination, classification, and categorization tasks.

Human Knowledge and Unitary Description

The present findings reach beyond perceptual categorization toward the literature on category naturalness and structural constraints on category learning (e.g., Medin & Schwanenflugel, 1981; Medin & Wattenmaker, 1987; Wattenmaker et al., 1986). This research showed that humans are flexible category learners, sometimes succeeding in learning poorly structured categories, XOR categories, even random categories. This research, like the present research, showed that humans can use their knowledge (e.g., of causality, illness, and so forth) in support of category learning. We credit this research, in no way claiming it is our distinctive insight that humans bring conceptual knowledge to the task of categorization.

However, the focus in that area was category naturalness and the search for structural constraints on category learnability. Research interpretations focused there. Medin and his colleagues found that interactive-cue models let one account for many learning flexibilities. They introduced the theoretical narrative that this kind of processing system might serve much of categorization. There might be no need for envisioning different categorization systems or brain loci. All categories might be equally learnable through the process of configural cue encoding or exemplar memorization. This theoretical narrative naturally suggested that categorization is unitary. By this path, exemplar theory and exemplar models emerged and dominated the literature for 25 years.

It is ironic that the research on humans’ learning flexibility was redirected to support the unitary description. The present research helps repair this narrative. Our research shows, by several converging cognitive dissociations, that humans recruit the knowledge structures that transform XOR tasks by recruiting higher levels of explicit-declarative categorization. Seen in that light, the beautiful theory- and knowledge-based work of Medin and others supports the existence of multiple systems in categorization.

The Content of Explicit Categorization

Our research offers clues about the building blocks of explicit-conceptual categories. First, explicit categorization is especially linked to the learning of category rules, particularly verbalizeable category rules. Second, it is linked to category content that transcends the level of concrete perceptual appearance. Third, it is linked to the use of higher-level relational cues, like Same/Different. Illustrating this point, the relation of Sameness can apply equally no matter whether color, shape, size, or any other perceptual features are shared. A perceptual associative system can gain no traction on an appearance-transcending relation of this kind. Fourth, it is more of an executive function, a conclusion strengthened by the demonstration that a concurrent working-memory load interferes with relational learning (Smith et al., 2019). It seems to us that all of these—the abstractive, the verbal, the rule-based, the relational, the executive—could be components of a system that builds explicitly more conceptual, theory-based concepts. Indeed, explicit cognition could well be the preferred place for conceptual, theory-based categorization.

We are not alone in raising these possibilities. Developmental, cognitive, and comparative psychologists have all considered explicit cognition, relational concepts, and language as closely allied (e.g., Christie & Gentner, 2014; Grafman & Litvan, 1999; Halford et al., 2010; Hummel & Holyoak, 2003). The possible bridges between the perceptual and conceptual levels of cognitive functioning have also been carefully analyzed (Penn et al., 2008). But, regarding categorization, the realization of a multiple-systems perspective serves an especially important role, for it knits together two branches of the categorization literature (perceptual and theory-based categorization) that have remained mostly separate.

Why Multiple Category Systems?

Explicit and implicit learning systems might play complementary roles in cognition. For example, humans’ operant associative learning has strengths and weaknesses. It produces stable, adaptive behavior shaped by contingencies. It is permastored and lasting. It is attainable by many species, because it need not depend on selective attention, conceptual abstraction, or consciousness. However, operant associative learning is dependent on immediate reinforcement and extensive trial repetition. It learns slowly and incrementally. It is inertial—the unlearning of even destructive habits can be extraordinarily difficult. Learning cannot occur off-line or with displacement in time or space from the task’s trials. New approaches cannot be chosen, or old approaches dropped, in a crisis. This learning steers like the Queen Mary.

Explicit learning is a perfect complement to implicit learning because it turns on a dime. It does not depend on immediate reinforcement or event repetition. Learning can occur off-line and with displacement. Learning and unlearning can occur suddenly at need. Explicit learning is superior when the organism lacks conditioned responses and trained habits. It is crucial in crises, when the organism has to answer the question: So, what do I do now? Thus, we believe that implicit and explicit learning may both confer behavioral fitness in their own ways, to the point that other vertebrate lines could have evolved their own version of multiple learning systems.

Comparative studies

Accordingly, one considers an evolutionary perspective toward multiple systems. How have these systems emerged during evolution, and in which vertebrate lines? This intriguing evolutionary story shows once more the theoretical power of the multiple-systems description.

In the most sophisticated relational task, the Relational Matching to Sample (RMTS) task (e.g., Smith et al., 2013), humans and some apes perform successfully. It is meaningful that the successful apes are those who have received proto-language training (e.g., Premack, 1976; Thompson et al., 1997). Thus “language” enters the evolutionary story. Monkeys have failed in RMTS tasks (e.g., Smith et al., 2013) or shown glimmers of success given dogged training (e.g., Fagot & Thompson, 2011). It is meaningful that in Maugard et al. (2013), baboons performing RMTS tasks were disrupted by working-memory interference, as humans were in Smith et al. (2019). Thus, executive processing enters the evolutionary story. In less complicated relational tasks, with content like that in Experiment 2, monkeys are robustly successful (e.g., Shields et al., 1997), but pigeons are not. Pigeons mainly fail in the RMTS task and in true pairwise Same/Different paradigms (Young et al., 1997), though sometimes they can find low-level cues like visual calmness/busyness with which to buttress their performance.

The same evolutionary story is told by the rule-based category tasks of the RB-II literature (see Castro et al., 2020; Qadri et al., 2019; Smith, Ashby, et al., 2011; Smith et al., 2012). The same evolutionary story is told by the extensive comparative literature on cognitive self-awareness or metacognition (see Zakrzewski et al., 2017, for review).

This comparative research is illuminated by the multiple-systems description. The species progression—pigeons, monkeys, apes, humans—is explained if one acknowledges two learning systems: a basic, operant associative-learning process shared across the species (learning theorists have accepted this basic process for 100 years), and an explicit-conceptual system that is also emerging, especially in the primate line (judging by the limits on current research). This is why Church et al. (2020; also Smith & Church, 2021) proposed that the multiple-systems idea may be the most powerful explanatory tool today in comparative psychology, for it explains dozens of findings about humans and animals in discrimination and classification tasks.

In fact, it is an interesting question whether macaques could appreciate the conceptual affordances of the XOR tasks adopted here. Macaques generally find XOR category structures problematic because they must transcend perceptual appearance to put dissimilar objects together in the same category (e.g., Smith, Coutinho, et al., 2011). However, their performance might be facilitated if one gave them higher-level affordances as we did humans here. Of course, macaques would have to appreciate these affordances in a languageless way, providing a test of the necessity of language for this purpose.

Here one sees once more that the hypothesis of multiple learning systems is richly productive of new directions for theoretical growth and empirical investigation. Thus, for many reasons, including the new dissociative findings between explicit and implicit learning demonstrated here, we hope this article helps foster the field’s consensus that multiple systems of category learning probably underlie performance in humans and possibly other species, too. For we believe this dissociative framework has great potential to accelerate theoretical and empirical progress in our area, just as it transformed theory and research in memory—also after a long and sharp debate.

Supplementary Material

open disclosure form

Acknowledgments

The preparation of this publication was supported by the Eunice Kennedy Shriver National Institute of Child Health & Human Development of the National Institutes of Health under Award Number R01HD093690. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The authors declare no financial interest and no conflicts of interest. The data, analyses, and questionnaires can be viewed on the Open Science Framework (Church, 2020) using the following link: https://osf.io/ystr9/?view_only=211b3682a92146d481f0d95e11067137

Footnotes

1

Power analyses using effect size estimates based on Smith et al. (2018) suggested that .95 power could be achieved for all sub-analyses with a sample of 148. A sample size of only 40 would achieve that power level for the crucial three-way interaction.

2

In follow-up analyses, all GLM analyses were repeated including the participants who had been dropped for lacking a questionnaire, to confirm that the added power would not change the basic pattern of findings.

3

Criterion scores were calculated by subtracting the proportion correct in block −1 from block 0. If block −1 did not exist, we used .5 (chance performance) to stand in for the missing block −1.
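The calculation described in this footnote can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' analysis code: the function name `criterion_score`, the constant `CHANCE`, and the example proportions are hypothetical and appear nowhere in the original materials.

```python
from typing import Optional

# Chance performance, used as the stand-in baseline when block -1
# does not exist (assumed name; the footnote gives only the value .5).
CHANCE = 0.5


def criterion_score(block_0: float, block_minus_1: Optional[float] = None) -> float:
    """Return proportion correct in block 0 minus proportion correct in block -1.

    If block -1 does not exist (passed as None), chance performance (.5)
    stands in for the missing block.
    """
    baseline = CHANCE if block_minus_1 is None else block_minus_1
    return block_0 - baseline


# Hypothetical example values, for illustration only.
with_prior_block = criterion_score(0.9, 0.6)   # block -1 exists
without_prior_block = criterion_score(0.9)     # block -1 missing, .5 is used
```

A larger score indicates a bigger jump in accuracy from the preceding block to block 0.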

4

Power analyses using effect size estimates based on Smith et al. (2014) suggested that .95 power could be achieved for all sub-analyses of interest with a sample of 152. A sample of 52 would achieve that power for the crucial three-way interaction.

5

Criterion scores were calculated by subtracting the proportion correct in block −1 from block 0. If block −1 did not exist, we used .5 (chance performance) to stand in for the missing block −1.

References

  1. Alexander GE, DeLong MR, & Strick PL (1986). Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annual Review of Neuroscience, 9, 357–381. 10.1146/annurev.ne.09.030186.002041 [DOI] [PubMed] [Google Scholar]
  2. Ashby FG, Alfonso-Reese LA, Turken AU, & Waldron EM (1998). A neuropsychological theory of multiple systems in category learning. Psychological Review, 105, 442–481. 10.1037/0033-295X.105.3.442 [DOI] [PubMed] [Google Scholar]
  3. Ashby FG, & Ell SW (2001). The neurobiology of human category learning. Trends in Cognitive Sciences, 5, 204–210. 10.1016/S1364-6613(00)01624-7 [DOI] [PubMed] [Google Scholar]
  4. Ashby FG, & Ennis JM (2006). The role of the basal ganglia in category learning. In Ross BH (Ed.), The psychology of learning and motivation: Advances in research and theory (Vol. 46, pp. 1–36). Academic Press. 10.1016/S0079-7421(06)46001-1 [DOI] [Google Scholar]
  5. Ashby FG, & Maddox WT (2011). Human category learning 2.0. Annals of the New York Academy of Sciences, 1224, 147–161. 10.1111/j.1749-6632.2010.05874.x [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Ashby FG, Smith JD, & Rosedahl LA (2020). Dissociations between rule-based and information-integration categorization are not caused by differences in task difficulty. Memory & Cognition, 48, 541–552. 10.3758/s13421-019-00988-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Ashby FG, & Valentin VV (2017). Multiple systems of perceptual category learning: Theory and cognitive tests. In Cohen H and Lefebvre C (Eds.), Handbook of categorization in cognitive science (2nd ed., pp. 157–188). Elsevier. 10.1016/B978-0-08-101107-2.00007-5 [DOI] [Google Scholar]
  8. Barnes TD, Kubota Y, Hu D, Jin DZ, & Graybiel AM (2005). Activity of striatal neurons reflects dynamic encoding and recoding of procedural memories. Nature, 437, 1158–1161. 10.1038/nature04053 [DOI] [PubMed] [Google Scholar]
  9. Brooks LR (1978). Nonanalytic classification and memory for instances. In Rosch E & Lloyd B (Eds.), Cognition and categorization (pp. 169–211). Erlbaum. [Google Scholar]
  10. Brown RG, & Marsden CD (1988). Internal versus external cues and the control of attention in Parkinson’s disease. Brain, 111, 323–345. 10.1093/brain/111.2.323 [DOI] [PubMed] [Google Scholar]
  11. Castro L, Savic O, Navarro V, Sloutsky VM, & Wasserman EA (2020). Selective and distributed attention in human and pigeon category learning. Cognition, 204, Epub 10.1016/j.cognition.2020.104350 [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Christie S, & Gentner D (2014). Language helps children succeed on a classic analogy task. Cognitive Science, 38, 383–397. 10.1111/cogs.12099 [DOI] [PubMed] [Google Scholar]
  13. Church BA (2020, October 6). DATA Conceptual anchoring dissociates implicit and explicit category learning. Retrieved from osf.io/ystr9. 10.17605/OSF.IO/YSTR9 [DOI] [PMC free article] [PubMed]
  14. Church BA, Jackson BN, & Smith JD (2020, in press). Dissociable learning processes: A comparative perspective. In Krause M, Hollis KL, & Papini MR (Eds.), Evolution of learning and memory mechanisms. Cambridge University Press. [Google Scholar]
  15. Fagot J, & Thompson RKR (2011). Generalized relational matching by guinea baboons (Papio papio) in two-by-two-item analogy problems. Psychological Science, 22, 1304–1309. 10.1177/0956797611422916 [DOI] [PubMed] [Google Scholar]
  16. Feldman J (2000). Minimization of Boolean complexity in human concept learning. Nature, 407, 630–633. 10.1038/35036586 [DOI] [PubMed] [Google Scholar]
  17. Fuster JM (Ed.). (1989). The prefrontal cortex (2nd ed.). Lippincott-Raven. [Google Scholar]
  18. Goldman-Rakic PS (1987). Circuitry of the prefrontal cortex and the regulation of behavior by representational knowledge. In Plum F & Mountcastle V (Eds.), Handbook of physiology (pp. 373–417). American Physiological Society. 10.1002/cphy.cp010509 [DOI] [Google Scholar]
  19. Grafman J, & Litvan I (1999). Importance of deficits in executive functions. Lancet, 354, 1921–1923. 10.1016/S0140-6736(99)90438-5 [DOI] [PubMed] [Google Scholar]
  20. Halford GS, Wilson WH, & Phillips S (2010). Relational knowledge: The foundation of higher cognition. Trends in Cognitive Sciences, 14, 497–505. 10.1016/j.tics.2010.08.005 [DOI] [PubMed] [Google Scholar]
  21. Hummel JE, & Holyoak KJ (2003). A symbolic-connectionist theory of relational inference and generalization. Psychological Review, 110, 220–264. 10.1037/0033-295x.110.2.220 [DOI] [PubMed] [Google Scholar]
  22. Knowlton BJ, & Squire LR (1993). The learning of categories: Parallel brain systems for item memory and category knowledge. Science, 262, 1747–1749. 10.1126/science.8259522 [DOI] [PubMed] [Google Scholar]
  23. Knowlton BJ, Mangels JA, & Squire LR (1996). A neostriatal habit learning system in humans. Science, 273, 1399–1402. 10.1126/science.273.5280.1399 [DOI] [PubMed] [Google Scholar]
  24. Le Pelley ME, Newell BR, & Nosofsky RM (2019). Deferred feedback does not dissociate implicit and explicit category learning systems: Commentary on Smith et al. (2014). Psychological Science, 30, 1403–1409. 10.1177/0956797619841264 [DOI] [PubMed] [Google Scholar]
  25. Maddox WT, & Ashby FG (2004). Dissociating explicit and procedural-learning based systems of perceptual category learning. Behavioural Processes, 66, 309–332. 10.1016/j.beproc.2004.03.011 [DOI] [PubMed] [Google Scholar]
  26. Maddox WT, Ashby FG, & Bohil CJ (2003). Delayed feedback effects on rule-based and information-integration category learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 650–662. 10.1037/0278-7393.29.4.650 [DOI] [PubMed] [Google Scholar]
  27. Maddox WT, & Ing AD (2005). Delayed feedback disrupts the procedural-learning system but not the hypothesis testing system in perceptual category learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 100–107. 10.1037/0278-7393.31.1.100 [DOI] [PubMed] [Google Scholar]
  28. Maugard A, Marzouki Y, & Fagot J (2013). Contribution of working memory processes to relational matching-to-sample performance in baboons (Papio papio). Journal of Comparative Psychology, 127, 370–379. 10.1037/a0032336 [DOI] [PubMed] [Google Scholar]
  29. Medin DL, & Schwanenflugel PJ (1981). Linear separability in classification learning. Journal of Experimental Psychology: Human Learning and Memory, 7, 355–368. 10.1037/0278-7393.7.5.355 [DOI] [Google Scholar]
  30. Medin DL, & Wattenmaker WD (1987). Category cohesiveness, theories and cognitive archeology. In Neisser U (Ed.), Concepts and conceptual development: Ecological and intellectual factors in categorization (pp. 25–62). Cambridge University Press. [Google Scholar]
  31. Mishkin M, Malamut B, & Bachevalier J (1984). Memories and habits: Two neural systems. In Lynch G, McGaugh JL, & Weinberger NM (Eds.), Neurobiology of human learning and memory (pp. 65–88). The Guilford Press. [Google Scholar]
  32. Murphy GL (2002). The big book of concepts. MIT Press. [Google Scholar]
  33. Murphy GL, & Medin DL (1985). The role of theories in conceptual coherence. Psychological Review, 92, 289–316. 10.1037/0033-295X.92.3.289 [DOI] [PubMed] [Google Scholar]
  34. Newell BR, Dunn JC, & Kalish M (2010). The dimensionality of perceptual category learning: A state-trace analysis. Memory & Cognition, 38, 563–581. 10.3758/MC.38.5.563 [DOI] [PubMed] [Google Scholar]
  35. Nosofsky RM (1986). Attention, similarity, and the identification-categorization relationship. Journal of Experimental Psychology: General, 115, 39–57. 10.1037//0096-3445.115.1.39 [DOI] [PubMed] [Google Scholar]
  36. Nosofsky RM, Gluck MA, Palmeri TJ, McKinley SC, & Glauthier P (1994). Comparing models of rule-based classification learning: A replication and extension of Shepard, Hovland, and Jenkins (1961). Memory & Cognition, 22, 352–369. 10.3758/BF03200862 [DOI] [PubMed] [Google Scholar]
  37. Nosofsky RM, Stanton RD, & Zaki SR (2005). Procedural interference in perceptual classification: Implicit learning or cognitive complexity? Memory & Cognition, 33, 1256–1271. 10.3758/bf03193227 [DOI] [PubMed] [Google Scholar]
  38. O’Doherty J, Dayan P, Schultz J, Deichmann R, Friston K, & Dolan RJ (2004). Dissociable roles of ventral and dorsal striatum in instrumental conditioning. Science, 304, 452–454. 10.1126/science.1094285 [DOI] [PubMed] [Google Scholar]
  39. Penn DC, Holyoak KJ, & Povinelli DJ (2008). Darwin’s mistake: Explaining the discontinuity between human and nonhuman minds. Behavioral and Brain Sciences, 31, 109–178. 10.1017/s0140525x08003543 [DOI] [PubMed] [Google Scholar]
  40. Posner MI, & Petersen SE (1990). The attention systems in the human brain. Annual Review of Neuroscience, 13, 25–42. 10.1146/annurev.ne.13.030190.000325 [DOI] [PubMed] [Google Scholar]
  41. Premack D (1976). Intelligence in ape and man. Lawrence Erlbaum. [Google Scholar]
  42. Qadri MAJ, Ashby FG, Smith JD, & Cook RG (2019). Testing analogical rule transfer in pigeons (Columba livia) and humans (Homo sapiens). Cognition, 183, 256–268. 10.1016/j.cognition.2018.11.011 [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Robinson AL, Heaton RK, Lehman RA, & Stilson DW (1980). The utility of the Wisconsin Card Sorting Test in detecting and localizing frontal lobe lesions. Journal of Consulting and Clinical Psychology, 48, 605–614. 10.1037/0022-006X.48.5.605 [DOI] [PubMed] [Google Scholar]
  44. Rolls ET (1994). Neurophysiology and cognitive functions of the striatum. Revue Neurologique, 150, 648–660. [PubMed] [Google Scholar]
  45. Seger CA, & Cincotta CM (2005). The roles of the caudate nucleus in human classification learning. Journal of Neuroscience, 25, 2941–2951. 10.1523/JNEUROSCI.3401-04.2005 [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Seger CA, & Miller EK (2010). Category learning in the brain. Annual Review of Neuroscience, 33, 203–219. 10.1146/annurev.neuro.051508.135546 [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Shepard RN, Hovland CI, & Jenkins HM (1961). Learning and memorization of classifications. Psychological Monographs, 75, 1–42. 10.1037/h0093825 [DOI] [Google Scholar]
  48. Shields WE, Smith JD, & Washburn DA (1997). Uncertain responses by humans and rhesus monkeys (Macaca mulatta) in a psychophysical same-different task. Journal of Experimental Psychology: General, 126, 147–164. 10.1016/S0010-0277(96)00726-3 [DOI] [PubMed] [Google Scholar]
  49. Smith JD, Ashby FG, Berg ME, Murphy MS, Spiering B, Cook RG, & Grace RC (2011). Pigeons’ categorization may be exclusively nonanalytic. Psychonomic Bulletin & Review, 18, 414–421. 10.3758/s13423-010-0047-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Smith JD, Berg ME, Cook RG, Boomer J, Crossley MJ, Murphy MS, Spiering B, Beran MJ, Church BA, Ashby FG, & Grace RC (2012). Implicit and explicit categorization: A tale of four species. Neuroscience and Biobehavioral Reviews, 36, 2355–2369. 10.1016/j.neubiorev.2012.09.003 [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Smith JD, Boomer J, Zakrzewski AC, Roeder J, Church BA, & Ashby FG (2014). Deferred feedback sharply dissociates implicit and explicit category learning. Psychological Science, 25, 447–457. 10.1177/0956797613509112 [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Smith JD, & Church BA (2021). A dissociative framework for understanding same-different conceptualization. Current Opinion in Behavioral Sciences, 37, 13–18. 10.1016/j.cobeha.2020.06.004 [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Smith JD, Coutinho MVC, & Couchman JJ (2011). The learning of exclusive-or categories by monkeys (Macaca mulatta) and humans (Homo sapiens). Journal of Experimental Psychology: Animal Behavior Processes, 37, 20–29. 10.1037/a0019497 [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Smith JD, & Ell S (2015). One giant leap for categorizers: One small step for categorization theory. PLOS One, 23. 10.1371/journal.pone.0137334 [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Smith JD, Flemming TM, Boomer J, Beran MJ, & Church BA (2013). Fading perceptual resemblance: A path for rhesus macaques (Macaca mulatta) to conceptual matching? Cognition, 129, 598–614. 10.1016/j.cognition.2013.08.001 [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Smith JD, Jamani S, Boomer J, & Church BA (2018). One-back reinforcement dissociates implicit-procedural and explicit-declarative category learning. Memory & Cognition, 46, 261–273. 10.3758/s13421-017-0762-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Smith JD, Jackson BN, & Church BA (2019). Breaking the perceptual-conceptual barrier: Relational matching and explicit cognition. Memory & Cognition, 47, 544–560. 10.3758/s13421-018-0890-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Smith JD, Minda JP, & Washburn DA (2004). Category learning in rhesus monkeys: A study of the Shepard, Hovland, and Jenkins tasks. Journal of Experimental Psychology: General, 133, 398–414. 10.1037/0096-3445.133.3.398 [DOI] [PubMed] [Google Scholar]
  59. Smith JD, Redford JS, & Haas SM (2008). Prototype abstraction by monkeys (Macaca mulatta). Journal of Experimental Psychology: General, 137, 390–401. 10.1037/0096-3445.137.2.390 [DOI] [PubMed] [Google Scholar]
  60. Smith JD, Zakrzewski AC, Johnson JM, Valleau JC, & Church BA (2016). Categorization: The view from animal cognition. Behavioral Sciences, 6, 12. 10.3390/bs6020012 [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Thompson RKR, Oden DL, & Boysen ST (1997). Language naïve chimpanzees (Pan troglodytes) judge relations between relations in a conceptual matching-to-sample task. Journal of Experimental Psychology: Animal Behavior Processes, 23, 31–43. 10.1037/0097-7403.23.1.31 [DOI] [PubMed] [Google Scholar]
  62. Wattenmaker WD, Dewey GI, Murphy TD, & Medin DL (1986). Linear separability and concept learning: Context, relational properties and concept naturalness. Cognitive Psychology, 18, 158–194. 10.1016/0010-0285(86)90011-3 [DOI] [PubMed] [Google Scholar]
  63. Wickens J (1993). A theory of the striatum. Pergamon Press. [Google Scholar]
  64. Young ME, Wasserman EA, & Garner KL (1997). Effects of number of items on the pigeon’s discrimination of same from different visual displays. Journal of Experimental Psychology: Animal Behavior Processes, 23, 491–501. 10.1037/0097-7403.23.4.491 [DOI] [PubMed] [Google Scholar]
  65. Zakrzewski AC, Johnson JM, & Smith JD (2017). The comparative psychology of metacognition. In Call J, Burghardt GM, Pepperberg IM, Snowdon CT, & Zentall T (Eds.), APA Handbook of Comparative Psychology: Perception, Learning, and Cognition (pp. 703–721). American Psychological Association. [Google Scholar]
