Conceptual Anchoring Dissociates Implicit and Explicit Category Learning

J David Smith; Brooke N Jackson; Markie N Adamczyk; Barbara A Church

doi:10.1037/xlm0000856

Author manuscript; available in PMC: 2023 Jun 1.

Published in final edited form as: J Exp Psychol Learn Mem Cogn. 2021 Feb 1;48(6):813–828. doi: 10.1037/xlm0000856


PMCID: PMC8325699 NIHMSID: NIHMS1671738 PMID: 33523691

Abstract

Categorization researchers have long debated the possibility of multiple category-learning systems. The need persists for paradigms that dissociate explicit-declarative category-learning processes (featuring verbalizeable category rules) from implicit-procedural processes (featuring stimulus-response associations lying beneath declarative cognition). The authors contribute a new paradigm, using perfectly matched exclusive-or (XOR) category tasks differing only in the availability or absence of easily verbalizeable conceptual content. This manipulation transformed learning. The conceptual task alone was learned suddenly, by insightful rule discovery, producing explicit-declarative XOR knowledge. The perceptual task was learned more gradually, consistent with associative-learning processes, producing impoverished declarative knowledge. We also tested participants under regimens of immediate and deferred reinforcement. The conceptual task alone was learned through processes that survive the loss of trial-by-trial reinforcement. All results support the idea that humans have perceptual-associative processes for implicit learning, but also an overlain conceptual system that under the right circumstances constitutes a parallel explicit-declarative category-learning system.

Keywords: category learning, implicit cognition, explicit cognition, associative learning, category rules, procedural learning

Introduction

Categorization is an influential area in cognitive science because it is a crucial ability (e.g., Ashby & Maddox, 2011; Knowlton & Squire, 1993; Murphy, 2002; Nosofsky, 1986; Smith et al., 2008; Smith et al., 2016). Humans could have multiple systems or processes to manage the diverse demands of category tasks involving family-resemblance, rule-based, conjunctive, disjunctive, random, or ad hoc categories, just as they have multiple memory systems or processes. Indeed, memory and categorization systems might be meaningfully interdependent (e.g., explicit categorization and declarative memory; implicit categorization and procedural memory).

However, a theoretical countercurrent holds that assuming multiple categorization systems is unparsimonious and unjustified (e.g., Le Pelley et al., 2019). It suggests that a single system can predict the relevant categorization phenomena (e.g., Newell et al., 2010; Nosofsky et al., 2005) if one assumes that tasks vary in difficulty-complexity and that this variation produces empirical phenomena that are mistaken for the operation of separate systems. It is a lasting idea in cognitive science that singleness is elegant and scientifically preferable.

Thus, a sharp theoretical debate has persisted, especially focused on the possibility of dissociating explicit-declarative categorization processes (featuring verbal category rules) from implicit-procedural processes (featuring associations strengthened by processes akin to operant conditioning). Accordingly, the need persists for additional dissociative paradigms to help resolve this debate. Here, we contribute a new paradigm dissociating implicit and explicit categorization systems as described now.

Implicit-Procedural and Explicit-Declarative Categorization

Implicit category learning.

Our approach draws from the neuroscience of categorization (Ashby et al., 1998; Maddox & Ashby, 2004; Seger & Miller, 2010) that distinguishes different neural systems of learning. A hypothesized implicit system is energized by one of the brain’s primary reinforcement mechanisms. It likely grounds skill and habit learning (Mishkin et al., 1984) and learning during instrumental conditioning as well as some forms of discrimination learning and perceptual categorization (Ashby & Ennis, 2006; Barnes et al., 2005; Knowlton et al., 1996; O'Doherty et al., 2004; Seger & Cincotta, 2005). It is allied to various forms of associative learning, though it does not encompass all forms of associative learning (e.g., Pavlovian conditioning). This form of implicit learning occurs gradually and associatively, relying on trial repetition and immediate reinforcement (Maddox et al., 2003; Maddox & Ing, 2005). Participants learning implicitly may not be aware of their category knowledge or able to verbalize it (Ashby et al., 1998; Ashby & Ell, 2001). This implicit system is linked to particular parts of the basal ganglia. For example, extrastriate visual cortex projects to the tail of the caudate nucleus that then projects on to premotor cortex (Alexander et al., 1986). The caudate nucleus is well situated to associate percepts to actions, perhaps its primary role (Rolls, 1994; Wickens, 1993).

Thinking beyond neuroscience, readers will see that this implicit process has a long history within the literature on categorization. It was central to Shepard et al.’s (1961) founding exploration of the six logical classification tasks. They asked whether a unitary, associative mechanism could account for the tasks’ relative difficulties. It was central to Lee Brooks’ (e.g., 1978) seminal research exploring nonanalytic classification, by which participants associate category labels to specific remembered stimuli. It was the basis of exemplar theory and exemplar models (e.g., Nosofsky, 1986; Nosofsky et al., 1994). It dominated the comparative literature on categorization (because nonverbal animals may have only the associative process—Smith et al., 2004). This implicit process creates behavioral equivalence classes, allowing organisms to behave equivalently toward perceptually similar things. One may think of these equivalence classes as “categories” or not, and readers may differ on this point. Nonetheless, the learning of these classes has been central to the broader categorization literature, and therefore this associative process plays a central role in the multiple-systems debate.

Explicit category learning.

An explicit system of learning could be supported by declarative memory. Explicit learning occurs through hypothesis testing reliant on working memory (Fuster, 1989; Goldman-Rakic, 1987) and executive attention (Posner & Petersen, 1990). These cognitive utilities are known to support hypothesis testing and rule formation (e.g., Brown & Marsden, 1988; Robinson et al., 1980). This explicit system should learn quickly, perhaps suddenly through insightful discovery. Participants would construe the task for themselves and develop their own rule to guide performance. Their explicit category knowledge would be held in working consciousness and should generally be verbalizeable. This explicit neural system is likely grounded in the prefrontal cortex, the anterior cingulate gyrus, the head of the caudate nucleus, and the hippocampus (Ashby et al., 1998; Ashby & Ell, 2001).

Various dissociations have been demonstrated between implicit and explicit category learning (Ashby & Valentin, 2017). For example, Smith et al. (2014) used a regimen of deferred reinforcement that is one aspect of the present method. Participants completed a block of trials with no feedback. Then they received their positive outcomes grouped together and following that their negative outcomes grouped together. Reinforcement was displaced in time from trial performance. Knowing which trials were completed correctly was difficult and therefore associatively crediting reinforcement to particular stimulus-response pairs was impossible.

Given this feedback regimen, participants learned matched category structures thought to elicit implicit-procedural and explicit-declarative learning. Smith et al. (2014) observed a striking dissociation. Implicit learning was devastated under deferred reinforcement, because of the difficulty in assigning reinforcement credit already described. But explicit learning remained intact, because participants held in active mind the rule applied during the block, evaluated the rule’s success at block’s end, and kept or replaced it. Smith et al. (2018) found converging dissociative results using a different reinforcement regimen that is also incorporated here.

Empirical Goals

Though these dissociations support the dissociative framework described, that framework is not universally accepted. Therefore, we pursued three empirical goals to develop a distinctive dissociative paradigm that could advance the debate.

One empirical goal was to move this area beyond its dependence on the rule-based (RB) and information-integration (II) tasks that are often used to distinguish explicit and implicit learning (e.g., Ashby & Valentin, 2017). The RB-II dissociative framework is illuminating. We use it (e.g., Smith et al., 2012). However, it causes interpretative problems. One problem is that the nonidentical RB and II tasks are rotations of one another in perceptual space, so that the RB and II category solutions can be construed to have different dimensionalities. This raises questions about the difficulty-complexity of the tasks and can seem to give comfort to a single-system description. So, we sought a pair of implicit/explicit tasks that were identically structured in perceptual space and equated for stimulus dimensionality and other stimulus-to-stimulus relationships. We thought this might give us equivalent complexity and allow a stronger dissociative interpretation.

A second major empirical goal was to combine in our study several of the components that have created dissociative demonstrations in this literature. We hoped in this way to create within one paradigm a particularly strong and meaningful implicit-explicit dissociation. To this end, we studied the backward learning curves that can catch humans having explicit and instantaneous categorization insights. We studied category verbalization to distinguish explicit from implicit learning within a categorization task. We studied alternative reinforcement regimens that were theoretically predicted to differentially affect explicit and implicit learning.

A third major empirical goal was to produce our dissociation in a new way, by varying the kind and level of category knowledge that the implicit and explicit tasks foster. It has been a tacit assumption within the perceptual categorization literature, and within the RB-II area of study, that explicit category solutions take the form of conceptual rules that are often verbalizeable. This raised the possibility that we might produce a dissociation by varying the cognitive affordances of our tasks, so that one task fostered abstract-conceptual cognition more than the other. Then, by a range of converging measures, we could catch the explicit mind in the act of seizing these conceptual affordances. In this respect, the present approach represented constructive outreach from research on perceptual categorization to research on conceptual categorization and higher-level concepts (e.g., Medin & Wattenmaker, 1987; Murphy & Medin, 1985; Wattenmaker et al., 1986). It seems to us that there are possibilities for constructive cooperation and cross talk across the perceptual and conceptual areas of categorization research, though these have largely remained separate in the past. One crucial binding principle might be, for example, that explicit categorization is after all a privileged locus for conceptual and theory-based categorization.

Empirical Approach

We built matched pairs of XOR category tasks. XOR tasks have had a distinguished career in psychology (e.g., Feldman, 2000; Nosofsky et al., 1994; Shepard et al., 1961; Smith et al., 2004). For example, Shepard et al. (1961) discovered that the XOR learning trajectory and error pattern disconfirmed the unitary associative-learning theory they were exploring. Rather, participants seemed to be using rules and dimensional hypotheses, which Shepard et al. thought were likely carried by the explicit symbolic vehicle of language. It is remarkable that the literature on implicit and explicit category learning is still trying—after 60 years—to possibly find its way back to Shepard et al. That the systems debate remains one of the central debates in categorization after six decades underscores the theoretical importance of resolving it.

Our task pairs presented a crucial contrast. Though the tasks were logically identical, with the tasks’ stimuli placed and spaced apart identically in the same two-dimensional space, one task was configured so that it might make conceptual content discoverable as the task unfolded trial by trial. This task presented an abstract-conceptual affordance that could support correct classification. The other task simply had its stimulus values shifted globally (a simple coordinate translation through stimulus space), so that it did not present such an affordance that we (or, as it turned out, participants) could discern. In this latter task, we thought that perceptual appearance and associative learning might dominate. We will refer to the conceptual and perceptual tasks, respectively, to communicate this distinction.

Participants completed tasks under the contrasting reinforcement regimens of deferred and immediate reinforcement (e.g., Smith et al., 2014, 2018). Deferred reinforcement appears to suppress an important kind of reinforcement-based associative learning, while leaving relatively intact participants’ processes of hypothesis testing and explicit rule learning. Thus, this contrast seemed apt to dissociate in an additional way implicit and explicit category-learning processes.

We predicted that 1) the conceptual and perceptual tasks would be learned rapidly and slowly, respectively; 2) the tasks would be learned suddenly and gradually, respectively; 3) only the conceptual task would be robustly learnable under deferred reinforcement; 4) only the conceptual task would elicit clear verbalizations of the category principle; and 5) only the perceptual task (in our view, requiring associative learning) would become essentially unlearnable under deferred reinforcement (in our view, disabling the necessary type of associative learning).

We believe that no single-system description can make these predictions. A single-system description cannot predict both sudden insight learning and slow gradual associative learning (e.g., Smith & Ell, 2015). It cannot explain why the two kinds of category knowledge—perceptual and conceptual—would obey different processing principles. There are not perceptual and conceptual levels in a single system. It cannot explain why deferred reinforcement disrupts one kind of learning but not the other if attentional difficulty is held constant. From the single-system perspective, there must not be two forms of learning. Finally, the single-system viewpoint cannot explain why one form of category knowledge would be conscious and verbalizeable and one not. We acknowledge that specifying the single-system’s predictions is fraught, because its proponents have not clearly defined the single system they endorse. For example, Le Pelley et al. (2019, p. 1408) explicitly acknowledged that they could not provide “rigorous definitions of the interrelated constructs of cognitive complexity, memory demands, and task difficulty” on which the single-system idea depends heavily.

Nonetheless, confirming our five converging positive predictions would strongly suggest that humans have perceptual-associative processes for implicit learning, but also an overlain conceptual system that under the right circumstances constitutes a parallel explicit-declarative system for category learning. In fact, because the explicit and implicit processes considered in this article stand so diametrically opposed across many dimensions of human cognition, they could turn out to represent the clearest possible dissociation between two learning systems.

Experiment 1

Method

Participants.

One hundred and eighty-two Georgia State undergraduates¹ with normal or corrected vision participated for partial course credit. Sessions lasted for 52 minutes or 480 trials. Participants were assigned randomly to a task and reinforcement condition using their sequential participant number. Because of the need to supply different verbal instructions, immediate reinforcement participants (receiving immediate feedback after every trial) and deferred reinforcement participants (receiving deferred feedback only after an ensuing trial had been completed) were tested separately in groups of up to four at a time. Participants’ data were excluded if they completed fewer than 480 trials or did not complete a final questionnaire². The final data set included 40, 31, 39, and 28 participants divided, respectively, among these conditions: conceptual task/immediate reinforcement (CI), conceptual task/deferred reinforcement (CD), perceptual task/immediate reinforcement (PI), and perceptual task/deferred reinforcement (PD). Twelve participants were excluded for completing too few trials (0, 2, 2, and 8, respectively, in the four conditions). Thirty-two participants were excluded for not completing a final questionnaire (4, 15, 3, and 10 participants, respectively). These exclusions were principled and necessary: they let us equate learning experience, analyze completed protocols, and study performance levels in relation to participants’ introspections and verbalized rules.

Stimuli.

The stimuli were red rectangles, varying in width and height, presented on a black background in the computer screen’s top center (Figure 1). There were four stimuli in each task, two contrasting but memorizeable stimuli in each category. Category A stimuli occupied the lower-left and upper-right quadrants of the stimulus space. Category B stimuli occupied the upper-left and lower-right quadrants. These placements honored the tasks’ XOR structure.

The widths and heights of stimuli varied between tasks. In the conceptual task, the Category A stimuli were 65 (width)-44 (height) and 130-88 in screen pixels. These dimensions produced squares on our running screens, given that screen pixels are taller than they are wide. This abstract property of Category A stimuli provided a conceptual entry into this task beyond just memorizing four shapes and their correct category responses. The B stimuli were 65-88 and 130-44 in pixels (respectively, nondescript standing-up and lying-down rectangles).

In the perceptual task, we simply added 100 pixels to the widths of all four stimuli. The A stimuli were 165-44 and 230-88 in pixels. The B stimuli were 165-88 and 230-44. Now all stimuli were lying-down rectangles, highly discriminable and memorizeable as shapes but presumably not conceptually codeable in some additional way that could support correct categorization. Of course, participants might have discovered some conceptual construal of the task that we could not discern. For this reason, the experiment’s results will offer their own comment on this presumption. Even absent such a conceptual cue, the four stimuli were mutually discriminable as shown in Figure 1. They varied by a factor of 3 in area and in shape as well. The stimuli were also easily perceptible, brightly colored, and large.
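The stimulus geometry described above can be checked with a short sketch. The pixel values come from the text; the data layout and the helper names (`CONCEPTUAL`, `shift_width`, `aspect_ratios`, `is_xor`) are illustrative assumptions and not the authors' experiment software. Under those assumptions, the sketch shows that the perceptual set is a pure width translation of the conceptual set, that both sets keep the XOR layout, and that only the conceptual Category A items share a single width-to-height ratio.

```python
from fractions import Fraction

# (width, height) in screen pixels, as listed in the text.
CONCEPTUAL = {
    "A": [(65, 44), (130, 88)],
    "B": [(65, 88), (130, 44)],
}


def shift_width(stimuli, dx):
    """Translate every stimulus by dx pixels along the width dimension."""
    return {cat: [(w + dx, h) for (w, h) in items] for cat, items in stimuli.items()}


def aspect_ratios(items):
    """Set of exact width:height ratios for a list of (width, height) pairs."""
    return {Fraction(w, h) for (w, h) in items}


def is_xor(stimuli):
    """True if A holds the low/low and high/high corners and B the other two."""
    everything = stimuli["A"] + stimuli["B"]
    widths = sorted({w for w, _ in everything})
    heights = sorted({h for _, h in everything})
    if len(widths) != 2 or len(heights) != 2:
        return False
    a_corners = {(widths[0], heights[0]), (widths[1], heights[1])}
    b_corners = {(widths[0], heights[1]), (widths[1], heights[0])}
    return set(stimuli["A"]) == a_corners and set(stimuli["B"]) == b_corners


# The perceptual task adds 100 pixels to every width.
PERCEPTUAL = shift_width(CONCEPTUAL, 100)

if __name__ == "__main__":
    print(PERCEPTUAL)
    print(is_xor(CONCEPTUAL), is_xor(PERCEPTUAL))
    print(aspect_ratios(CONCEPTUAL["A"]), aspect_ratios(PERCEPTUAL["A"]))
```

A shared ratio is what a "square" construal would exploit; after the translation the two Category A items no longer share one, which is consistent with the claim that the shifted task offers no comparable shape-based cue.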

Categorization trials.

On each trial, the rectangle that was to be categorized appeared at the screen’s far right. Leftward were the large letters “A” and “B”, with a participant-controlled cursor between them. Participants responded by pressing the S and L keys, which were labelled A and B, respectively, and corresponded spatially to the two screen icons. Top and Bottom trials were displayed at the screen’s top and bottom, for reasons to be explained.

Reinforcement regimens.

Our crucial manipulation was to disrupt the normal cycle of immediate reinforcement following response. In the immediate reinforcement conditions, this cycle was sustained. Participants saw a stimulus, categorized it, and received immediate reinforcement. After correct responses, they saw Correct +1 Points Total Points N+1. After incorrect responses, they saw Incorrect −1 Points Total Points N-1. In the latter case they received a brief penalty timeout.

In the deferred reinforcement conditions, feedback was deferred as follows. Following a Bottom trial, participants received deferred feedback regarding the previous Top trial. For example, they might see (given a correct response) presented at the top of the screen in the position for Top trial feedback, Last Trial Correct +1 Points Total Points N+1. Or, following a Top trial, they might see (given an incorrect response) presented at the bottom of the screen in the position for Bottom trial feedback, Last Trial Incorrect −1 Points Total Points N-1. In the latter case they received a brief penalty timeout. The next trial then followed.

The deferred feedback was positioned spatially to show to which trial the feedback pertained—this was the purpose of the alternating Top and Bottom trials (see Figure 2 for an example). This reinforcement did not concern a presently available stimulus, or the most recent stimulus presentation, or the most recent behavioral response. Thus, associative learning was disrupted representationally, temporally, and behaviorally.
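The one-trial lag described above can be made concrete with a small event-ordering sketch. Everything here is a hypothetical illustration: the function names, the event tuples, the assumption that sessions start with a Top trial, and the assumption that the final trial receives no feedback are not taken from the authors' software.

```python
def position(trial):
    """Screen position of a 1-indexed trial (assumes the session starts with a Top trial)."""
    return "Top" if trial % 2 == 1 else "Bottom"


def immediate_schedule(n_trials):
    """Each response is followed at once by feedback about that same trial."""
    events = []
    for t in range(1, n_trials + 1):
        events.append(("stimulus", t, position(t)))
        events.append(("feedback", t, position(t)))
    return events


def deferred_schedule(n_trials):
    """Feedback lags one trial behind and is drawn at the earlier trial's position."""
    events = []
    for t in range(1, n_trials + 1):
        events.append(("stimulus", t, position(t)))
        if t > 1:
            # Feedback shown after trial t pertains to trial t - 1.
            events.append(("feedback", t - 1, position(t - 1)))
    return events


if __name__ == "__main__":
    for event in deferred_schedule(4):
        print(event)
```

In the deferred listing, every feedback event refers to a stimulus that is no longer on screen and to a response that is no longer the most recent one, which is the disruption the paragraph above describes.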

Instructions: immediate reinforcement condition.

Participants were told they would categorize boxes as Category A or B, that A and B boxes would occur equally often, and that they would have to guess at first but could learn to respond correctly. They were told that even though the boxes alternated top and bottom on the screen, this had nothing to do with their Category A or B status. They knew they would gain or lose points for correct and incorrect responses, respectively, and that they would receive a timeout for errors. They were told that errors would cost them points, and time to earn points, and possibly make their session longer. They were told they would receive immediate feedback after each trial.

Instructions: deferred reinforcement condition.

The instructions were similar, excepting the description of reinforcement. Participants were told that after Trial 2 they would receive feedback from Trial 1, and so forth, with feedback always lagging one trial behind. They were told that even though the boxes alternated top and bottom on the screen, this had nothing to do with their Category A or B status, but was done to help them keep track of whether the feedback they received applied to a top or bottom trial.

Debriefing phase.

An additional aspect of our method was to collect participants’ explicit reports of their task construals, especially in the conceptual tasks that allowed a conceptual construal. Therefore, we gave them a debriefing questionnaire containing these items:

Why did you use Response A in this task? That is, what WAS a Type A Stimulus?

Why did you use Response B in this task? That is, what WAS a Type B Stimulus?

These let us analyze participants’ declarative understanding of the categories they had learned.

Analytic methods: questionnaires.

The questionnaire items were independently blind coded by two research assistants in our laboratory. They were instructed to give a questionnaire a 1 or 2 if the participant indicated that they simply used either the height or the length of the box to categorize. They were to give a questionnaire a 3 if the participant said “A’s were squares and B’s rectangles” (the correct verbalization of the conceptual task), or a 4 if they said the equivalent of “A’s are shorter and narrower or taller and wider; B’s are taller and narrower or shorter and wider” (the correct verbalization of the perceptual task). In essence, the latter verbalization simply described the four stimuli in an item-specific manner, because in this case there is no constructive conceptual reframing. Codes 5 and 6 were used for explanations like those for codes 3 and 4 but flipping which were A’s and B’s. Raters used code 7 for anything else. A third rater provided back-up ratings that were used only to resolve the rare disagreements between the two students, who were blind to each other’s ratings and also blind to the level of performance achieved by the participant who completed the questionnaire. Interrater reliability between the first two raters was 85.5% agreement, Cohen’s κ = .777. This approach to coding participant reports, using a third rater to break ties, was somewhat conservative regarding testing our hypotheses, because it ensured that we retained for analysis all of the difficult and uncertain reports that caused our original raters difficulty. Finally, a fourth blind rater coded approximately 25% of the data, and showed a concordance of 91.43% agreement, Cohen’s κ = .863, with the final coding.
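For readers unfamiliar with the two reliability indices reported above, the following is a minimal sketch of how percent agreement and Cohen's κ are conventionally computed from two raters' codes. The rating vectors are invented for illustration (they are not the study's data), and the function names are assumptions.

```python
from collections import Counter


def percent_agreement(rater_1, rater_2):
    """Proportion of items that the two raters coded identically."""
    return sum(a == b for a, b in zip(rater_1, rater_2)) / len(rater_1)


def cohens_kappa(rater_1, rater_2):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_1)
    p_o = percent_agreement(rater_1, rater_2)
    counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
    # Chance agreement: product of the raters' marginal proportions, summed over codes.
    p_e = sum(counts_1[c] * counts_2[c] for c in set(counts_1) | set(counts_2)) / (n * n)
    if p_e == 1:
        return 1.0  # Both raters used one identical code throughout.
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes on the 1-7 scheme described in the text (not real data).
RATER_1 = [3, 3, 4, 7, 1, 3, 4, 7, 3, 2]
RATER_2 = [3, 3, 4, 7, 1, 3, 7, 7, 3, 1]

if __name__ == "__main__":
    print(percent_agreement(RATER_1, RATER_2))
    print(cohens_kappa(RATER_1, RATER_2))
```

Because κ discounts the agreement expected from the raters' marginal code frequencies, it is always at or below raw agreement when agreement exceeds chance, which matches the ordering of the two figures reported in the text.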

Results

Accuracy analyses.

Figure 3 presents proportions of correct responses for each condition by 20-trial block. Table 1 presents the means and standard deviations of accuracy overall and in the last 100 trials. The data from Figure 3 were entered into a three-way general linear model (GLM) with task (conceptual, perceptual) and reinforcement (immediate, deferred) as between-participant factors and trial block as a within-participant factor. The significant main effects of task, F (1, 134) = 46.265, p < .001, η_p² = .257, and reinforcement, F (1, 134) = 36.109, p < .001, η_p² = .328, showed that participants performed better in the conceptual task and the immediate condition. The significant main effect of trial block, F (23, 3082) = 22.770, p < .001, η_p² = .145, confirmed that learning occurred. The significant interactions between task and reinforcement, F (1, 134) = 3.930, p = .049, η_p² = .028, block and reinforcement, F (23, 3082) = 4.064, p < .001, η_p² = .033, and the three-way, F (23, 3082) = 3.930, p < .001, η_p² = .029, suggested that reinforcement differentially affected performance across tasks. The nonsignificant interaction between task and block, F < 2, suggested that the learning trajectory was similar in the two tasks. Parallel GLM analyses were also conducted adding in the participants dropped for missing questionnaires, as requested by an interested reviewer. These found the same pattern of results except that the task by block interaction became significant, F (23, 3818) = 2.874, p < .001, η_p² = .017, with larger N’s in each group (CI = 46, CD = 44, PI = 42, PD = 38).
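As a reading aid for the effect sizes above, partial eta squared can be recovered from an F ratio and its degrees of freedom through the standard identity η_p² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to three of the reported tests; the function name is an assumption, and the text does not state how the authors' software computed these values.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # Main effect of task, F(1, 134) = 46.265
    print(round(partial_eta_squared(46.265, 1, 134), 3))
    # Main effect of trial block, F(23, 3082) = 22.770
    print(round(partial_eta_squared(22.770, 23, 3082), 3))
    # Task by reinforcement interaction, F(1, 134) = 3.930
    print(round(partial_eta_squared(3.930, 1, 134), 3))
```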

*Note:* CI = Conceptual Immediate, CD = Conceptual Deferred, PI = Perceptual Immediate, PD = Perceptual Deferred.

Table 1.

Means and standard deviations (SD) of proportion correct across all trials and in the last 100 trials of each condition.

Condition	N	All Mean	All SD	Last 100 Mean	Last 100 SD
Conceptual Immediate	40	.954	.090	.979	.079
Conceptual Deferred	31	.795	.209	.838	.235
Perceptual Immediate	39	.828	.163	.907	.157
Perceptual Deferred	28	.566	.127	.530	.158


To clarify the three-way interaction, we conducted three two-factor GLMs with task and reinforcement as between-participant factors. One used the difference score between the first and last block as the dependent variable to encompass the whole learning trajectory. The other two used the first and last blocks separately as the dependent measure to explore the early or late focus of the trajectory differences. When examining the learning scores, there was a significant main effect of reinforcement, F (1, 134) = 18.200, p < .001, η_p² = .120, reflecting more learning with immediate reinforcement. The crucial significant interaction, F (1, 134) = 14.416, p < .001, η_p² = .097, reflected sharply reduced learning in the PD condition (all other Fs < 1). Planned comparisons found that immediate and deferred learning levels were not statistically distinguishable in the conceptual task, t (69) = .306, p = .760, d = .071, but were significantly different in the perceptual task, t (65) = 6.385, p < .001, d = 1.662. The very high performance (.907 correct) shown by participants in the last 100 trials of the PI condition confirms the full discriminability of the stimuli in that task and their consequent individual memorizeability.

Separate analyses of the first and last blocks found significant main effects of task, F (1, 134) = 30.528, p < .001, η_p² = .186, F (1, 134) = 28.577, p < .001, η_p² = .176 (first and last block, respectively) and reinforcement, F (1, 134) = 4.713, p = .032, η_p² = .034, F (1, 134) = 63.145, p < .001, η_p² = .320, indicating that conceptual tasks and immediate reinforcement produced faster and more total final learning. (Indeed, this speed advantage was expressed even in the elevated correct proportions for participants in Block 1, some of whom were already discovering the conceptual rule.) There was a significant interaction between task and reinforcement only in the last block, F < 2, F (1, 134) = 13.443, p < .001, η_p² = .091, reflecting bigger differences in final learning between the reinforcement conditions in the perceptual condition, but less differential effect in initial learning. Planned comparisons of immediate versus deferred reinforcement in each task (Conceptual and Perceptual, respectively) during the first block, t (69) = 2.027, p = .047, d = .476, t (65) = .914, p = .364, d = .229, and the last block, t (69) = 3.162, p = .002, d = .723, t (65) = 7.862, p < .001, d = 1.907, supported this interpretation.

Backward learning curves.

Following Smith et al. (2014) and Smith and Ell (2015), we examined the suddenness of learning in the different conditions. We defined learners as those who at any point completed three consecutive 20-trial blocks with 0.95 accuracy. In the four conditions (CI, CD, PI, PD), there were 39, 23, 31, and 3 learners. Chi-square analyses found that the proportion of learners in the four conditions differed significantly from the proportions expected if there were no relationship between condition and learning, χ² (3, N = 138) = 61.443, p < .001; w = .667. Parallel chi-square analyses including the participants dropped for missing questionnaires found the same significant result with a greater number of learners (CI = 43, CD = 34, PI = 34, PD = 4).
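The learner counts and the effect size above can be related with a brief sketch. It assumes the conventional definition of Cohen's w for a chi-square test, w = √(χ² / N); the dictionary and function names are illustrative, and the counts are those given in the text.

```python
import math

# Final sample sizes and learner counts per condition, as reported in the text.
N_PER_CONDITION = {"CI": 40, "CD": 31, "PI": 39, "PD": 28}
LEARNERS = {"CI": 39, "CD": 23, "PI": 31, "PD": 3}


def learner_proportions(learners, totals):
    """Proportion of participants in each condition who met the learning criterion."""
    return {cond: learners[cond] / totals[cond] for cond in totals}


def cohen_w(chi_square, n):
    """Effect size w for a chi-square statistic (conventional definition, assumed here)."""
    return math.sqrt(chi_square / n)


if __name__ == "__main__":
    for cond, prop in learner_proportions(LEARNERS, N_PER_CONDITION).items():
        print(cond, round(prop, 3))
    print(round(cohen_w(61.443, sum(N_PER_CONDITION.values())), 3))
```

The per-condition proportions make the pattern behind the test visible: nearly every CI participant reached criterion, whereas only about one in nine PD participants did.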

Next, we studied the trajectory by which participants arrived at criterion. To do so, we created backward learning curves (BLCs). That is, we aligned the trial blocks at which learners reached criterion, and we examined performance levels backward and forward from that point. Readers should note the following idiosyncrasy of BLCs. Participants who reached criterion very quickly in the task will have fewer pre-criterion data points but more post-criterion data points. For late learners, the pattern is reversed. Thus, different participants and different numbers of participants are captured by different data points in a BLC graph. The alignment of all participants’ performance at the start of the criterion run lets one see the consensual pathway toward criterion taken by learners. Without alignment, the arrival at criterion would fall in many different blocks for many different participants, and any consensual pathway would be muddied away. Figure 3 presented the data in this unaligned way; this is the idiosyncrasy of forward learning curves.
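The alignment procedure just described can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' analysis code: the function names are invented, the criterion is read as three consecutive blocks at or above .95, Block 0 is taken to be the first block of the criterion run, and the accuracy sequences are made-up examples.

```python
def criterion_block(block_accuracy, run=3, threshold=0.95):
    """Index of the first block of the first criterion run, or None for non-learners."""
    for start in range(len(block_accuracy) - run + 1):
        if all(acc >= threshold for acc in block_accuracy[start:start + run]):
            return start
    return None


def backward_learning_curve(participants, run=3, threshold=0.95):
    """Map block offset (0 = criterion block) to (mean accuracy, number of learners)."""
    sums, counts = {}, {}
    for block_accuracy in participants:
        crit = criterion_block(block_accuracy, run, threshold)
        if crit is None:
            continue  # Non-learners contribute nothing to a BLC.
        for index, acc in enumerate(block_accuracy):
            offset = index - crit
            sums[offset] = sums.get(offset, 0.0) + acc
            counts[offset] = counts.get(offset, 0) + 1
    return {off: (sums[off] / counts[off], counts[off]) for off in sorted(sums)}


# Hypothetical per-block accuracies for three participants (not real data).
EARLY_LEARNER = [0.50, 1.00, 1.00, 1.00, 1.00]
LATE_LEARNER = [0.50, 0.60, 0.50, 0.95, 1.00, 1.00]
NON_LEARNER = [0.50, 0.55, 0.60, 0.50]

if __name__ == "__main__":
    curve = backward_learning_curve([EARLY_LEARNER, LATE_LEARNER, NON_LEARNER])
    for offset, (mean_acc, n) in curve.items():
        print(offset, round(mean_acc, 3), n)
```

The learner count attached to each offset shows the idiosyncrasy noted above: offsets far before criterion are supplied only by late learners, and offsets far after it only by early learners.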

Illustrating the BLC analysis, Figure 4 shows BLCs for each condition. The solid black circles show the performance of 39 learners in the CI condition. Performance transformed at the point of the criterion run. In the 5 blocks before that point, participants averaged .550 correct, essentially chance performance. In the 5 blocks after criterion, they averaged .992 correct (a sudden improvement of .442, or about 9 more correct responses per 20-trial block) and sustained that level going forward. We believe that this pattern is consistent with, and only consistent with, the sudden discovery of a conceptual category rule (e.g., Smith & Ell, 2015). No operant associative-learning mechanism produces a learning curve of this character. This pattern of instantaneous, insightful rule discovery diagnoses uniquely the explicit category-learning processes that for some reason have remained so controversial within the categorization literature. Shortly we will add verbalizations (declarative category knowledge) into the mix.

*Note:* A. CI and CD conditions. B. PI and PD conditions. We aligned the trial blocks at which participants reached criterion (Block 0), to show the path by which they solved their task. C = Conceptual; P = Perceptual; I = Immediate Reinforcement; D = Deferred Reinforcement.

From this conclusion follow predictions for the CD condition. First, because testing hypotheses is more complicated under deferred reinforcement, we expected fewer criterial learners. There were 23 learners, not 39 as before. Nonetheless, we expected to see the same saltatory leap of rule discovery among the successful learners. And we observed just that (.552 correct in the 5 blocks before criterion, .986 correct in the 5 blocks after criterion, a transition of .434; Figure 4A, grey triangles).

In contrast, in the PI condition, we expected rule discovery to occur less strongly because the conceptual content was less discoverable given the nature of the rectangle stimuli. However, under immediate reinforcement, participants could still learn associatively, gradually building up response strengths binding correct responses to the four specific stimuli. Or, they could use exemplar memorization to connect the 4 discriminable stimuli to correct responses. It is good to remember that implicit learners may well not be learning the formal XOR concept, or even a differentiation between two coherent categories, but rather learning associative connections from specific stimuli to correct response choices. In fact, we did have 31 learners in this condition. But learning obeyed a different dynamic (black circles in Figure 4B). Now there was a more gradual improvement in performance approaching the criterion run, as befits the gradual strengthening of correct stimulus-response mappings (.726 correct, 5 blocks before criterion, .976 correct, 5 blocks after criterion, a transition of .251). More precisely, performance increased from 0.777 in the last block before criterion to 0.953 at the criterion block itself. This was a performance change of .176, a minimal transition (i.e., fewer than 4 more correct responses per block) compared to the conceptual conditions. We also confirmed these differences statistically through comparisons of the corrected³ criterion scores (Block 0 minus Block −1) for learners in each task. We found that, when compared to the perceptual immediate condition, both the conceptual immediate, t(68) = 5.201, p < .001, d = 1.242, and the conceptual deferred conditions, t(52) = 4.772, p < .001, d = 1.318, showed significantly greater jumps in learning.
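The comparison of jump scores can be sketched as follows. This is a minimal sketch assuming a standard independent-samples t test with pooled variance and Cohen's d computed from the pooled standard deviation. The text does not state which variant was used, and the correction applied to the criterion scores (footnote 3) is not reproduced here. The function names and the toy scores are hypothetical. The reported degrees of freedom are consistent with n1 + n2 − 2 for the learner counts given in the text (39 + 31 − 2 = 68; 23 + 31 − 2 = 52).

```python
from math import sqrt
from statistics import mean, variance

def jump_score(blocks, criterion_index):
    """Block 0 minus Block -1 for one learner (uncorrected)."""
    return blocks[criterion_index] - blocks[criterion_index - 1]

def pooled_t_and_d(a, b):
    """Independent-samples t (equal variances assumed), Cohen's d, and df."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    diff = mean(a) - mean(b)
    t = diff / sqrt(pooled_var * (1 / n1 + 1 / n2))
    d = diff / sqrt(pooled_var)
    return t, d, df

# Hypothetical jump scores (NOT the study's data).
conceptual_jumps = [0.40, 0.50, 0.45, 0.35]
perceptual_jumps = [0.10, 0.20, 0.15, 0.25]
t_value, d_value, df_value = pooled_t_and_d(conceptual_jumps, perceptual_jumps)
```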

Moreover, some of the small transition in the PI condition is artifactual. The criterial blocks must contain high performance levels. They cannot contain any block of 16/20 (80%) or lower, for then criterion is unreachable. Such a block would be relegated to the phase before criterion, creating an artifactual pre-post separation. Similarly, the block just before criterion essentially cannot contain any block of 17/20 (85%) or higher, or this block would just serve to define criterion earlier. The artifactual separation is accentuated. Smith and Ell (2015) discussed this selective sampling aspect of BLCs in detail.
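The size of this selection artifact can be illustrated with an exact binomial calculation. The sketch below assumes a learner whose true competence is constant (no learning at all) and asks what a 20-trial block looks like on average once it is sorted by the 17/20 boundary described above. The competence value of 0.85 and the function names are hypothetical choices for illustration only.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k correct responses in n trials at competence p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def conditional_mean_accuracy(p, lo, hi, n=20):
    """Expected proportion correct in an n-trial block, given the count lies in [lo, hi]."""
    ks = range(lo, hi + 1)
    mass = sum(binom_pmf(k, n, p) for k in ks)
    return sum(k * binom_pmf(k, n, p) for k in ks) / (n * mass)

competence = 0.85  # hypothetical constant competence, with no change over time

# Blocks that qualify for the criterion run must score 17/20 or better.
inside_run = conditional_mean_accuracy(competence, 17, 20)
# The block just before criterion must score 16/20 or worse.
just_before = conditional_mean_accuracy(competence, 0, 16)
```

Even with unchanging competence, `inside_run` exceeds the true competence and `just_before` falls below it, so aligning on criterion produces some pre-post separation by selection alone.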

Simulations.

We used formal modeling to examine our results from this perspective. We placed simulated participants into a 100-block “task”, with performance governed by a pre-criterion and post-criterion underlying competence that we could let vary systematically from 0.5 to 1.0. In our taskless simulation, correct performance was simply determined by the throw of a 100-sided die weighted by the operative competence (e.g., 80 successes in 100 for a 0.80 competence). Given this framework, we could isolate the simulants that 1) could reach the defined criterion used in the actual experiment; 2) could produce the performance we observed before criterion; and 3) could produce the performance we observed after criterion. In this way, we could ask what underlying competence levels our real participants likely had at different points in the task.
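A simulation of this general kind can be sketched as follows. This is a hedged illustration under stated assumptions, not the simulation code used in the study. The 100 blocks and 20 trials per block come from the text. Everything else is assumed for illustration: the switch from pre- to post-criterion competence at a fixed `switch_block`, a criterion defined as `run_length` consecutive blocks of at least `min_correct` correct, and a post-criterion window that starts at the criterion block itself. All names are hypothetical.

```python
import random

TRIALS_PER_BLOCK = 20  # block size stated in the text
N_BLOCKS = 100         # task length stated in the text

def simulate_blocks(pre, post, switch_block, rng):
    """Per-block correct counts; competence is `pre` before switch_block, `post` from it on."""
    counts = []
    for b in range(N_BLOCKS):
        p = pre if b < switch_block else post
        counts.append(sum(rng.random() < p for _ in range(TRIALS_PER_BLOCK)))
    return counts

def first_criterion_block(counts, min_correct=17, run_length=3):
    """Index of the first block opening a qualifying run, or None (assumed criterion)."""
    for start in range(len(counts) - run_length + 1):
        if all(c >= min_correct for c in counts[start:start + run_length]):
            return start
    return None

def pre_post_means(counts, crit, window=5):
    """Mean proportion correct in the blocks before criterion and from criterion onward."""
    before = counts[max(0, crit - window):crit]
    after = counts[crit:crit + window]
    to_prop = lambda xs: sum(xs) / (len(xs) * TRIALS_PER_BLOCK) if xs else None
    return to_prop(before), to_prop(after)

# One simulated participant with hypothetical competences near the CD pattern.
rng = random.Random(0)
counts = simulate_blocks(pre=0.55, post=0.99, switch_block=30, rng=rng)
crit = first_criterion_block(counts)
```

Sweeping `pre` and `post` over a grid from 0.5 to 1.0 and keeping only the simulants whose pre- and post-criterion means match the observed values would give one way to map out the competence levels compatible with the data.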

Figure 5A shows the result of our simulation for the CD condition we have already discussed. This graph illustrates all the levels of competence before and after criterion that can produce the data pattern we observed (i.e., .552 and .986 before and after criterion). Competence before criterion must be low (maximum 0.69). Competence after criterion must be extremely high (minimum 0.90). There is a minimum gap of 0.21 between these. This inherent gap represents the pure leap of insight or competence that we are considering in this article. The formal simulation confirms that this gap must exist, just as did the extensive simulations in Smith and Ell (2015).

*Note:* A. CD condition. B. PI condition. For example (A), these participants had a maximum underlying competence of .69 before criterion and a minimum competence of .90 after reaching criterion. There was an instantaneous increase of at least .21 in this transition.

Figure 5B shows in the same way the result of our simulation for the PI condition. This graph illustrates all the levels of competence before and after criterion that can produce the data pattern we observed (i.e., .726 and .976 before and after criterion). Competence after criterion can sink down into the mid 0.80’s and still produce what we observed in that phase. But competence before criterion in the mid 0.80’s can also predict what we observed in that phase. Thus, this simulation is consistent with the possibility that in the PI condition there was not any intrinsic leap at all. Only in the conceptual conditions must one conclude that there is a sudden leap in competence. Smith and Ell (2015) showed that gradualistic associative models cannot accommodate these sudden performance improvements.

Finally, we point out that the results from the PD condition also fit our predictions. Our theoretical perspective suggested that the perceptual task would deny participants the explicit learning process that sometimes survives deferred reinforcement. And the deferred reinforcement would deny participants the complementary implicit-associative process. Learning should have collapsed, and it did. There were 3 criterial learners, about 10% of the sample. In the aggregate there was almost no learning at all.

Verbalizations.

Fifty-nine of 71 participants (83%) were coded to have correctly described the conceptual condition. In the CI condition, with 40 total participants and 39 strong learners, 38 participants (all of them learners) stated the task’s conceptual grounding. There was essentially a perfect concordance between successful learning and the task’s conceptual construal. In the CD condition, with 31 total participants and 23 strong learners, 21 participants (all of them learners) reported the task’s conceptual grounding. All non-learners failed to report this construal. There was a nearly perfect concordance between learning and a conceptual declaration again. In both conceptual conditions, participants’ explicit-declarative category knowledge accounted extremely sensitively for their performance and for the suddenness of their rule discovery. Thus, the verbalizations were also consistent with the theoretical interpretation that explicit-declarative learning characterized the conceptual conditions. These participants discovered the alternative conceptual reframing afforded by the task.

On the other hand, only 12 of the 67 participants in the perceptual tasks (18%) were coded to have correctly described the perceptual condition. This percentage is less than one quarter of the percentage we observed in the conceptual conditions. Moreover, of those 12, five showed their knowledge by drawing (not verbalizing) it. The others simply described more or less fully all four specific stimuli that appeared within the task. This confirms our sense as we prepared the experiment that the perceptual task would not provide to participants an alternative abstract-conceptual route to reframing the task, so that they would need to fall back on memorizing the four stimuli. These participants apparently did not discover an alternative conceptual reframing of their task. In the immediate reinforcement condition, with 39 total participants and 31 strong learners, 10 were coded as correctly describing the task (i.e., correctly describing the four stimuli presented under the correct category label). Of those, eight were learners (2 nonlearners) and four drew rather than verbalized their description. In the deferred reinforcement condition, with 28 total participants and 3 strong learners, 2 of those strong learners were coded as correctly describing the XOR rule. One drew the stimuli, and one verbalized their correct description. Learning in the perceptual task was thus more visual and more about specific-item associations to the four stimuli. The availability of these specific-item associations confirms in another way the full individual discriminability of the stimuli in the perceptual task.

Experiment 2

Experiment 2 explored these phenomena in a procedure different in three respects. First, we used a different stimulus domain, involving multiple circle stimuli to be judged relationally, instead of one rectangular stimulus. Second, we adopted a different reinforcement regimen, one involving deferred reinforcement delivered only after the completion of each six-trial block. Third, the available conceptual construal was now about the abstract conceptual relation (Same or Different) between two stimuli, rather than a shape label given one stimulus. In many other respects, the methods were like those in Experiment 1 and so we only note the points of contrast.