Category learning can alter perception and its neural correlates

Fernanda Pérez-Gay Juárez; Tomy Sicotte; Christian Thériault; Stevan Harnad

doi:10.1371/journal.pone.0226000

2019 Dec 6;14(12):e0226000.

Editor: Panos Athanasopoulos

PMCID: PMC6897555 PMID: 31810079

Abstract

Learned Categorical Perception (CP) occurs when the members of different categories come to look more dissimilar (“between-category separation”) and/or members of the same category come to look more similar (“within-category compression”) after a new category has been learned. To measure learned CP and its physiological correlates we compared dissimilarity judgments and Event Related Potentials (ERPs) before and after learning to sort multi-featured visual textures into two categories by trial and error with corrective feedback. With the same number of training trials and feedback, about half the subjects succeeded in learning the categories (“Learners”: criterion 80% accuracy) and the rest did not (“Non-Learners”). At both lower and higher levels of difficulty, successful Learners showed significant between-category separation—and, to a lesser extent, within-category compression—in pairwise dissimilarity judgments after learning, compared to before; their late parietal ERP positivity (LPC, usually interpreted as decisional) also increased and their occipital N1 amplitude (usually interpreted as perceptual) decreased. LPC amplitude increased with response accuracy and N1 amplitude decreased with between-category separation for the Learners. Non-Learners showed no significant changes in dissimilarity judgments, LPC or N1, within or between categories. This is behavioral and physiological evidence that category learning can alter perception. We sketch a neural net model predictive of this effect.

Introduction

The linguists Sapir (1929) and Whorf (1940; 1956) suggested that the language we speak shapes the way we see the world. According to this “linguistic relativity” hypothesis, it is learning to put things into different categories by giving them different names that makes them look more distinct to us, rather than vice versa [1]: for example, the rainbow looks to English-speakers as if it were composed of qualitatively distinct color bands because of the way English subdivides and names the visible wavelengths of light; the different shades of green all look like greens rather than blues because in English we learn to call them “green” rather than “blue”. In languages that use the same word for green and blue (the equivalent of “grue” [2]), the speakers would see only one qualitative “grue” band in the rainbow, instead of a green and a blue one.

It has turned out, however, that qualitative color categories are not perceptual effects induced by category naming. The anthropologists Berlin and Kay showed that basic color perception is universal, irrespective of the names and subdivisions assigned by different languages [3]. Visual neurophysiology has confirmed that the colors we see and the boundaries between them are determined by inborn neural feature-detectors: The cones in our retinas are selectively tuned to the red, green and blue regions of the frequency spectrum and our visual cortex has color-sensitive neurons responsible for paired red/green and blue/yellow opponent processes [4–6]. Hence the perceived qualitative differences among colors are not the result of language but an inborn consequence of Darwinian evolution. Is this enough to demonstrate that Whorf and Sapir were wrong about the effects of naming on perception? To answer this question we must consider an activity more basic than naming, and a prerequisite for it: categorization.

To categorize is to do “the right thing with the right kind of thing”: responding to things differentially, manipulating them adaptively, sorting them into groups and giving them different names [7,8]. According to the “classical view” of categorization [9], what determines whether something is or is not a member of a category is the features that covary with membership in the category: present in members, absent in non-members. Features are initially sensory properties of things, such as size, color, shape, loudness or odor.

Categorical perception (CP) is a perceptual phenomenon in which the members of different categories are perceived as more dissimilar (between-category separation) and/or the members of the same category are perceived as more similar (within-category compression) than would be expected on the basis of their physical features alone [10–12]. The rainbow effect in color perception is actually a striking example of CP: The wave-length difference between a blue and a green looks much bigger than an equal-sized wave-length difference between two shades of blue within the blue band. Color CP, however, is, as noted above, dependent on inborn feature-detectors and hence not directly related to language or learning. To test the Whorf-Sapir hypothesis the right question to ask is: what happens with the categories that we have to learn through experience?

If we open a dictionary, we encounter mostly names of categories that we had to learn through either direct experience or verbal instruction [13,14]. It is very unlikely that we were born with innate detectors for all these categories. If, as suggested by the classical view of categorization, we need to detect the features that distinguish category members from nonmembers so that we can do the right thing with the right kind of thing, then with categories for which we have no inborn feature-detectors our brains need to learn to detect the features [15,16].

Many categories are obvious, or almost obvious: The differences between members and non-members already pop out. There is no need for learned CP separation/compression to distinguish zebras from giraffes: Their prominent natural difference in shape and color is enough. But the obvious similarities and differences in the sensory appearances of things are not always enough to guide us as to what to do with what—at least not for some categories, and not immediately. For categories whose covarying features are harder to detect (rather than evident upon repeated exposure without corrective feedback), learning to categorize may be more challenging and time-consuming.

Learned CP occurs when category learning induces between-category separation and/or within-category compression (the category boundary effect). This effect is not based on comparing perceived differences between and within categories for equal sized physical differences, as with colors and phonemes. It is based on comparing perceived differences between and within categories before and after having learned the categories.

It is important to distinguish CP induced by category learning from increased distinctiveness overall induced by mere repeated exposure, without corrective feedback [17]. Classical acquired distinctiveness of stimuli (not categories), induced by repeated exposure [18] (unsupervised/unreinforced learning based on feature and feature-co-occurrence frequencies) makes all stimuli look more distinct from one another, like an expanding universe. Acquired distinctiveness of categories (comparing stimuli within and between categories) can arise through unsupervised learning if the categories are already well separated in sensory space by salient natural sensory discontinuities (like mountains vs plains vs valleys). But if the features distinguishing the categories are not already salient, trial-and-error with error-corrective feedback (supervised learning) is necessary to learn them. With acquired distinctiveness between categories (and acquired similarity within categories) the outcome is not the result of an expanding-universe effect in which all stimuli become more distinct from one another; instead, stimuli become more distinct from one another if they are in different categories (and sometimes they also become more similar to one another if they are in the same category).

Hence learning some categories does not generate category-specific CP, but merely an overall increase in all interstimulus distances [19–23] whereas learning other categories does generate CP [24–31]. Studies vary in the stimuli and tasks they use and how they measure CP effects (e.g., via similarity judgments, psychophysical discriminability or electrophysiological correlates); and CP effect-sizes vary considerably across studies in the relative degree of separation or compression they induce [32]. But the learned CP effect itself seems to be real. The question is: what factors induce it, and why?

Most authors attribute learned CP effects to feature-detection. A variety of psychophysical studies have shown that learning the features (or dimensions) relevant to category membership increases perceptual and attentional sensitivity to those features, resulting in easier detection [29,33]. The feature-detector may act as a filter, altering the perceived similarity between and within categories to make category members “pop out” [34–36] so that we can reliably go on to do the right thing with them.

Some learned CP effects are still open to the interpretation that they are not perceptual changes but a response bias from having learned to name the category (“naming bias” or “category label bias”): a tendency to judge things as less similar when their names are different and more similar when their names are the same [37–39]. One way to test whether CP effects are perceptual or verbal is to analyze brain activity during category learning.

Recent research in visual neuroscience suggests that early perceptual systems are not hardwired; they can be tuned by several types of information, including attention, expectation, perceptual tasks, working memory and motor commands [40]. These modifiable properties become important in extracting relevant information from the environment.

Among the learned CP studies cited above, some were accompanied by neuroimaging or electrophysiological analyses that detected neural changes induced by training. In a series of experiments, Sigala and Logothetis [41] have found that neurons in the inferior temporal cortex of monkeys could be selectively tuned to dimensions diagnostic of category membership, with enhanced neural activity in response to features relevant for categorization. These findings have been extended through non-invasive neuroimaging studies in humans [22,42]. Folstein and his team trained human subjects to categorize a series of cars, counterbalancing the relevant dimension across subjects. Having learned the category, subjects performed a match-to-location task inside the fMRI scanner: They were presented with two successive stimuli and asked to indicate only whether they appeared in the same location. The researchers found changes in the activity of both the anterior fusiform gyrus and the extrastriate occipital cortex when the cars differed on the category-relevant dimension (i.e., they belonged to different categories rather than the same category). This suggests that learning a category enhances the detectability of distinguishing features not only in the temporal association cortex that is related to high level visual processing, but also in earlier stages of perception that take place in the extrastriate visual cortex [29,43].

Event Related Potentials (ERPs), with their precise temporal resolution, provide information about the time course of stimulus processing: For example, semantic and visual processes during categorization can be dissociated in their time course as well as their location [44]. In category learning, ERP changes can help distinguish perceptual effects (earlier ERP components) from post-perceptual cognitive effects (later ERP components). Previous studies have found ERP correlates of categorization and category learning [45–47], but to our knowledge none of them has explored their link with behavioural measures of CP.

The present study

To test whether category learning induces between-category separation and within-category compression (the signature of CP), subjects were trained by trial and error with corrective feedback (supervised/reinforcement learning) to sort unfamiliar visual stimuli (black and white textures) into two categories. To avoid having local, verbalizable features, the textures were designed to generate a holistic perceptual effect. Pairwise interstimulus dissimilarity judgments (between categories and within categories) as well as scalp-recorded ERPs elicited by the stimuli were compared before and after learning the categories. A neural net model for category learning and feature filtering [48] predicted that the category learning would generate between-category separation and within-category compression. This prediction was confirmed, both perceptually and physiologically, first in an exploratory study (Experiment 1) and then in an independent replication (Experiment 2). The behavioral findings have been reported in a previous paper [48]. The present paper reports the physiological correlates of these behavioral findings and identifies the ERP correlates of learned CP. An increase in the perceived difference between members of different categories, accompanied by a decrease in negativity in a perceptual component of the ERP (N1), occurs after categorization training, but only in the successful learners. Those who fail to learn the category show no change in their perception, and no change in their N1.

Experiment 1

Materials and methods

This project was approved by the Comité institutionnel d'éthique de la recherche avec des êtres humains, Université du Québec à Montréal, approval number 803_e_2017.

Subjects

Forty-two right-handed subjects (20 female, 22 male) aged 18–35 years were recruited online through Kijiji and the McGill Classified Ads website. They were either native English-speakers or native French-speakers and free of significant neurological or psychiatric conditions. Each subject was assigned randomly to one of four levels of difficulty as described below. All subjects gave written consent.

Stimulus generation

To design a categorization task with unfamiliar stimuli and features that were distributed rather than local a large set of 270 x 270 pixel black and white square-shaped textures was generated (examples in Fig 1).

Fig 1 — **Above (1a):** the six pairs of binary features used to generate the two texture categories: “Kalamites” (Ks) and “Lakamites” (Ls). **Below (1b)** Left: sample of 4 Kalamites and 4 Lakamites at the easiest level (6/6, in which all six features covaried with category membership) Right: 4 Kalamites and 4 Lakamites at the hardest level (3/6, in which only three of the six features covaried with membership; the non-covarying pairs varied randomly).

The building blocks of the textures were twelve 6 x 6 squares, each consisting of 18 black and 18 white pixels arranged in different patterns. These 12 squares were then paired arbitrarily, thus providing 6 pairs of mutually exclusive binary (0/1) micro-features (Fig 1). For simplicity, we will henceforth refer to these squares as “features”. Each individual texture was thus built out of 900 features, 30 along the width dimension and 30 along the height dimension, their spatial positions distributed randomly. From left to right and top to bottom, a feature was added at random with replacement from the set of 6 features (all of the 6 features were equally represented in each stimulus) until an image of 30 x 30 features (180 x 180 pixels) was generated. The resulting 180 x 180 grid was then resized to 270 x 270 pixels using PIL.Image (ANTIALIAS) in Python 2.7, a high-quality filter based on convolution.
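The construction described above can be illustrated with a minimal sketch. It is not the authors' generation script: the tile patterns are random placeholders for the twelve squares of Fig 1a, the names `make_tile`, `make_feature_pairs` and `make_texture` are hypothetical, and the sketch assumes that each non-covarying pair takes one randomly chosen value per texture. It stops at the 180 x 180 grid; the final resizing step would need an image library.

```python
import random

TILE = 6      # each micro-feature is a 6 x 6 pixel square
GRID = 30     # 30 x 30 features per texture (900 in total)
N_PAIRS = 6   # six pairs of mutually exclusive binary features


def make_tile(rng):
    """Hypothetical 6 x 6 pattern with exactly 18 black (0) and 18 white (1) pixels."""
    pixels = [0] * 18 + [1] * 18
    rng.shuffle(pixels)
    return [pixels[r * TILE:(r + 1) * TILE] for r in range(TILE)]


def make_feature_pairs(rng):
    """Six (value-0 tile, value-1 tile) pairs standing in for the patterns of Fig 1a."""
    return [(make_tile(rng), make_tile(rng)) for _ in range(N_PAIRS)]


def make_texture(category, n_covarying, pairs, rng):
    """Return (image, values): a 180 x 180 pixel grid and the value used for each pair."""
    cat_value = 0 if category == "K" else 1
    # The first n_covarying pairs follow the category (0 for K, 1 for L).
    # Assumption: each remaining pair takes one random value per texture.
    values = [cat_value if i < n_covarying else rng.randint(0, 1)
              for i in range(N_PAIRS)]
    # Every feature is equally represented: 900 / 6 = 150 copies each,
    # placed at random positions in the 30 x 30 grid.
    per_feature = GRID * GRID // N_PAIRS
    slots = [i for i in range(N_PAIRS) for _ in range(per_feature)]
    rng.shuffle(slots)
    size = GRID * TILE
    image = [[0] * size for _ in range(size)]
    for idx, pair_idx in enumerate(slots):
        tile = pairs[pair_idx][values[pair_idx]]
        top, left = (idx // GRID) * TILE, (idx % GRID) * TILE
        for r in range(TILE):
            image[top + r][left:left + TILE] = tile[r]
    return image, values
```

Calling `make_texture("K", 3, pairs, random.Random(0))` would give one texture in which only three of the six pairs covary with category membership.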

The stimuli were designed to produce four “a-priori” levels of difficulty. At the easiest level, all 6 binary features covaried with category membership: the 0-value of each binary pair occurred in every member of the K category (KALAMITES) and the 1-value of each pair occurred in every member of the L category (LAKAMITES). Our a-priori assumption was that stimuli in which all the features covaried with category membership would be the easiest to learn to categorize and that difficulty would increase as the proportion of covarying (relevant) features decreased and the proportion of non-covarying (irrelevant) features increased. The four levels of difficulty tested ranged from 6/6 co-variants (easiest), to 5/6, 4/6 and 3/6 (hardest). The non-covarying features varied randomly at each level, independent of category membership. Each set consisted of 180 different texture images, each of them presented two to three times for a total of 400 trials. Stimuli were generated using the PsychoPy2 open source software [49]. Although the proportion of covariant features (k/6) decreased at each difficulty level, only one arbitrary combination of k features was tested at each level, not every possible combination of k features: For example, all subjects trained at level 3/6 viewed stimuli with the very same three (arbitrarily chosen) covariant features (Fig 1).

Procedure

The experiment was conducted in a sound-isolated chamber with dim lighting and no other sources of electromagnetic interference. Subjects were seated in a comfortable armchair in front of a glass window through which they saw the computer screen presenting the task. A keyboard was placed on a table between them and the window so that they could press the K and L keys. Sixty-four electrode channels were used to record whole-head EEG data through the Biosemi ActiveTwo amplifier. The task was built and presented using the PsychoPy2 psychology open source software [49].

Task

In this first experiment, the standard reinforcement learning task consisted of trial and error with corrective feedback. The training session lasted about forty minutes (pauses included). Subjects had to learn to categorize each texture as either a “KALAMITE” or a “LAKAMITE”. Each set included one-hundred and eighty textures generated as described above. Subjects saw a total of four hundred textures (each stimulus appearing 2–3 different times during the task).

Each trial consisted of a fixation cross (500 ms) followed by one of the stimuli, shown at the center of the screen against a white background (1.25 s). Subjects were instructed to press K or L to indicate the category. They had to respond within 2 s of the onset of the stimulus; if they did not, the computer prompted them to respond faster. Responses were followed by immediate feedback (lasting 750 ms) indicating whether the response had been correct or incorrect. The inter-trial interval was 2500 ms.

The 400 trials were divided into four blocks of a hundred stimuli each. Following each block, there was a pause in which subjects had to fill out a questionnaire asking whether they thought they had detected the difference between the KALAMITES and LAKAMITES. If they replied “yes”, they were asked to describe what they were doing to categorize the stimuli. If they replied “no”, they were asked to describe the provisional strategy they were using to try to sort them. Instructions and questionnaires were in English or French depending on the subjects’ native language. Both responses and reaction times were recorded during the task.

Learning assessment

The learning curves for all subjects were analyzed to determine which subjects had learned and which had not. The percentage correct in each successive 20-trial run was calculated. Our criterion for successful learning was to reach 80% correct at least 60 trials before the end of the 400 training trials and to maintain it until the end. The point at which a subject reached the criterion was treated as the “learning point” (see Fig 2, left). The subjects were accordingly divided into “Successful Learners” and “Non-Learners” (right). However, at the higher difficulty levels, some subjects showed an unexpected learning pattern (middle): they reached 80% but then fell below it and kept rising above and below 80% till the end. These subjects were classified as “Borderlines” because they did not show a “Non-Learner” pattern (percent correct remaining around chance, 50%), but they did not maintain our 80% criterion either.

Fig 2 — From left to right: (a) Successful Learner, (b) “Borderline” and (c) Non-Learner. The red line corresponds to 80% correct. For the successful subjects, the point where they reach the 80% red line (if they stay above it thenceforward) is considered the “learning point,” which then serves as a basis for splitting our EEG data for the before-after comparison.
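The learning criterion can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: runs are taken as non-overlapping 20-trial windows, the function names are hypothetical, and the Borderline rule (any run at or above 80% without sustaining it to the end) is a simplification of the pattern described for Fig 2.

```python
def run_accuracies(correct, run_len=20):
    """Proportion correct in successive non-overlapping runs of run_len trials."""
    return [sum(correct[i:i + run_len]) / run_len
            for i in range(0, len(correct) - run_len + 1, run_len)]


def classify_subject(correct, criterion=0.8, run_len=20, min_sustained_trials=60):
    """Return (label, learning_point), with learning_point a 0-based trial index or None."""
    accs = run_accuracies(correct, run_len)
    # Walk back from the last run while accuracy stays at or above criterion.
    k = len(accs)
    while k > 0 and accs[k - 1] >= criterion:
        k -= 1
    sustained_trials = (len(accs) - k) * run_len
    if sustained_trials >= min_sustained_trials:
        return "Learner", k * run_len
    # Assumed rule: reaching the criterion at least once without sustaining it.
    if any(a >= criterion for a in accs):
        return "Borderline", None
    return "Non-Learner", None
```

For a Learner, the returned trial index is the point at which the EEG trials would be split into before-learning and after-learning sets.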

To estimate degree of difficulty, we analyzed the number of trials required to reach the criterion and the percentage of Learners and Non-Learners for each set, assuming that with greater difficulty it would take more trials to reach criterion and fewer subjects would succeed in reaching and sustaining it.

EEG Acquisition

A Biosemi 64-electrode international reference cap was placed on the subjects’ heads according to head circumference; electrodes were connected to the cap using a column of conductive gel to fill the gap between the skin and the electrodes. Six facial electrodes were placed at the common reference sites: on the two earlobes, above and below the right eye to record the VEOG (vertical electrooculogram), and directly to the side of the left eye and of the right eye to record the HEOG (horizontal electrooculogram). The signals were received by a Biosemi ActiveTwo amplifier at a sampling rate of 2048 Hz with a band pass of 0.01–70 Hz. Impedance of all electrodes was kept below 5 kΩ. Data collection was time-locked to time point zero at the onset of visual stimulus presentation.

EEG data analysis

EEGLAB 13.4.4b open source software [50] was used to process raw EEG files via the following steps: (a) The data were down-sampled to 500 Hz to decrease computational requirements. (b) A low pass (100 Hz) filter, high pass (3 Hz) filter and notch filter (60 Hz) were then applied. (c) Bad channels were identified by EEGLAB and were then interpolated. (d) The electrodes were re-referenced to a virtual average reference including all head electrodes but excluding the facial ones. (e) The data were divided into 3000 ms epochs spanning from -1000 to 2000 ms around time zero. (f) The baseline was corrected based on the 200 ms before each stimulus onset. (g) The EEGLAB function runica [51] was used to identify independent components. (h) The first 10 components were visually inspected, and components associated with blinks and eye movements according to topography and power spectrum were rejected. (i) The data were then separated into two parts: before learning and after learning for the Learners, or first half and second half for the other subjects. (j) Noisy epochs were rejected using an extreme value filter (+/- 100 μV) and then a probability filter with a 2 standard deviation limit for single channels and a 6 standard deviation limit across channels.
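Two of the steps above, baseline correction (f) and the extreme value filter in (j), are simple enough to illustrate in plain Python. This is only a sketch of the arithmetic, not the EEGLAB implementation: filtering, ICA and the probability filter are omitted, the function names are hypothetical, and an epoch is assumed to be a list of channels, each a list of voltages in μV starting 1000 ms before stimulus onset.

```python
def baseline_correct(epoch, srate=500, epoch_start_ms=-1000, baseline_ms=200):
    """Subtract each channel's mean over the baseline_ms preceding stimulus onset."""
    onset = (-epoch_start_ms) * srate // 1000   # sample index of time zero
    n_base = baseline_ms * srate // 1000        # number of baseline samples
    corrected = []
    for chan in epoch:
        base = sum(chan[onset - n_base:onset]) / n_base
        corrected.append([v - base for v in chan])
    return corrected


def exceeds_threshold(epoch, limit_uv=100.0):
    """True if any sample on any channel lies outside +/- limit_uv."""
    return any(abs(v) > limit_uv for chan in epoch for v in chan)


def clean_epochs(epochs, srate=500, epoch_start_ms=-1000,
                 baseline_ms=200, limit_uv=100.0):
    """Baseline-correct every epoch, then drop those with extreme values."""
    corrected = [baseline_correct(e, srate, epoch_start_ms, baseline_ms)
                 for e in epochs]
    return [e for e in corrected if not exceeds_threshold(e, limit_uv)]
```

At 500 Hz, time zero falls at sample 500 of each 1500-sample epoch and the baseline covers the 100 samples before it.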

For the ERP analysis, Successful Learners’ data were divided into two segments, based on the point when the subject reached (and sustained) our 80% learning criterion, as illustrated in Fig 2. The average ERP waveform elicited by the stimuli for the trials before and after this point was compared. For Non-Learners the datasets were divided in half and the first half of the trials was compared to the second half to control for ERP effects that were not due to learning to categorize (i.e., mere exposure/repetition effects). Once the datasets were split, grand averages for comparisons within subjects (before vs. after learning or first half vs. last half trials) and between subjects (Learners vs. Non-Learners) were computed. Limitations of this approach are considered in the Discussion section. (Reported in S1 File is an alternative approach in which we subdivided both Successful Learners’ and Non-Learners’ trials block by block.)

ERPs from -200 to 1100 ms were plotted around time zero. Statistical analyses were conducted using the EEGLAB software (parametric statistics, p<0.05, with Bonferroni correction). After identifying our Regions of Interest and significant time windows (see below), mean ERP voltages were measured in time windows centered on the peak of each component of interest. Amplitude (mean voltage) differences within subjects were assessed with Student’s t-tests; effect sizes were calculated using Cohen’s d and differences between subjects were assessed with repeated-measures ANOVA, all using the IBM SPSS 23 Statistical Software. Scalp distributions were plotted for each condition in the time-windows of interest.
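As an illustration of the within-subject comparison, the sketch below computes a paired t statistic and Cohen's d from before and after amplitudes. It assumes that d is the mean of the paired differences divided by their standard deviation, which gives d = t/√n; the text does not state which variant SPSS was asked to report, so this is an assumption, and the function name is hypothetical.

```python
import math


def paired_t_and_d(before, after):
    """Return (t, cohen_d, df) for paired samples, using after-minus-before differences."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Sample standard deviation of the differences (n - 1 in the denominator).
    sd = math.sqrt(sum((x - mean) ** 2 for x in diffs) / (n - 1))
    t = mean / (sd / math.sqrt(n))
    d = mean / sd   # assumed definition: standardized mean difference of the pairs
    return t, d, n - 1
```

Under this definition, a t of about 3.4 with 25 subjects corresponds to a d of about 0.68.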

Results

Analysis of learning

Forty-two subjects (aged 19–34, 22 male, 20 female) completed the visual category-learning task; each was assigned to one of our four difficulty levels (Table 1). Overall, 28 of the 42 subjects successfully attained our a-priori criterion (reaching and maintaining at least 80% correct for at least the last 60 trials); four additional subjects were classed as Borderlines. The remaining ten subjects did not reach the learning criterion throughout the task and were classed as Non-Learners.

Table 1. Outcome profile for each a-priori difficulty level in Experiment 1.

A-priori difficulty	Covarying features	Learners	Borderline	Non-Learners	Trials to learn: mean (SE)	Accuracy: mean (SE)
1	6/6	7	1	3	140 (20)	79% (4.1)
2	5/6	8	0	3	194 (41)	71% (4.7)
3	4/6	5	3	2	278 (61)	65% (3.04)
4	3/6	8	0	2	138 (20)	82% (3.7)
Total		28	4	10	173 (18)	74% (2.2)

The number of trials it took to learn to categorize as well as the overall accuracy throughout the categorization task were examined for each difficulty level (Table 1). A one-way ANOVA with difficulty as between-subjects factor revealed that the number of trials it took to learn did not differ significantly between difficulties (F(3,40) = 0.840, p = 0.481; linear contrast F(1,40) = 0.006, p = 0.940), while the mean accuracy did (F(3,40) = 5.576, p = 0.003; linear contrast, F(1,40) = 0.072, p = 0.789). A Tukey HSD post-hoc analysis of the accuracy between difficulties revealed the only significant difference was between level 2 (5/6) and level 3 (4/6): mean difference = 16.09%, p = 0.013. A detailed analysis of the difficulty assessment has already been reported in a previous paper (Pérez-Gay et al., 2017).
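For readers unfamiliar with the test, the F ratio of a one-way between-subjects ANOVA can be sketched as below. This is the textbook computation, not the SPSS procedure used for the reported values, and the function name is hypothetical.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: group size times squared deviation of its mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-groups sum of squares: squared deviations from each group's own mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within
```

Here each group would hold the per-subject accuracy (or trials to criterion) for one difficulty level.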

Table 2. Number of learners and number of trials before reaching the learning criterion for easy and hard level.

Level	Immediate Learners	Successful Learners	Borderlines	Non-Learners	Trials to learn: Mean (SE)	Mean accuracy (SE)
Easier (5/6)	6	10	0	5	106 (33)	62% (3.17)
Harder (4/6)	0	8	2	10	262 (32)	81% (3.09)
Total	6	18	2	15	175 (29)	73% (2.66)

Repeated-measures ANOVAs tested how Reaction Times and Response Accuracy changed across the four successive blocks. An interaction between block and Learning group showed that reaction times and accuracy across blocks were significantly different between Learners, Non-Learners and Borderlines (Accuracy: Wilks’ Lambda = 0.560, F(6,72) = 4.041, p = 0.02, η2 = 0.252; reaction times: Wilks’ Lambda = 0.622, F(6,74) = 3.212, p = 0.008, η2 = 0.221).

For Learners, response accuracy increased linearly (Wilks’ Lambda = 0.083, F(3,24) = 88.420, p<0.01, η2 = 0.917; linear contrast, F(1,26) = 170.022, p<0.001, η2 = 0.867) and reaction times decreased linearly (Wilks’ Lambda = 0.359, F(3,24) = 14.307, p<0.001, η2 = 0.641; linear contrast, F(1,26) = 29.166, p<0.001, η2 = 0.529). This pattern was absent in the Non-Learners, whose accuracy did not change significantly across blocks (Wilks’ Lambda = 0.710, F(3,8) = 1.091, p = 0.407, η2 = 0.290), and whose reaction times changed, but not linearly (Wilks’ Lambda = 0.318, F(3,8) = 5.714, p = 0.022, η2 = 0.682; linear contrast, F(1,10) = 0.871, p = 0.373, η2 = 0.080).

ERP results

In this first experiment, our goal was to explore the changes in early and late ERP components throughout the category learning task. Grand average ERPs were computed, combining the data from the four difficulty levels so as to have enough Learners and Non-Learners for comparison. As explained in the methods section, for the within-subjects analysis the data of the Learners were divided into trials before and after learning. The threshold and statistical methods described in the EEG data analysis section resulted in the rejection of an average of about 9 epochs (6%) from the before-learning trials (range: 2 to 26 trials, 1–10%) and about 16 epochs (7%) from the after-learning trials (range: 1 to 38 trials, 2–12%) per subject.

For the Non-Learners, the first and second halves of the trials were compared to rule out effects of repeated exposure. An average of about 14 epochs (7%) of the first half of the trials (range: 6 to 19 trials, 3–10%) and about 12 epochs (6%) of the second half of the trials (range: 9 to 22, 4–6%) per subject were rejected. Four subjects (3 Learners and 1 Non-Learner) were also excluded for artifacts in more than 20% of the trials per condition or overall noisy recordings.

The midline electrodes Fz, Cz, Pz and Oz were examined visually as a first approach to assessing changes in ERP components after training (Fig 3). These plots revealed significant effects in an occipital N1 component and in a parietal Late Positivity, two components that have been reported as involved in category learning [47]. Mean voltages were extracted for these components in windows previously described in the literature (150–220 ms for the N1 [47,52–54] and 600–800 ms for the LPC [47,55,56]). Scalp distributions for after-minus-before difference waves (Fig 4) were then plotted.

Fig 4 — Topological heatmaps of after-minus-before difference waves in the N1 and LPC windows (Learners, upper row, Non-Learners, lower row). The vertical bar shows average voltage change.
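The mean-voltage extraction for a component window can be sketched as follows. The data layout is an assumption made for illustration: an ERP is a dict mapping hypothetical channel labels to lists of voltages sampled at `srate` Hz, starting at `epoch_start_ms` relative to stimulus onset.

```python
def cluster_mean_amplitude(erp, cluster, srate, epoch_start_ms, window_ms):
    """Mean voltage over a time window, averaged across the channels in cluster."""
    win_start_ms, win_end_ms = window_ms
    # Convert window limits (ms relative to onset) into sample indices.
    start = (win_start_ms - epoch_start_ms) * srate // 1000
    end = (win_end_ms - epoch_start_ms) * srate // 1000
    per_channel = [sum(erp[ch][start:end]) / (end - start) for ch in cluster]
    return sum(per_channel) / len(per_channel)


def after_minus_before(erp_before, erp_after, cluster, srate,
                       epoch_start_ms, window_ms):
    """Amplitude change in the window; positive means a less negative or more positive ERP."""
    args = (cluster, srate, epoch_start_ms, window_ms)
    return (cluster_mean_amplitude(erp_after, *args)
            - cluster_mean_amplitude(erp_before, *args))
```

With the windows given above, `window_ms` would be `(150, 220)` for the N1 and `(600, 800)` for the LPC.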

Effect of Learning on the occipital N1 (first negative peak). Mean voltages (amplitudes) in the N1 window were extracted for the before-learning (first half) and after-learning (second half) conditions in a cluster of five occipital electrodes (Iz, Oz, O1, O2, POz) and for each individual electrode in the cluster. A two-way mixed ANOVA with time (before reaching criterion vs. after reaching criterion [for Learners] or first half vs. second half [for Non-Learners]) as a within-subject factor and group (Learners vs. Non-Learners) as a between-subject factor failed to show a significant interaction between learning to categorize and changes in the N1 amplitude in the occipital cluster (F(1,32) = 1.605, p = 0.202, η2 = 0.026).

Despite the absence of a significant interaction there were significant simple before-after effects within subjects for Learners only. These were noteworthy given the effect sizes. There was a statistically significant decrease in N1 negativity from before to after learning in our occipital cluster [mean change = 0.625, t(24) = 3.406, p = 0.002, Cohen’s d = 0.683] for Learners but not for Non-Learners [mean change = 0.1516, t(8) = 0.514, p = 0.621, Cohen’s d = 0.171]. (These effects were subsequently replicated independently and emerged statistically significant in Experiment 2).

Effect of learning on the parietal LPC (late positive component). The mean voltage (amplitude) in the LPC window before learning (first half) and after learning (second half) was extracted from a cluster of eight parietal electrodes (Pz, P1, P2, P3, P4, POz, PO3, PO4), and for each individual electrode in the cluster. A two-way mixed ANOVA with time (before/first half vs after/last half) as a within-subject factor and group (Learners vs. Non-Learners) as a between-subject factor showed a near-significant interaction between learning to categorize and changes in the LPC amplitude in the parietal cluster (F(1,32) = 3.867, p = 0.058, η2 = 0.108).

Post-hoc tests revealed a statistically significant increase in LPC positivity from before to after learning in the chosen cluster [mean change = 0.5325, t(24) = 3.560, p = 0.002, Cohen’s d = 0.712] for Learners but not for Non-Learners [mean change = 0.0205, t(8) = 0.18, p = 0.861, Cohen’s d = 0.060].

Continuous correlational analysis. To complement our discrete analysis (Learner vs. Non-Learner based on an 80% criterion) a continuous correlational analysis treated learning as a matter of degree instead of as all-or-none. Learners and Non-Learners were combined to measure the correlation between their performance (measured as percent correct in categorization on the last learning block) and the ERP changes (after-minus-before differences in N1 and LPC amplitudes).

Spearman’s rank-order correlations between the size of the N1 change in the occipital cluster and accuracy in the last block of 100 trials were positive and significant (rho(34) = 0.359, p = 0.034, Fisher z = 0.3769, 95% CI = 0.024 to 0.622): The better the performance, the smaller the N1 amplitude after learning. The correlation between the change in LPC in the parietal cluster and the accuracy in the last block was also positive and significant (rho(34) = 0.366, p = 0.025, Fisher z = 0.3884, 95% CI = -0.012 to 0.578): The higher the performance, the bigger the LPC amplitude after learning.
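A minimal sketch of this kind of analysis is given below: Spearman's rho computed as the Pearson correlation of average ranks, followed by a confidence interval from the Fisher z transform with standard error 1/√(n − 3). The function names are hypothetical and the interval method is an assumption, since the text does not say how the reported intervals were obtained. For rho = 0.36 and n = 34 it gives roughly 0.025 to 0.622.

```python
import math


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def fisher_ci(rho, n, z_crit=1.96):
    """Return (z, lower, upper) using the Fisher z transform of rho."""
    z = math.atanh(rho)
    se = 1 / math.sqrt(n - 3)
    return z, math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)
```

In this setting `x` would hold each subject's last-block accuracy and `y` the after-minus-before amplitude change.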

Experiment 2

Experiment 1 revealed two noteworthy ERP effects induced by learning to categorize novel visual textures: a decrease in negativity of an early (perceptual) component, N1, and an increase in positivity in a late (memory-related) component, the LPC. The data were analyzed in two complementary ways: (i) by partitioning our sample into two populations, Learners and Non-Learners, using a discrete performance threshold of 80% correct and (ii) by treating learning as continuous, combining all subjects and testing the correlation of each of the two ERP components of interest with their learning performance (percent correct in the last learning block). To replicate and build on this initial outcome from exploratory Experiment 1, we did a second experiment using a new independent sample, testing the previous findings as a-priori predictions.