Abstract
In two experiments, electric brain waves of 14 subjects were recorded under several different conditions to study the invariance of brain-wave representations of simple patches of colors and simple visual shapes and their names, the words blue, circle, etc. As in our earlier work, the analysis consisted of averaging over trials to create prototypes and test samples, to both of which Fourier transforms were applied, followed by filtering and an inverse transformation to the time domain. A least-squares criterion of fit between prototypes and test samples was used for classification. The most significant results were these. By averaging over different subjects, as well as trials, we created prototypes from brain waves evoked by simple visual images and test samples from brain waves evoked by auditory or visual words naming the visual images. We correctly recognized from 60% to 75% of the test-sample brain waves. The general conclusion is that simple shapes such as circles and single-color displays generate brain waves surprisingly similar to those generated by their verbal names. These results, taken together with extensive psychological studies of auditory and visual memory, strongly support the solution proposed for visual shapes, by Bishop Berkeley and David Hume in the 18th century, to the long-standing problem of how the mind represents simple abstract ideas.
In earlier work, we have reported on brain-wave representations of language. Initially we concentrated on correctly recognizing a single word being processed in the cortex (1). We next focused on brain-wave recognition of sentences (2). Most recently, we extended this work to a larger set of 48 sentences, presented as either spoken or printed text. The important finding was that the brain-wave recognition rate was notably improved by averaging over subjects as well as trials (3). The results provide surprisingly strong evidence of the invariance between subjects of brain-wave representations of language as first processed upon reaching the cortex. The brain-wave representations we have studied are based on electroencephalographic (EEG) recordings of electrical activity in the cortex. A review of related research is given in the references cited.
Using the methods of analysis developed in our earlier work, the present study reports the findings of two new experiments focused on the brain-wave representation of simple visual images and their names. The images are patches of color or familiar shapes such as circles and squares. We analyze the representations of the images and words separately, but our main focus is on the comparison of the brain waves representing images with those representing the names of the images. The results support, in a quite direct way, the solution proposed by Bishop Berkeley and David Hume to a long-standing controversy, dating from the 18th century, over how the mind represents simple abstract ideas.
Methods
For all subjects, electroencephalographic recordings were made in our laboratory using 15 or 22 Grass Model 12 amplifiers (Grass Instruments, Quincy, MA) and Scan 4 software (Neuroscan, Sterling, VA). Sensors were attached to the scalp of a subject according to the standard 10–20 EEG system, either as bipolar pairs, with the recorded measurement in millivolts being the potential difference between the two sensors of a pair, or as single sensors referenced to the left or right mastoid. For both experiments, the recording bandwidth was 0.3 to 100 Hz, with a sampling rate of 1,000 Hz. The length of recording of individual trials varied with the experiments, as described below. A computer was used to present auditory stimuli (speech digitized at 22 kHz) to subjects via small loudspeakers. Visual stimuli were presented on a standard computer screen.
Fourteen subjects were used in the experiments. We numbered the subjects consecutively with those used in refs. 1–3, because we continue to apply new methods of analysis to our earlier data. Subjects S10–19 participated in experiment I, which took place in January 1999; S10–14, 16, 19, 25, and 28–30 participated in experiment II, which took place in June and July 1999. S29 participated in two sessions on different days; in the later analysis of experiment II, each session is counted as a subject, S29.1 and S29.2. Nine of the subjects were female and five were male, ranging in age from 23 to 54 years. One was left-handed, one was ambidextrous, and three were not native English speakers.
In experiment I, S10–15 and S19 had the following 16 unipolar sensors attached to the scalp: Fp1, Fp2, F7, F3, F4, F8, T3, C3, Cz, C4, T4, T5, P3, Pz, P4, and T6. S16–18 had 22 bipolar pairs: Cz-Fz, Cz-F4, Cz-C4, Cz-P4, Cz-Pz, Cz-P3, Cz-C3, Cz-F3, Fz-Fp2, Fz-Fp1, F4-Fp2, F4-F8, C4-F8, C4-T4, C4-T6, P4-T6, P3-T5, C3-T5, C3-T3, C3-F7, F3-F7, and F3-Fp1. In experiment II, all subjects had the following 15 bipolar pairs attached to the scalp: Cz-C4, Cz-P4, Cz-Pz, Cz-P3, Cz-C3, F4-Fp2, F4-F8, C4-F8, C4-T4, C4-T6, P4-T6, P3-T5, C3-T5, C3-T3, and F3-F7.
Using the methods of refs. 1–3, we averaged half of the trials to form prototypes and the other half to form test samples, applied a fast Fourier transform to both, filtered, and applied an inverse transform back to the time domain; we then estimated four parameters for each subject in each of the conditions of the two experiments. First, we estimated the low frequency and the high frequency of the optimal bandpass filter (optimal defined, as in ref. 1, in terms of correct recognition rate). Second, we estimated, again for the best recognition rate, the starting point (s) after the onset of the stimulus and the ending point (e), in ms, of the sample sequence of observations used for recognition, with the same s and e for a given set of stimuli to be recognized. The parameters s and e are omitted from the tables of results of experiment I, because the recognition-rate gradients were often too flat to make the selection of s or e anything but arbitrary within a couple of hundred ms. Some detailed results for s and e are given for experiment II. Some typical recognition-rate surfaces are shown in refs. 2 and 3.
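For concreteness, the following minimal sketch, written in Python with NumPy rather than the software actually used in our analyses, illustrates the recognition procedure just described. The array names, shapes, and helper functions are illustrative assumptions, not our analysis code; the filter limits and the window (s, e) reported in the tables are those that maximize the recognition rate returned by such a procedure.

```python
# Minimal sketch of the recognition procedure (illustrative only; variable
# names and array shapes are assumptions). `trials` holds one sensor's data,
# shape (n_stimuli, n_trials, n_samples), sampled at 1,000 Hz.
import numpy as np

FS = 1000  # sampling rate, Hz

def bandpass_fft(x, low_hz, high_hz, fs=FS):
    """Zero all Fourier components outside [low_hz, high_hz], then invert."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.shape[-1], d=1.0 / fs)
    spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0
    return np.fft.irfft(spectrum, n=x.shape[-1])

def recognition_rate(prototypes, test_samples, low_hz, high_hz, s_ms, e_ms):
    """Classify each test sample as the prototype with the smallest
    least-squares difference over the window [s_ms, e_ms]."""
    s, e = int(s_ms * FS / 1000), int(e_ms * FS / 1000)
    P = np.array([bandpass_fft(p, low_hz, high_hz)[s:e] for p in prototypes])
    hits = 0
    for i, t in enumerate(test_samples):
        t_f = bandpass_fft(t, low_hz, high_hz)[s:e]
        d = ((P - t_f) ** 2).sum(axis=1)   # least-squares criterion of fit
        hits += int(np.argmin(d) == i)     # correct if best fit is own prototype
    return hits / len(test_samples)

# Prototypes and test samples are averages over disjoint halves of the trials:
# prototypes   = trials[:, ::2, :].mean(axis=1)
# test_samples = trials[:, 1::2, :].mean(axis=1)
```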
In both experiments, we followed the methodology of our earlier article (3) and averaged brain waves over subjects as well as trials, which in many cases, but not all, gave better recognition results. The notation is as follows: AvgUS, averaged over unipolar subjects; TypUS, typical unipolar subject (an artificial subject made up of two trials from each of the best five individual unipolar subjects); AvgBS, averaged over bipolar subjects; AvgS, averaged over all subjects; SepAS, separately averaged subjects for prototypes and test samples; and TypS, typical subject (made up of two trials from each of the best five individual subjects).
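A brief sketch of these averaging schemes, under the same illustrative assumptions as the code above (hypothetical names, not our analysis software), may help fix the notation.

```python
# Illustrative sketch of the averaging schemes; `data[s]` holds subject s's
# trials for one sensor, shape (n_stimuli, n_trials, n_samples).
import numpy as np

def average_subjects(data, subjects):
    """AvgUS/AvgBS/AvgS-style average: each subject's trials are averaged per
    stimulus, and the per-subject averages are then averaged together."""
    return np.mean([data[s].mean(axis=1) for s in subjects], axis=0)

def typical_subject(data, best_five, k=2):
    """TypUS/TypS-style artificial subject: pool k trials per stimulus from
    each of the best five subjects; the pooled set is then analyzed like an
    individual subject (half averaged for prototypes, half for test samples)."""
    return np.concatenate([data[s][:, :k, :] for s in best_five], axis=1)

# SepAS: one disjoint group of subjects is averaged for the prototypes and the
# other group for the test samples, e.g.
# prototypes   = average_subjects(data, group_1)
# test_samples = average_subjects(data, group_2)
```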
Experiment I: Visual Shapes and Their Names
Procedures.
In experiment I, 10 stimuli in each of four conditions were presented to subjects with the same interstimulus interval of 1,550 ms. Each stimulus was presented 10 times in random blocks of 10 trials, each block containing all 10 stimuli. In the two auditory conditions, one with a female voice and one with a male voice, the 10 stimulus words were: circle, square, line, arrow, dog, man, fish, cube, face, star. The duration of each auditory stimulus was about 400 ms. Words spoken by the two speakers were randomized together, for a total of 20 blocks. In the visual-word condition the same 10 words were presented visually for 500 ms on a computer screen. In the visual-image condition, the stimuli were stick drawings on the computer screen representing the 10 words and also were presented visually for 500 ms. The order of presentation to all the subjects was the same, first 200 auditory-word trials, then 100 visual-word trials, and finally 100 visual-image trials.
Results.
The results for the four conditions are shown in Table 1. The first column of data for each condition in the table shows the recognition rate achieved, expressed in percent. The best EEG sensor, or bipolar pair of sensors, is shown in the second column of data for each condition, and in the third column the optimal bandpass filter in Hz. We note first that the recognition rate of 100%, the highest achieved in this experiment, was for averaged data (AvgUS, S10–15, S19) in the visual-image condition where subjects saw stick drawings of 10 familiar objects. In the visual-word condition both averages (AvgUS and AvgBS), at 50%, were not as good as the best individual recognition rate (70%). In the auditory-word condition, using both female and male voices as stimuli, all four averages were excellent, with the best being 90% for the male voice, AvgUS. It is worth noting that the best individual result in the four conditions of this experiment was 90% (S18), also for the male voice. In the auditory (female voice), visual-word and visual-image conditions, the TypUS rates were tied and equaled the 70% rate of the best (unipolar) subjects from whom the trials were drawn. In the auditory (male voice) condition, TypUS was only 50%.
Table 1.
Subject | Auditory words, female voice: % | Sensor | Filter, Hz | Auditory words, male voice: % | Sensor | Filter, Hz | Visual words: % | Sensor | Filter, Hz | Visual images: % | Sensor | Filter, Hz |
---|---|---|---|---|---|---|---|---|---|---|---|---|
S10 | 70 | T4 | 3-10 | 70 | C4 | 3-12 | 60 | F3 | 5-20 | 60 | Cz | 6-19 |
S11 | 50 | F4 | 7-16 | 70 | T6 | 0.5-6 | 60 | F8 | 3-21 | 70 | T5, T6 | 3-7 |
S12 | 70 | F8 | 4-9 | 60 | F8 | 6-21 | 60 | F3 | 1-16 | 60 | T5 | 4-10 |
S13 | 70 | T3 | 1-11 | 70 | P4 | 2-19 | 60 | Pz | 4-13 | 70 | T6 | 1-3 |
S14 | 60 | F8 | 7-11 | 60 | T6 | 1-17 | 60 | T5 | 9-17 | 70 | T5 | 4-8 |
S15 | 50 | T6 | 5-18 | 60 | F4 | 9-18 | 70 | C4 | 5.5-17 | 60 | F4 | 6-17 |
S16 | 70 | Cz-P4 | 3-6 | 70 | Cz-Pz | 2-4 | 50 | Cz-C4 | 5-22 | 60 | P4-T6 | 3-19 |
S17 | 60 | Cz-P3 | 3-19 | 70 | C4-F8 | 5-15 | 60 | F3-Fp1 | 8-18 | 70 | Cz-Pz | 1-17 |
S18 | 70 | Cz-Fz | 4-21 | 90 | C4-T4 | 3-7 | 60 | P4-T6 | 0.5-15 | 70 | P4-T6 | 3-19 |
S19 | 60 | Pz | 4-21 | 70 | T4 | 4-9 | 60 | F7 | 2-19 | 70 | T6 | 7-22 |
TypUS | 70 | Cz | 2-10 | 50 | T3 | 4-14 | 70 | Pz | 2-4 | 70 | T3 | 3-11 |
AvgUS | 70 | P4 | 3-8 | 90 | Cz | 0.5-17 | 50 | T4 | 0.5-17 | 100 | T6 | 3-20 |
AvgBS | 70 | C3-T5 | 2-7 | 70 | Cz-Pz | 2-5 | 50 | C4-F8 | 7-24 | 70 | C3-T5 | 5-16 |
The most important result about the visual-image condition is this. We used as prototypes the average (over all unipolar subjects) for each of the 10 words in the visual-image condition to classify as test samples the average of each of the 10 words in the visual-word condition. The results were that we recognized six of the 10 test samples. We show in Fig. 1 (Upper) the filtered averaged prototype and test-sample waves from sensor T6 for circle. We show the waves for 1,000 ms after onset of stimulus. The waves for the two conditions are remarkably similar, with the timing of peaks nearly identical in the first 700 ms.
Equally surprising, using the same visual-image brain-wave prototypes, but now using as test samples the average auditory-word brain waves (female voice), we also recognized six of the 10 test samples correctly. We show in Fig. 1 (Lower) how similar the two averaged brain waves from sensor F3 for square are, one generated by the visual image of a square and the other by the spoken word square.
For a comparison of brain waves generated by the visual image of a circle, but in different laboratories, we show in Fig. 2 the averaged wave from sensor T3 along with the T3 wave generated in connection with our earlier experiments (1, 2) at the Scripps Institute of Research (La Jolla, CA). The institute data are averaged from three subjects shown a visual image of a circle. We note that the visual-image generated wave from T6 for a circle in Fig. 1 is different from that from T3 shown in Fig. 2. Such differences are common. This is why our least-squares criterion of fit is almost without exception applied only to comparison of waves recorded by the same sensor in the 10–20 system.
Experiment II: Colors, Shapes, and Their Names
Procedures.
Each experimental session consisted of four different conditions. The general instruction was displayed to the subject at the beginning of the session, and instructions specific to each condition were displayed right before that condition started. Before condition IV began, the subject also was given two representative examples of the stimuli to be presented.
Every trial in all conditions contained a pair of stimuli, presented in temporal sequence. For each stimulus in the pair, recording started 50 ms before the stimulus onset and lasted until 1,350 ms after the stimulus onset. Each stimulus itself lasted for 200 ms in the nonauditory cases and ranged from 275 ms to 421 ms for auditory stimuli. There was a 100-ms pause within each trial between the recordings of the two stimuli and another 1,100-ms pause after recording for the second stimulus, before the next trial started. After presentation of the second member of a pair, the subject used the numeric pad on the computer keyboard to respond “1” if the two stimuli in the pair were the same and “2” if they were different. Subjects were instructed that the same-different distinction was obvious and did not require a subtle perceptual discrimination. The length of each trial was 4 s in total. Interstimulus interval was 1,500 ms within a trial and 2,500 ms between onset of the second stimulus of a trial and onset of the first stimulus of the next trial. Trials were randomized within each condition. But all subjects were presented, in a given condition, with the same sequence of randomized stimuli.
The stimulus contents were four colors, blue, green, red, and yellow, and four shapes, circle, square, triangle, and line (at a 135° angle, bottom to the left).
Condition I of each session presented visual images of colors and shapes. For example, the color red was represented by a blank screen with a red background, and the square shape was represented by a white line drawing of a square displayed on the screen against a black background. Condition I consisted of 15 blocks. Each block contained the same 16 pairs, randomized in a different order in different blocks. Eight of the pairs were for colors: four pairs of same colors and four pairs of different colors (blue-yellow, green-blue, red-green, and yellow-red). The other eight pairs were for shapes: four pairs of same shapes and four pairs of different shapes (circle-line, line-square, square-triangle, and triangle-circle). The randomization was restricted so that trials alternated between pairs of colors and pairs of shapes.
Condition II of each session presented visual words and auditory words. Instead of visual images, we used auditory words (blue, etc.) to represent the colors and visual words (circle, etc.) displayed on the screen to represent the shapes. The rest of the experimental setup was the same as in condition I, except that there were only 12 blocks of 16 trials each. Because of the way we represented colors and shapes, trials within each block alternated not only between pairs of colors and pairs of shapes, but also between pairs of auditory words and pairs of visual words.
Condition III was very similar to condition II, except that auditory words were shape words and visual words were color words.
In condition IV, all pairs were auditory words, and the two words in each pair were the same word. The only possible difference between the two words in a pair was whether they were spoken by the female or the male speaker. Subjects were instructed to respond whether the voices in a pair were the same voice or two different voices. The eight pairs of same words (four colors and four shapes) were presented with each of the four possible combinations of speakers (female-female, male-male, female-male, male-female) in each block. Hence, we had six blocks of 32 trials each. As before, randomization within each block was done in such a way that trials alternated between pairs of color words and pairs of shape words.
Results.
The results for conditions I, II, and III of experiment II are shown in Table 2. The percent recognition rates shown for both visual images and words are in terms of recognizing the eight visual images, four colors and four shapes, or their visual or auditory names. As in experiment I, for each subject, half of the trials were averaged to create eight brain-wave prototypes and the other half to create eight brain-wave test samples. The results for visual images were the best. Recognition of four subjects’ brain waves was at 100%, six at 88% (one error), and two at 75% (two errors). Moreover, recognition for AvgS was at 100%, SepAS was at 88% (one error), and TypS was at 75% (two errors).
Table 2.
Subject | Visual images: % | Sensor | Filter, Hz | Time, ms | Visual words: % | Sensor | Filter, Hz | Time, ms | Auditory words (II & III): % | Sensor | Filter, Hz | Time, ms |
---|---|---|---|---|---|---|---|---|---|---|---|---|
S10 | 88 | Cz-P4 | 0.5-17 | 330-550 | 75 | F3-F7 | 10-17 | 30-900 | 88 | C4-T6 | 2-10 | 210-500 |
S11 | 75 | P4-T6 | 3-7 | 150-700 | 75 | C4-F8 | 10-19 | 390-900 | 75 | Cz-Pz | 5-18 | 390-550 |
S12 | 88 | Cz-Pz | 5-20 | 240-500 | 75 | C3-T5 | 0.5-3 | 90-500 | 88 | C3-T5 | 3-20 | 270-800 |
S13 | 88 | Cz-P4 | 0.5-13 | 90-500 | 63 | F3-F7 | 5-20 | 120-550 | 88 | C4-T4 | 3-16 | 240-650 |
S14 | 100 | Cz-Pz | 6-20 | 180-750 | 75 | C4-F8 | 7-15 | 210-1250 | 88 | C4-T6 | 6-9 | 210-1050 |
S16 | 100 | C3-T5 | 9-20 | 150-600 | 88 | C4-F8 | 0.5-4 | 30-550 | 75 | Cz-C3 | 2-19 | 270-500 |
S19 | 88 | Cz-P4 | 8-22 | 90-550 | 63 | C4-T6 | 1-16 | 90-500 | 88 | C4-T6 | 2-10 | 210-600 |
S25 | 75 | Cz-Pz | 1-12 | 210-800 | 75 | F3-F7 | 2-4 | 120-700 | 88 | C4-T6 | 3-8 | 240-750 |
S28 | 88 | Cz-P4 | 10-20 | 180-500 | 75 | Cz-P4 | 3-20 | 240-600 | 75 | C4-T4 | 5-17 | 270-500 |
S29.1 | 100 | P4-T6 | 1-14 | 210-550 | 63 | C4-T6 | 2-17 | 390-550 | 88 | C3-T5 | 3-13 | 270-600 |
S29.2 | 88 | P4-T6 | 1-14 | 210-600 | 75 | C4-T6 | 2-5 | 360-650 | 88 | C3-T5 | 3-13 | 240-500 |
S30 | 100 | Cz-P4 | 10-17 | 210-500 | 63 | Cz-C4 | 5-9 | 270-850 | 88 | P4-T6 | 3-20 | 150-500 |
AvgS | 100 | Cz-P4 | 3-20 | 210-550 | 63 | C4-T6 | 6-20 | 270-900 | 100 | C4-T6 | 2-7 | 270-650 |
SepAS | 88 | Cz-P4 | 9-20 | 30-650 | 75 | C4-T6 | 4-13 | 330-650 | 100 | C4-T6 | 3-11 | 30-550 |
TypS | 75 | P4-T6 | 3-8 | 210-500 | 63 | C4-T6 | 3-12 | 240-700 | 75 | C4-T6 | 1-8 | 30-500 |
As in experiment I, the recognition results for visual words, i.e., the names of the visual images as stimuli, were not as good, but well above chance. The highest recognition rate was 88% (for S16), 75% (two errors) for seven subjects, and 63% for four subjects. The rate for AvgS was a surprisingly low 63%, compared to 75% for SepAS and again 63% for TypS.
The results for the auditory presentation of the eight names of the visual images in conditions II and III are also shown in Table 2. The recognition results fall between those for the visual images and those for their printed, i.e., visual, names, but are close to the good results for the visual images. The recognition rate of the brain waves for nine subjects was 88% (one error per subject) and for the remaining three subjects 75% (two errors). The recognition rates for AvgS and SepAS were both 100%, and for TypS 75%. Of the three types of averaging, the 100% rate for SepAS, the averaging of separate sets of subjects for prototypes and test samples, is scientifically the most significant result for the auditory presentation. It strongly supports the invariance results across subjects reported in ref. 3.
Table 3 shows the recognition results for the brain waves generated in condition IV by the female-voice and the male-voice presentations of the eight names. The recognition results are not as good as those for the auditory presentations of conditions II and III. They are somewhat better for the male voice than for the female voice, perhaps because the male speaker's voice was also the one heard in conditions II and III. We do not review in detail the results for individual subjects, which are shown in Table 3, but we note that for both voices AvgS and SepAS were the same and at a good level of recognition, 88% (one error) in each of the four cases in the table.
Table 3.
Subject | Auditory words FVoice IV: % | Sensor | Filter, Hz | Time, ms | Auditory words MVoice IV: % | Sensor | Filter, Hz | Time, ms |
---|---|---|---|---|---|---|---|---|
S10 | 75 | P3-T5 | 3-5 | 90-550 | 75 | C3-T5 | 2-9 | 30-550 |
S11 | 63 | C4-T6 | 7-20 | 210-600 | 75 | C4-T4 | 1-13 | 360-500 |
S12 | 75 | C4-T4 | 10-14 | 60-1,250 | 63 | Cz-C3 | 10-18 | 210-1,000 |
S13 | 63 | C4-T4 | 4-10 | 60-800 | 75 | P4-T6 | 3-13 | 180-800 |
S14 | 75 | F4-Fp2 | 2-16 | 90-650 | 75 | P4-T6 | 2-6 | 30-800 |
S16 | 63 | C4-T6 | 0.5-9 | 120-1,150 | 88 | Cz-C3 | 1-6 | 360-650 |
S19 | 75 | C4-T4 | 3-11 | 300-800 | 75 | Cz-C3 | 0.5-2 | 180-500 |
S25 | 63 | F4-Fp2 | 6-20 | 270-1,100 | 75 | F4-F8 | 1-10 | 360-500 |
S28 | 63 | C4-T6 | 0.5-7 | 270-1,300 | 63 | C4-T6 | 4-10 | 180-1,200 |
S29.1 | 75 | F4-Fp2 | 6-10 | 30-1,150 | 75 | Cz-C3 | 8-13 | 270-700 |
S29.2 | 63 | C4-T6 | 0.5-8 | 150-800 | 75 | C4-T6 | 1-7 | 270-900 |
S30 | 75 | C4-T6 | 3-8 | 300-1,150 | 88 | C3-T5 | 5-19 | 150-1,250 |
AvgS | 88 | C4-T6 | 2-8 | 240-600 | 88 | C4-T6 | 4-6 | 90-500 |
SepAS | 88 | C4-T6 | 5-16 | 300-650 | 88 | P4-T6 | 2-6 | 30-1,150 |
TypS | 63 | C4-T6 | 1-5 | 180-500 | 63 | P4-T6 | 2-4 | 270-550 |
FVoice, female voice; MVoice, male voice.
We now turn to the two tables that summarize the results of experiment II most directly relevant to the focus of this article, invariant brain waves for visual images and their names. Table 4 summarizes the results when the data for all the subjects were averaged together for each condition. So, for example, the first row of Table 4 is based on the averaged EEG brain-wave data for visual images, used to form eight prototypes, and the corresponding averaged data for visual words, used to form eight test samples. Throughout Table 4, as in this example, the condition used for forming the prototypes is given first and the condition for the test samples second. As can be seen, and as would be expected, the recognition results are not quite as good as those found in Tables 2 and 3, but they are comparable to those for the visual-word condition alone in Table 2. More directly relevant is the fact that the cross-modality results are generally better than those obtained in experiment I. In particular, all of the recognition percentages in Table 4 are better than the two cross-modal rates of 60% reported for experiment I.
Table 4.
Conditions (prototype-test) | % | Sensor | Filter, Hz | Time, ms |
---|---|---|---|---|
VI-VW | 75 | P4-T6 | 0.5-6 | 450-1,250 |
VI-AW | 63 | C4-T4 | 10-34 | 410-1,000 |
VI-AWF | 75 | C4-T6 | 6-20 | 90-800 |
VI-AWM | 63 | C4-T4 | 6-20 | 340-550 |
VW-AW | 63 | C4-T6 | 7-26 | 390-775 |
VW-AWF | 75 | P4-T6 | 6-21 | 420-775 |
VW-AWM | 63 | C3-T3 | 3-20 | 450-925 |
VI, visual image; VW, visual word; AW, auditory word, conditions II and III; AWF, female voice, condition IV; AWM, male voice, condition IV.
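As an illustration only, each cross-modal row of Table 4 corresponds to a single call of the recognition sketch given in Methods, with the prototypes and test samples drawn from different conditions. The variable names below are hypothetical, and the filter and window values are those quoted in the VI-VW row of Table 4.

```python
# Hypothetical usage for the VI-VW row of Table 4 (sensor P4-T6): averaged
# visual-image waves as prototypes, averaged visual-word waves as test samples.
# `avg_visual_images` and `avg_visual_words` are assumed arrays of shape
# (8 stimuli, n_samples) built with average_subjects() over all subjects.
rate = recognition_rate(avg_visual_images, avg_visual_words,
                        low_hz=0.5, high_hz=6,   # filter limits from Table 4
                        s_ms=450, e_ms=1250)     # time window from Table 4
```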
In Table 5, we show the SepAS cross-modality results for experiment II, with six subjects used for the prototypes and a different six for the test samples. To give a complete analysis, we used the averaged brain waves of a given six subjects as prototypes, for instance from the visual-image condition (labeled VI1 in the first row of Table 5), and the averaged brain waves of the other six subjects, from the visual-word condition (labeled VW2), as test samples. The second row reverses the roles of the two groups of six subjects, so that now the "other" six subjects' EEG data from the visual-image condition, labeled VI2, form the prototypes. The 14 rows of Table 5 give the complete set of cross-modal analyses, with the prototypes and test samples always averaged (SepAS) over disjoint sets of six subjects.
Table 5.
Conditions (prototype-test) | % | Sensor | Filter, Hz | Time, ms |
---|---|---|---|---|
VI1-VW2 | 75 | F3-F7 | 9-20 | 330-825 |
VI2-VW1 | 63 | Cz-C4 | 10-22 | 440-700 |
VI1-AW2 | 75 | C4-T4 | 6-14 | 290-875 |
VI2-AW1 | 63 | C3-T3 | 7-19 | 270-700 |
VI1-AWF2 | 75 | C3-T3 | 10-23 | 90-500 |
VI2-AWF1 | 63 | P4-T6 | 6-12 | 330-800 |
VI1-AWM2 | 63 | C4-T6 | 4-10 | 370-975 |
VI2-AWM1 | 63 | C4-T6 | 6-10 | 260-950 |
VW1-AW2 | 75 | C3-T3 | 10-20 | 320-725 |
VW2-AW1 | 63 | C3-T5 | 3-20 | 450-825 |
VW1-AWF2 | 63 | F4-F8 | 2-6 | 290-525 |
VW2-AWF1 | 63 | P4-T6 | 6-20 | 310-725 |
VW1-AWM2 | 75 | C4-T4 | 2-15 | 450-500 |
VW2-AWM1 | 88 | F4-F8 | 0.5-20 | 400-725 |
VI, visual image; VW, visual word; AW, auditory word; M, male; F, female.
For the eight cases in Table 5 with visual images as prototypes, three are at a recognition rate of 75% (two errors) and the remaining five at 63% (three errors). These results are comparable to those in experiment I. In Fig. 3 we show six pairs of waves from the VI1-AW2 condition, four for the colors blue, green, red, and yellow, and two for the shapes line and triangle. Each pair consists of the average of brain waves generated by a visual image (solid line) and the average of brain waves generated by the corresponding auditory word (dotted line). Of the six pairs shown, only the pair for the image and word red was misclassified, i.e., not correctly recognized. The fits of the six pairs are not perfect, as is also the case for Figs. 1 and 2. On the other hand, and this is the point to be emphasized, the fits reflect five of six correct recognitions for the waves shown. Moreover, the quantitative least-squares measure of fit is actually lower, and therefore better, for the VI1-AW2 cross-modality case, with separate subjects for prototypes and test samples, than are the corresponding fits for the visual-image and auditory-word conditions shown in Table 2, both of which had better recognition rates than VI1-AW2. These least-squares data are summarized in Table 6. They make clear that averaging and bandpass filtering by no means eliminate all the noise or information irrelevant to the recognition of the brain waves as representations of images or words. All the same, it is surprising that, by the least-squares quantitative criterion, the cross-modal case of visual image paired with spoken word had easily the best fit. For the results shown in Table 6, the three misclassifications of brain waves are indicated by *. In the cross-modal condition VI1-AW2, the word square was recognized as the yellow image with a least-squares value of 63.2, and the word red was recognized as the triangle image with a least-squares value of 56.1. In the visual-image condition, the image of a square was misclassified as the color yellow with a least-squares value of 113.7. As these errors show, similarities and differences in oscillating brain waves do not respect traditional cognitive categories, even something as fundamental as the categories of color and shape.
Table 6.
Stimulus | Cross-modal VI1-AW2 | Auditory AW | Visual VI |
---|---|---|---|
Circle | 39.2 | 292.9 | 97.0 |
Line | 40.1 | 288.0 | 124.6 |
Triangle | 31.3 | 212.2 | 71.0 |
Square | 63.2* | 158.9 | 80.1 |
Red | 56.1* | 67.7 | 113.1 |
Green | 24.7 | 394.0 | 113.7* |
Blue | 24.8 | 260.6 | 121.3 |
Yellow | 33.9 | 63.9 | 124.8 |
VI, visual image; AW, auditory word.
*Misclassification of brain waves.
Discussion
Brain Representation of Abstract Ideas.
The controversy about how the brain or the mind represents abstract ideas, such as the general concept of a color or of a circle, square, or triangle, is older than psychology as an independent scientific discipline. Early in the 18th century, Bishop Berkeley (4) famously criticized John Locke's theory of abstract ideas (5). David Hume (ref. 6, p. 17) later summarized Berkeley's argument succinctly: "A great philosopher [Berkeley] has disputed the receiv'd opinion in this particular, and has asserted, that all general ideas are nothing but particular ones, annexed to a certain term, which gives them a more extensive signification, and makes them recall upon occasion other individuals, which are similar to them." Berkeley's views are well supported by our results. After visual display of a patch of red or of a circle, the image is represented in the cortex by the brain wave of the word red or circle within a few hundred ms of the display, and somewhat more quickly than the spoken word red or circle is represented in the cortex. To the skeptical response that we do not really know that it is the word red or circle being represented in the cortex, as opposed to the particular visual image, we respond that everything we have learned thus far about the one-dimensional temporal representation of words, presented either auditorily or visually, supports our inference, above all the spatial unidimensionality of the temporal representation used for recognition. Perhaps just as important, the filtered brain waves representing the spoken color or shape words conform closely to the brain waves of the many other words we have identified in our earlier work. However, we emphasize, as we did in ref. 3, that the invariance we are observing between brain-wave representations of visual images and words is consistent with the existence of other significant information that we have averaged and filtered out.
Related Psychological Studies.
Various related psychological studies of memory also support our conclusion. For example, when words are presented visually for immediate recall, the errors tend to be acoustic in character (7), or, if the words are stored for longer, an auditory representation is used (8). More detailed results and a survey of many relevant experiments on the primacy of the auditory representation of words in memory are to be found in ref. 9. There is much evidence that the memory of purely visual images decays quickly, almost always in less than 200 ms (10, 11), even though years of research have generated many controversial results on visual sensory memory (12, 13). On the other hand, the field seems to have reached some consensus that visual sensory memory consists of several components: components that can be masked and decay within 100–300 ms, and a limited-capacity, longer-lasting short-term memory (14–18), with some authors (12, 14, 18) attributing the limited-capacity short-term memory to verbal memory. In contrast, short-term auditory memory lasts 2–5 s (19, 20), so it is most efficient to represent simple visual images in memory by the auditory representation of their names or simple descriptions. The brain-wave experiments reported here support, in an unusually direct way, the view that this is indeed what happens, as Berkeley and Hume conjectured long ago, although for reasons other than the brevity of visual memory.
Acknowledgments
We thank David Spiegel, Department of Psychiatry, Stanford Medical School, for loan of the Grass amplifiers used in experiment I and Adrian Raine for loan of the Grass amplifiers used in experiment II. We thank Paul Dimitre for producing the three figures and Ann Gunderson for preparation of the manuscript. We also received useful comments and suggestions for revision from George Sperling.
Abbreviations
EEG, electroencephalography; AvgUS, averaged over unipolar subjects; TypUS, typical unipolar subject; AvgBS, averaged over bipolar subjects; AvgS, averaged over all subjects; SepAS, separately averaged subjects; TypS, typical subject.
References
1. Suppes P, Lu Z-L, Han B. Proc Natl Acad Sci USA. 1997;94:14965–14969.
2. Suppes P, Han B, Lu Z-L. Proc Natl Acad Sci USA. 1998;95:15861–15866.
3. Suppes P, Han B, Epelboim J, Lu Z-L. Proc Natl Acad Sci USA. 1999;96:12953–12958.
4. Berkeley G. Principles of Human Knowledge. Dublin: Jeremy Pepyat; 1710.
5. Locke J. An Essay Concerning Human Understanding. London: Thomas Basset; 1690.
6. Hume D. A Treatise on Human Nature. London: John Noon; 1739.
7. Conrad R. Br J Psychol. 1964;55:75–84.
8. Baddeley AD, Hitch G. In: Bower G, editor. Recent Advances in Learning and Motivation, Vol. III. New York: Academic; 1974. pp. 47–89.
9. Sperling G, Speelman RG. In: Norman DA, editor. Models of Human Memory. New York: Academic; 1970. pp. 151–202.
10. Phillips WA. Percept Psychophys. 1974;16:283–290.
11. Sperling G. In: Kowler E, editor. Eye Movements and Their Role in Visual and Cognitive Processes. New York: Elsevier; 1990. Chapter 7, pp. 307–351.
12. Coltheart M. Percept Psychophys. 1980;27:183–228.
13. Cowan N. Attention and Memory: An Integrated Framework. New York: Oxford Univ. Press; 1995.
14. Sperling G. Acta Psychol. 1967;27:285–293.
15. Coltheart M. In: Dodwell PC, editor. New Horizons in Psychology. Harmondsworth, U.K.: Penguin; 1972.
16. Campbell AJ, Mewhort DJK. Can J Psychol. 1980;34:134–154.
17. Irwin DE, Brown JS. Can J Psychol. 1987;41:317–338.
18. Dixon P, Di Lollo V. Can J Psychol. 1991;45:54–74.
19. Treisman A. J Verb Learn Verb Behav. 1964;3:449–459.
20. Wickelgren WA. J Math Psychol. 1969;6:13–61.