Abstract
Monkeys can easily form lasting central representations of visual and tactile stimuli, yet they seem unable to do the same with sounds. Humans, by contrast, are highly proficient in auditory long-term memory (LTM). These mnemonic differences within and between species raise the question of whether the human ability is supported in some way by speech and language, e.g., through subvocal reproduction of speech sounds and by covert verbal labeling of environmental stimuli. If so, the explanation could be that storing rapidly fluctuating acoustic signals requires assistance from the motor system, which is uniquely organized to chain-link rapid sequences. To test this hypothesis, we compared the ability of normal participants to recognize lists of stimuli that can be easily reproduced, labeled, or both (pseudowords, nonverbal sounds, and words, respectively) versus their ability to recognize a list of stimuli that can be reproduced or labeled only with great difficulty (reversed words, i.e., words played backward). Recognition scores after 5-min delays filled with articulatory-suppression tasks were relatively high (75–80% correct) for all sound types except reversed words; the latter yielded scores that were not far above chance (58% correct), even though these stimuli were discriminated nearly perfectly when presented as reversed-word pairs at short intrapair intervals. The combined results provide preliminary support for the hypothesis that participation of the oromotor system may be essential for laying down the memory of speech sounds and, indeed, that speech and auditory memory may be so critically dependent on each other that they had to coevolve.
Keywords: evolution, mimic, arcuate fasciculus
The proficiency with which monkeys perform tests of both visual and tactile recognition does not extend to auditory recognition. In vision and touch, monkeys master the rule for one-trial recognition memory extremely rapidly, within several daily sessions (1, 2); and once they have learned the rule, it can be shown that they have stimulus–retention thresholds (performance at 75% accuracy) of 10–20 min after viewing or palpating a novel stimulus for only 1–2 s (3, 4). In audition, by contrast, monkeys acquire the rule for one-trial memory exceedingly slowly, requiring a full year or two of training before they can master it, if they succeed at all; and if they do succeed, their stimulus–retention thresholds are found to extend no longer than 30–40 s after stimulus presentation (5). This marked disparity in mnemonic ability across sensory modalities suggests that, in audition alone, monkeys seem unable to store stimulus representations in long-term memory (LTM) and, consequently, appear to be limited mnemonically to the time period covered by short-term memory. Humans, on the other hand, are highly proficient at storing lasting representations of auditory stimuli, such as words and tunes, thereby enabling their later recognition. What accounts for these striking mnemonic differences between audition and other sensory modalities in the monkey, and between audition in the monkey and audition in humans?
In the present study, we tested a hypothesis derived from the following considerations. Whereas trial-unique visual and tactile stimuli used in recognition memory tasks are commonly presented as stationary items, most trial-unique sounds used in such tasks fluctuate at high, millisecond speeds. Perhaps unlike stimuli that are stationary, stimuli that fluctuate rapidly cannot be packaged for storage in the relevant sensory processing system alone. Such packaging or integration of a fluctuating stimulus, even within a very short temporal window, may require the assistance of the motor system, which seems uniquely organized to chain-link rapid sequences. Unlike monkeys, humans have a dense and complex pathway connecting the auditory system in the posterior temporal region with the oromotor system in the ventrolateral frontal region; this pathway, the arcuate fasciculus (AF) (e.g., refs. 6–8), can often transform an acoustic sequence into a subvocal oromotor sequence, as evidenced by the listener's ability to vocally reproduce or mimic the sound (9).* This integrated acoustic/oromotor sequence might constitute the stored central representation of that sound, enabling its later recognition. Alternatively, if the acoustic stimulus cannot be easily mimicked—which can occur, for example, with certain environmental sounds—it can often be tagged with a label, either a name or an already stored representation of a visual or other nonacoustic sensory stimulus; this newly associated label could then serve as a mnemonic surrogate and obviate the need for storing the representation of the sound, per se.
A corollary hypothesis derived from the foregoing considerations is that an acoustic stimulus that can be neither mimicked nor labeled cannot be stored for subsequent recognition. We tested this hypothesis by comparing the ability of participants to recognize sounds of different types that varied widely in the degree to which they could be reproduced or labeled. Each stimulus type (Words, Pseudowords, Nonverbal Sounds, and Reversed Words) was presented as a study list, and this list was followed after a 5-min delay by a test list requiring old/new judgments. Pairs of reversed words were also presented later in a separate, same/different discrimination test.
Results
As illustrated in Fig. 1, recognition scores were highest for Words (80.8 ± 2.0% correct responses), somewhat lower for both Pseudowords (76.4 ± 1.95) and Nonverbal Sounds (74.7 ± 1.9), and lowest for Reversed Words (57.5 ± 2.5), although still significantly above chance (one-sample t test: t[31] = 3.02, P < 0.005). An ANOVA with stimulus type as the within-subject factor showed a highly significant main effect (F[3,29] = 28.99, P < 0.0001). Although post hoc paired comparisons (Bonferroni-corrected α-level: P = 0.008) between Words and Pseudowords (t[31] = 1.90, P = 0.07), between Words and Nonverbal Sounds (t[31] = 2.49, P = 0.02), and between Pseudowords and Nonverbal Sounds (t[31] = 0.78, P = 0.44) failed to reach significance, comparisons between Reversed Words and each of the three other stimulus types were highly significant (Words: t[31] = 7.74, P < 0.0001; Pseudowords: t[31] = 6.60, P < 0.0001; and Nonverbal Sounds: t[31] = 5.48, P < 0.0001).
Fig. 1.
Percentage of correct responses on the auditory recognition and auditory discrimination tasks, considered to be measures of memory and perception, respectively. Recognition of each of the four stimulus types was tested separately after 5-min study-test delays filled with articulatory-suppression tasks; same/different discrimination of the reversed-word pairs was tested with intrapair delays of 0.5 s (error bars indicate SEM). Mean recognition score on reversed words fell significantly below the score on each of the other stimulus types (all P values <0.0001).
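For illustration, the post hoc logic of the analysis above can be sketched in a few lines of Python with scipy. The scores below are simulated placeholders centered on the reported means, not the study's data, and the omnibus repeated-measures ANOVA itself would require an additional package such as statsmodels or pingouin, so only the one-sample and paired t tests are shown.

```python
# Minimal sketch (not the authors' code) of the one-sample and post hoc paired
# t tests reported above, run on simulated placeholder scores.
from itertools import combinations

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 32  # participants

# Simulated placeholder scores (% correct) centered on the reported means.
scores = {
    "Words": rng.normal(80.8, 11, n),
    "Pseudowords": rng.normal(76.4, 11, n),
    "Nonverbal Sounds": rng.normal(74.7, 11, n),
    "Reversed Words": rng.normal(57.5, 14, n),
}

# One-sample t test of Reversed Words against the 50% chance level.
t, p = stats.ttest_1samp(scores["Reversed Words"], 50.0)
print(f"Reversed Words vs. chance: t({n - 1}) = {t:.2f}, P = {p:.4f}")

# Post hoc paired t tests at the Bonferroni-corrected alpha of 0.05/6 = 0.008.
for a, b in combinations(scores, 2):
    t, p = stats.ttest_rel(scores[a], scores[b])
    flag = "significant" if p < 0.05 / 6 else "n.s."
    print(f"{a} vs. {b}: t({n - 1}) = {t:.2f}, P = {p:.4f} ({flag})")
```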
Because a Shapiro–Wilk test indicated that the distribution of recognition scores for Pseudowords deviated significantly from normality (P = 0.014), we performed additional, nonparametric analyses, which confirmed the results of the ANOVA and of each post hoc comparison. Thus, a Friedman test with stimulus type as the within-subject factor was highly significant [χ2(3, n = 32) = 40.3, P < 0.0001], and the significant post hoc comparisons (Wilcoxon signed rank tests, Bonferroni-corrected α-level: P = 0.008) were those between Reversed Words and each of the three other stimulus types (Words, Pseudowords, and Nonverbal Sounds: Zs of 4.59, 4.45, and 4.15, respectively, all P values <0.0001).
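A corresponding sketch of these nonparametric checks, again on simulated placeholder data, might look as follows; scipy provides the Shapiro–Wilk, Friedman, and Wilcoxon signed rank tests directly.

```python
# Sketch of the nonparametric analyses on simulated placeholder data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
means = {"Words": 80.8, "Pseudowords": 76.4,
         "Nonverbal Sounds": 74.7, "Reversed Words": 57.5}
scores = {k: rng.normal(m, 12, 32) for k, m in means.items()}  # placeholders

# Shapiro-Wilk test of normality for one stimulus type.
w, p = stats.shapiro(scores["Pseudowords"])
print(f"Shapiro-Wilk, Pseudowords: W = {w:.3f}, P = {p:.3f}")

# Friedman omnibus test across the four stimulus types.
chi2, p = stats.friedmanchisquare(*scores.values())
print(f"Friedman: chi2(3) = {chi2:.1f}, P = {p:.5f}")

# Wilcoxon signed rank comparisons at the Bonferroni-corrected alpha of 0.008.
for other in ("Words", "Pseudowords", "Nonverbal Sounds"):
    stat, p = stats.wilcoxon(scores["Reversed Words"], scores[other])
    flag = "significant" if p < 0.008 else "n.s."
    print(f"Reversed Words vs. {other}: W = {stat:.1f}, P = {p:.5f} ({flag})")
```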
Recognition performance was also analyzed with the bias-free sensitivity measure of signal detection theory, d′, computed from hits (i.e., the number of old items correctly judged “old” divided by the number of old items) and false alarms (i.e., the number of new items incorrectly judged “old” divided by the number of new items). This index was highest for Words (2.18 ± 0.2), somewhat lower for both Pseudowords (1.72 ± 0.17) and Nonverbal Sounds (1.54 ± 0.15), and lowest for Reversed Words (0.39 ± 0.14). Because the distributions deviated from normality for some stimulus types (Pseudowords: P = 0.017; Nonverbal Sounds: P = 0.021; Shapiro–Wilk test), the nonparametric Friedman test was carried out. This test, with stimulus type as the within-subject factor, showed a highly significant main effect [χ2(3, n = 32) = 44.1, P < 0.0001]. Post hoc paired comparisons (Wilcoxon signed rank test, Bonferroni-corrected α-level: P = 0.008) between Reversed Words and each of the three other stimulus types were highly significant (Words: Z = 4.86, P < 0.0001; Pseudowords: Z = 4.34, P < 0.0001; and Nonverbal Sounds: Z = 4.23, P < 0.0001). In addition, Words and Nonverbal Sounds also differed significantly (Z = 2.77, P = 0.006).
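As a worked illustration, d′ is z(hit rate) minus z(false-alarm rate); the sketch below follows standard signal-detection conventions and clamps extreme rates, a common correction that the paper does not itself specify.

```python
from scipy.stats import norm

def d_prime(hits, n_old, false_alarms, n_new):
    """d' = z(hit rate) - z(false-alarm rate), with rates clamped away from
    0 and 1 to avoid infinite z-scores (one common correction)."""
    eps = 0.5 / max(n_old, n_new)
    h = min(max(hits / n_old, eps), 1 - eps)
    fa = min(max(false_alarms / n_new, eps), 1 - eps)
    return norm.ppf(h) - norm.ppf(fa)

# Example: 9 of 10 old items correctly called "old" and 2 of 10 new items
# incorrectly called "old" give d' = z(0.9) - z(0.2), approximately 2.12.
print(d_prime(9, 10, 2, 10))
```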
The difficulty in recognizing Reversed Words does not appear to be attributable to a failure of sensory processing, inasmuch as the participants scored 97.4 ± 0.93% correct in distinguishing between the members of reversed-word pairs, when the paired stimuli were separated by 0.5-s intervals in the auditory discrimination task (Fig. 1).
To determine whether there was any relationship between scores on the different stimulus types, or between scores on a particular stimulus type and on discrimination of reversed-word pairs, we calculated bivariate correlation coefficients. No significant correlations were observed (all P values >0.065 before correction for multiple comparisons).
A mixed-design ANOVA with stimulus type as the within-subject factor and language (native vs. nonnative English speakers) as the between-subject factor confirmed the main effect of stimulus type (F[3,28] = 29.22, P < 0.0001), but yielded neither a main effect of language (F[1,30] = 0.175, P = 0.68) nor an interaction between language and stimulus type (F[3,28] = 1.25, P = 0.30).
The other variables we investigated likewise failed to affect auditory recognition memory. Specifically, with regard to the different articulatory-suppression filler tasks, subvocal counting of tones had no greater effect than subvocal counting of visual symbols on the later recognition of any type of auditory stimulus (all P values >0.26). Also, there were no differences between males and females, or between participants with and without experience playing a musical instrument, either in the recognition of any stimulus type (all P values >0.13) or in the discrimination of reversed-word pairs (both P values >0.65).
Discussion
We investigated the potential contribution to human auditory recognition memory of the degree to which acoustic signals can be mimicked or labeled, by using sounds that differed widely along these two dimensions. Recognition scores after 5-min delays filled with articulatory-suppression tasks were highest for Words (81%), somewhat lower for both Pseudowords and Nonverbal Sounds (76 and 75%, respectively), and lowest by far for Reversed Words (i.e., the Words played backward, 58%). We also found that the difficulty the participants had in recognizing the Reversed Words was not due to a failure in stimulus perception, because they performed nearly perfectly in discriminating reversed-word pairs when the within-pair delay was limited to 0.5 s. Our results thus provide preliminary support for the hypothesis that humans cannot perform one-trial recognition of novel auditory stimuli that they can neither reproduce nor label.
Many more tests of this proposal need to be conducted. For example, just as intensive exposure of an adult learner to the atypical word forms of an unfamiliar language can lead to success in recognizing the words of that language, perhaps familiarizing a study participant with an extensive list of reversed words might in time improve the memorability of a reversed-word study list. However, if it does, and if, despite the familiarization, those stimuli should remain unpronounceable, then according to our hypothesis the improvement could come about through the association of a label with each reversed word, just as with any other unmimickable nonverbal sound (see below).
However, what is the neural circuitry that enables one-trial recognition of auditory stimuli when fluctuating acoustic signals can be either mimicked or labeled? Before elaborating on the proposals outlined earlier, we must note first that, if our hypothesis is correct, each of the three stimulus lists on which the participants performed relatively well—Words, Pseudowords, and Nonverbal Sounds—would have activated partly different circuits. Although none of these proposed circuits has yet been worked out in any detail, some of the major structures and interconnections that would be needed to recognize Words and Nonverbal Sounds can be surmised from the available evidence. For example, recognition of the familiar Words would require only that their previously stored representations be reactivated at test and identified as having recently been activated during presentation of the study list; presumably, this form of episodic memory would depend on the interconnections of the superior temporal auditory processing stream (12–14), the lateral temporal semantic system (14, 15), and the medial temporal lobe, including the hippocampus (16–20).
By contrast, successful “recognition” of the Nonverbal Sounds would presumably require reactivation at test of previously associated verbal and/or visual tags for some of the recently presented study-list sounds and newly associated tags for others of those sounds, thereby using much the same temporal lobe circuitry as that used for words (21) except for the substitution of cross-modal labeling—e.g., sound/name or sound/visual image—for the sound itself. This type of memory would therefore not be acoustic recognition in the strict sense, in that the centrally stored representation that carries the memory would not be the sound, per se, but its retrieved associate, which could often be the stored representation of a stimulus in another sensory modality.
Still another mechanism would be needed, however, to account for recognition of the pronounceable but unfamiliar Pseudowords. As indicated at the outset, we hypothesized that one way in which one-trial auditory recognition of novel speech sounds might occur is through the automatic transformation of the acoustic sequence into a subvocal oromotor sequence, and that this integrated acoustic/oromotor signal could then be stored as a lasting central representation. Such automatic transformation and integration would of course require a strong bidirectional link between the temporoparietal auditory system and the ventrolateral frontal oromotor system, and, indeed, just such a link is provided by the arcuate fasciculus (e.g., refs. 6, 22). Once encoded and stored in the auditory processing stream during study, the central representation of the Pseudoword would enable auditory recognition at test following the same sequence of events and using the same circuits as those used for Words, i.e., short-circuiting the arcuate fasciculus, or at least not requiring its reactivation. Of course, unlike a Word, the recognized Pseudoword would have no meaning until it was associated with the central representation of another stimulus; at that point it would be recognized in exactly the same way as a Word, through direct activation of its stored representation in the auditory system, independent of the auditory–oromotor connection.
Motor Theory of Speech Perception.
Invoking the arcuate fasciculus in support of our proposal raises the question of what role this tract plays in speech perception. The notion that speech production and the processing of speech sounds are intimately linked was proposed long ago (23, 24). At the time, however, the idea was introduced to explain not long-term auditory memory but the perception of speech. In its most recent version (24), the theory proposed that speech is perceived as “specific patterns of intended (oromotor) gestures” and that speech perception is possible because listeners do not hear speech as ordinary sounds; rather, they use the relation between acoustic signal and oromotor gesture to perceive speech.
Although the motor theory of speech perception lay dormant for several years, it recently received renewed attention with the discovery in monkeys of “mirror neurons,” namely, neurons in the ventral premotor cortex (area F5) that discharged not only when the monkey made a visually guided movement, such as reaching for an object and grasping it, but also when the monkey observed the experimenter perform the same movement (25, 26). Audiovisual mirror neurons were soon observed as well (27), this type of cell having been uncovered in monkeys in a more ventral part of area F5 (28); these neurons were activated not only when the subject’s own action (e.g., dropping a stick) resulted in a sound but also when the subject heard the same sound produced by the same, but unseen, action of another. By providing a neural correlate for a link between perception and action, discovery of the mirror-neuron system initiated a new wave of research on the motor theory of speech perception (e.g., ref. 29), uncovering evidence of activation in oromotor-related areas while listening to speech (30–32) and of interference with the perception of speech by transcranial magnetic stimulation (TMS) of premotor and motor cortex (33–35).
Although these new findings indicate that the oromotor system plays a role in speech processing, the motor theory of speech perception in its strong form (which predicts that perception should fail in the absence of oromotor activation) remains widely challenged (for an overview, see ref. 36; see also refs. 37–39), particularly by evidence that speech perception is preserved in “Broca’s aphasics” (40). However, whether or not oromotor activation contributes to the perception of speech, it appears to play little or no role in the perception of reversed words, inasmuch as these were perceived well enough to be easily discriminated from each other when presented in pairs at 0.5-s intrapair intervals.
Motor Theory of Speech Recognition.
At the same time, the participants’ extremely poor recognition of the nonmimickable Reversed Words supports our hypothesis that a novel speech sound’s pronounceability (i.e., its potential to activate the speech production system automatically and subvocally, perhaps via an auditory mirror-neuron system) may well be essential for laying down a lasting representation of that sound. The neural mechanism is unknown, but, to elaborate on our speculative proposal, it could involve a multistep process mediated in large part by the AF, in which: (i) signals in temporoparietal cortex produced by processing a novel, fluctuating speech sound would be transmitted via the AF to the ventrolateral frontal cortex and, from there, via corticostriatal pathways (41) to the striatum, for mapping onto representations of a sequence of articulation and coarticulation plans for reproducing that sound (39, 42); (ii) signals representing this covert oromotor sequence would then be fed back to the temporoparietal cortex via the AF’s reciprocal projections (43, 44); and (iii) these repackaged signals, representing the newly integrated auditory–oromotor sequence, would be processed by the lateral temporal auditory stream and transmitted to the medial temporal lobe, where they would be initially encoded and stored. This motor theory of speech recognition is similar in some respects to one proposed for vocal learning in songbirds (45, 46), including the critical step (ii above) of oromotor feedback, or corollary discharge, to auditory cortex, thereby establishing a precise sensorimotor correspondence between audition and vocalization.
A potential objection to a motor theory of recognition memory for new words is that some patients with severe problems in speech production are nonetheless capable not only of speech comprehension but also of learning language through reception alone (47, 48). Although, to our knowledge, such patients have not been tested on verbal LTM tasks of the type used here, studies of verbal working memory (WM) indicate that the outcome in verbal LTM may depend on the neural basis of the speech impairment. Thus, whereas congenital or acquired anarthria, a neuromotor disorder that disrupts control of the muscles required for speech (49), can leave verbal WM largely intact (50, 51), apraxia of speech, a higher-level disorder in which speech production is impaired by faulty planning and programming of speech (49), does impair verbal WM (52). Given the evidence that WM and LTM are highly interactive (53, 54), auditory LTM for new words may well be preserved in patients with anarthria but not in those with apraxia of speech, a predicted dissociation of deficits in need of testing.
Another potential argument against our proposal is the fact that, in infants, recognition of auditory stimuli, including verbal stimuli, develops in advance of expressive language (55, 56), a sequence that would appear to contradict the notion that recognition memory of novel speech sounds requires the participation of a motor circuit. As in anarthria, however, the absence of speech production does not preclude a contribution from the oromotor system to speech recognition memory, and, indeed, there is evidence (57–59) that even in early infancy speech sounds activate Broca’s area. However, perhaps the more important mechanism for “auditory recognition memory” in early infancy is associative memory. Like other nonvocal learners, young infants presumably learn to associate speech sounds with visual and other sensory stimuli, although, in their case, only as a substitute for the more slowly developing and incredibly useful vocal mimicry system.
Importantly, the arcuate fasciculus in humans differs from that in nonhuman primates, including apes. Both its density and its complexity increased dramatically during human evolution, presumably driven largely by the highly advantageous appearance and progressive development of speech and language. As described recently in a series of diffusion tensor imaging and resting-state connectivity studies (6, 7, 10, 22, 60), this tract consists not only of a direct connection between caudal superior temporal and ventrolateral frontal areas but also of a parallel, indirect connection through an intermediate station in the inferior parietal lobule,† with this dual-tract system extending ventrally into the middle and inferior temporal gyri as well as dorsally into the ventral premotor cortex. By comparison, the arcuate fasciculus in monkeys is a primitive one (7, 8, 11, 62), providing a possible explanation for the monkey’s apparent inability to store the representations of fluctuating acoustic stimuli in long-term memory (5). Conversely, the large size and complexity of this dual tract in humans may have enabled long-term memory for speech sounds by providing the auditory system with an input that transforms an intractable, fluctuating acoustic stimulus into an integrated acoustic/oromotor sequence that can be stored for subsequent recognition. If confirmed, the results would imply that speech and auditory memory are so indissolubly linked that neither could have evolved without the other.
Materials and Methods
Participants.
Thirty-two young adults (mean age, 29 y; range, 20–40 y; 24 females) took part in the study. Half the participants were native speakers of English; the other half were nonnative English speakers. Twenty-five of the participants had learned to play a musical instrument (mean starting age, 9 y; range, 6–23 y) and had continued playing for an average of 7 y.
Stimuli.
Four types of acoustic stimuli were used: Words, Pseudowords, Nonverbal Sounds, and Reversed Words. The four stimulus types differ in the ease with which they can be reproduced vocally and/or be labeled with a verbal associate and, consequently, in the degree to which speech, language, and musical ability might be able to support their storage for later recognition.
Words, easily reproduced and easily labeled, had a mean duration of 783 msec (range, 600–964) and a mean length of six letters (range, 4–8) and two syllables (range, 1–3).
Pseudowords, easily reproduced but less easily labeled than words, had a mean duration of 756 msec (range, 525–955) and mean length of six letters (range, 4–9) and two syllables (range, 1–3).
Nonverbal Sounds, difficult to reproduce but readily labeled, had a duration of 760 msec each, reduced from an original duration of 2–3 s. The stimuli in this category sounded as if they were produced by a xylophone, a boat horn, a saw, bells, etc.
Reversed Words, difficult both to reproduce and to label, were the Words played backward and so had spectrotemporal compositions and sound envelopes identical to those of the Words.
There were 20 stimuli of each type. In compiling the lists of Words and Pseudowords, we rejected items that were semantically and/or phonologically similar to any of the items we kept. These speech sounds were generated with a speech synthesizer using a UK English female voice (Cepstral) and modified for length and loudness using Adobe Audition 3.0. One hundred Nonverbal Sounds, drawn from the same large sound library that was used by Squire et al. (21) and Fritz et al. (5), were each assigned to one of several categories differing in rhythm, pitch, degree of fluctuation, etc., and then compared by the experimenter with all other sounds in the same category. Only subjectively distinctive stimuli were used to compose the set of 20 Nonverbal Sounds. The intensities of all 80 sounds across the four stimulus types were adjusted by the experimenter to be subjectively equal. For each stimulus type, 10 of the 20 items were randomly selected to form the study list, and the remaining 10 were assigned to be the new items in the recognition test. The procedure for every stimulus type, described below, was adapted from the one used by Squire et al. (21) for nonverbal sounds.
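As an illustrative sketch of the random split just described (the file names are hypothetical, and the original assignment procedure is not specified beyond random selection):

```python
# For each stimulus type, randomly assign 10 of the 20 items to the study
# list and keep the remaining 10 as the new items at test.
import random

def split_items(items, n_study=10, seed=None):
    """Randomly partition an item set into a study list and new (foil) items."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled[:n_study], shuffled[n_study:]

words = [f"word_{i:02d}.wav" for i in range(20)]  # hypothetical file names
study_list, new_items = split_items(words, seed=1)
print(study_list)
```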
Procedure.
Study list.
Participants listened to a study list containing 10 stimuli of one type and were asked to memorize them. Presentation of the study list took ∼12 s (interstimulus intervals, 500 msec), and the presentation order of the 10 stimuli within the list was randomized across participants.
Filler tasks.
To prevent participants from rehearsing the list between its presentation and the recognition test given 5 min later, they were engaged in one of four different subvocal counting (i.e., articulatory suppression) tasks, two visual and two auditory, taken from the Test of Everyday Attention (TEA) (63). The visual tasks required counting of symbols (either on a city map or in a simulated telephone directory), and the auditory tasks required counting of tones (either all tones if only low tones were presented or only low tones if low and high tones were presented). Each task filled the 5-min delay interval, including explanation of the task by the experimenter (∼2 min) and test time (∼3 min). Each participant received a different filler task after each study list, and the four filler tasks were counterbalanced across participants and stimulus types.
Recognition test.
After the filler task, participants listened to 20 stimuli (the 10 from the study list and the 10 novel ones) presented in a randomized order, and, after each stimulus, indicated whether it was “old” or “new.” They then pressed a keyboard space bar to hear the next sound. The same procedure—study list, filler task, recognition test—was repeated for each type of stimulus, with the order of stimulus type counterbalanced across participants using a Latin square design.
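One conventional construction of such a design is a balanced Latin square, in which each stimulus type precedes every other equally often across orders; the sketch below shows a standard (Bradley-style) construction for an even number of conditions, though the paper does not state which form of Latin square was used.

```python
def balanced_latin_square(conditions):
    """Rows of a balanced Latin square (even number of conditions assumed):
    first row 0, 1, n-1, 2, n-2, ...; each later row shifts it by 1 (mod n)."""
    n = len(conditions)
    first, lo, hi = [0, 1], n - 1, 2
    while len(first) < n:
        first.append(lo)
        lo -= 1
        if len(first) < n:
            first.append(hi)
            hi += 1
    return [[conditions[(x + r) % n] for x in first] for r in range(n)]

types = ["Words", "Pseudowords", "Nonverbal Sounds", "Reversed Words"]
for i, order in enumerate(balanced_latin_square(types), start=1):
    print(f"order {i}: {order}")
```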
Discrimination task.
After completing the recognition memory test, participants were tested for their ability to discriminate reversed words. In this task, the participants listened to 40 pairs of reversed words (20 “same” pairs and 20 “different” pairs) and indicated whether the two members of each pair were the same or different (intrapair interval, 500 msec; mean difference in duration of intrapair stimuli, 44 msec, range 2–93 msec). The response to each pair was self-paced, and participants pressed the space bar to hear the next pair. Four different orders of the 40 stimulus pairs were programmed for presentation, and each order was administered to eight randomly selected participants.
Acknowledgments
This study was supported by the Intramural Research Program of the National Institute of Mental Health, National Institutes of Health, Department of Health and Human Services, as well as by the Medical Research Council (G0300117/65439) and University College London Pump Prime Grant 06CN05.
Footnotes
The authors declare no conflict of interest.
*Unlike the arcuate fasciculus, a different temporofrontal pathway, this one coursing through the extreme capsule to link the middle section of temporal cortex with the ventrolateral and dorsolateral frontal regions, is highly developed not only in humans but in nonhuman primates as well (8, 10, 11).
†This indirect connection observed with diffusion tensor imaging in humans is supported by evidence from tract tracing in monkeys, indicating that the superior temporal gyrus projects to the inferior parietal lobule via the middle longitudinal fasciculus (61), and the inferior parietal lobule projects to the ventral premotor cortex via the superior longitudinal fasciculus (8).
References
- 1. Mishkin M. Memory in monkeys severely impaired by combined but not by separate removal of amygdala and hippocampus. Nature. 1978;273:297–298. doi: 10.1038/273297a0.
- 2. Murray EA, Mishkin M. Severe tactual as well as visual memory deficits follow combined removal of the amygdala and hippocampus in monkeys. J Neurosci. 1984;4:2565–2580. doi: 10.1523/JNEUROSCI.04-10-02565.1984.
- 3. Murray EA, Mishkin M. Object recognition and location memory in monkeys with excitotoxic lesions of the amygdala and hippocampus. J Neurosci. 1998;18:6568–6582. doi: 10.1523/JNEUROSCI.18-16-06568.1998.
- 4. Buffalo EA, et al. Dissociation between the effects of damage to perirhinal cortex and area TE. Learn Mem. 1999;6:572–599. doi: 10.1101/lm.6.6.572.
- 5. Fritz J, Mishkin M, Saunders RC. In search of an auditory engram. Proc Natl Acad Sci USA. 2005;102:9359–9364. doi: 10.1073/pnas.0503998102.
- 6. Catani M, et al. Symmetries in human brain language pathways correlate with verbal recall. Proc Natl Acad Sci USA. 2007;104:17163–17168. doi: 10.1073/pnas.0702116104.
- 7. Rilling JK, et al. The evolution of the arcuate fasciculus revealed with comparative DTI. Nat Neurosci. 2008;11:426–428. doi: 10.1038/nn2072.
- 8. Petrides M, Pandya DN. Distinct parietal and temporal pathways to the homologues of Broca’s area in the monkey. PLoS Biol. 2009;7:e1000170. doi: 10.1371/journal.pbio.1000170.
- 9. Saur D, et al. Ventral and dorsal pathways for language. Proc Natl Acad Sci USA. 2008;105:18035–18040. doi: 10.1073/pnas.0805234105.
- 10. Frey S, Campbell JS, Pike GB, Petrides M. Dissociating the human language pathways with high angular resolution diffusion fiber tractography. J Neurosci. 2008;28:11435–11444. doi: 10.1523/JNEUROSCI.2388-08.2008.
- 11. Rilling JK, Glasser MF, Jbabdi S, Andersson J, Preuss TM. Continuity, divergence, and the evolution of brain language pathways. Front Evol Neurosci. 2012;3:1–6. doi: 10.3389/fnevo.2011.00011.
- 12. Samson S. Musical function and temporal lobe structures: A review of brain lesion studies. J New Music Res. 1999;28:217–228.
- 13. Samson S, Zatorre RJ. Learning and retention of melodic and verbal information after unilateral temporal lobectomy. Neuropsychologia. 1992;30:815–826. doi: 10.1016/0028-3932(92)90085-z.
- 14. DeWitt I, Rauschecker JP. Phoneme and word recognition in the auditory ventral stream. Proc Natl Acad Sci USA. 2012;109:E505–E514. doi: 10.1073/pnas.1113427109.
- 15. Turken AU, Dronkers NF. The neural architecture of the language comprehension network: Converging evidence from lesion and connectivity analyses. Front Syst Neurosci. 2011;5:1. doi: 10.3389/fnsys.2011.00001.
- 16. Gardiner JM, Brandt KR, Vargha-Khadem F, Baddeley AD, Mishkin M. Effects of level of processing but not of task enactment on recognition memory in a case of developmental amnesia. Cogn Neuropsychol. 2006;23:930–948. doi: 10.1080/02643290600588442.
- 17. Brandt KR, Gardiner JM, Vargha-Khadem F, Baddeley AD, Mishkin M. Impairment of recollection but not familiarity in a case of developmental amnesia. Neurocase. 2008;15:60–65. doi: 10.1080/13554790802613025.
- 18. Sagar HJ, Gabrieli JD, Sullivan EV, Corkin S. Recency and frequency discrimination in the amnesic patient H.M. Brain. 1990;113:581–602. doi: 10.1093/brain/113.3.581.
- 19. Haist F, Shimamura AP, Squire LR. On the relationship between recall and recognition memory. J Exp Psychol Learn Mem Cogn. 1992;18:691–702. doi: 10.1037//0278-7393.18.4.691.
- 20. Zola-Morgan S, Squire LR, Amaral DG. Human amnesia and the medial temporal region: Enduring memory impairment following a bilateral lesion limited to field CA1 of the hippocampus. J Neurosci. 1986;6:2950–2967. doi: 10.1523/JNEUROSCI.06-10-02950.1986.
- 21. Squire LR, Schmolck H, Stark SM. Impaired auditory recognition memory in amnesic patients with medial temporal lobe lesions. Learn Mem. 2001;8:252–256. doi: 10.1101/lm.42001.
- 22. Catani M, Jones DK, ffytche DH. Perisylvian language networks of the human brain. Ann Neurol. 2005;57:8–16. doi: 10.1002/ana.20319.
- 23. Liberman AM, Cooper FS, Shankweiler DP, Studdert-Kennedy M. Perception of the speech code. Psychol Rev. 1967;74:431–461. doi: 10.1037/h0020279.
- 24. Liberman AM, Mattingly IG. The motor theory of speech perception revised. Cognition. 1985;21:1–36. doi: 10.1016/0010-0277(85)90021-6.
- 25. di Pellegrino G, Fadiga L, Fogassi L, Gallese V, Rizzolatti G. Understanding motor events: A neurophysiological study. Exp Brain Res. 1992;91:176–180. doi: 10.1007/BF00230027.
- 26. Rizzolatti G, Fadiga L, Gallese V, Fogassi L. Premotor cortex and the recognition of motor actions. Brain Res Cogn Brain Res. 1996;3:131–141. doi: 10.1016/0926-6410(95)00038-0.
- 27. Kohler E, et al. Hearing sounds, understanding actions: Action representation in mirror neurons. Science. 2002;297:846–848. doi: 10.1126/science.1070311.
- 28. Rizzolatti G, Craighero L. The mirror-neuron system. Annu Rev Neurosci. 2004;27:169–192. doi: 10.1146/annurev.neuro.27.070203.144230.
- 29. Galantucci B, Fowler CA, Turvey MT. The motor theory of speech perception reviewed. Psychon Bull Rev. 2006;13:361–377. doi: 10.3758/bf03193857.
- 30. Wilson SM, Saygin AP, Sereno MI, Iacoboni M. Listening to speech activates motor areas involved in speech production. Nat Neurosci. 2004;7:701–702. doi: 10.1038/nn1263.
- 31. Pulvermüller F, Hauk O, Nikulin VV, Ilmoniemi RJ. Functional links between motor and language systems. Eur J Neurosci. 2005;21:793–797. doi: 10.1111/j.1460-9568.2005.03900.x.
- 32. Wilson SM, Iacoboni M. Neural responses to non-native phonemes varying in producibility: Evidence for the sensorimotor nature of speech perception. Neuroimage. 2006;33:316–325. doi: 10.1016/j.neuroimage.2006.05.032.
- 33. Meister IG, Wilson SM, Deblieck C, Wu AD, Iacoboni M. The essential role of premotor cortex in speech perception. Curr Biol. 2007;17:1692–1696. doi: 10.1016/j.cub.2007.08.064.
- 34. Möttönen R, Watkins KE. Motor representations of articulators contribute to categorical perception of speech sounds. J Neurosci. 2009;29:9819–9825. doi: 10.1523/JNEUROSCI.6018-08.2009.
- 35. Fadiga L, Craighero L, Buccino G, Rizzolatti G. Speech listening specifically modulates the excitability of tongue muscles: A TMS study. Eur J Neurosci. 2002;15:399–402. doi: 10.1046/j.0953-816x.2001.01874.x.
- 36. Lotto AJ, Hickok GS, Holt LL. Reflections on mirror neurons and speech perception. Trends Cogn Sci. 2009;13:110–114. doi: 10.1016/j.tics.2008.11.008.
- 37. Hickok G. Eight problems for the mirror neuron theory of action understanding in monkeys and humans. J Cogn Neurosci. 2009;21:1229–1243. doi: 10.1162/jocn.2009.21189.
- 38. Hickok G. The functional neuroanatomy of language. Phys Life Rev. 2009;6:121–143. doi: 10.1016/j.plrev.2009.06.001.
- 39. Hickok G, Houde J, Rong F. Sensorimotor integration in speech processing: Computational basis and neural organization. Neuron. 2011;69:407–422. doi: 10.1016/j.neuron.2011.01.019.
- 40. Hickok G, Costanzo M, Capasso R, Miceli G. The role of Broca’s area in speech perception: Evidence from aphasia revisited. Brain Lang. 2011;119:214–220. doi: 10.1016/j.bandl.2011.08.001.
- 41. Yeterian EH, Pandya DN. Prefrontostriatal connections in relation to cortical architectonic organization in rhesus monkeys. J Comp Neurol. 1991;312:43–67. doi: 10.1002/cne.903120105.
- 42. Price CJ, Crinion JT, Macsweeney M. A generative model of speech production in Broca’s and Wernicke’s areas. Front Psychol. 2011;2:237. doi: 10.3389/fpsyg.2011.00237.
- 43. Seltzer B, Pandya DN. Frontal lobe connections of the superior temporal sulcus in the rhesus monkey. J Comp Neurol. 1989;281:97–113. doi: 10.1002/cne.902810108.
- 44. Romanski LM, et al. Dual streams of auditory afferents target multiple domains in the primate prefrontal cortex. Nat Neurosci. 1999;2:1131–1136. doi: 10.1038/16056.
- 45. Prather JF, Peters S, Nowicki S, Mooney R. Precise auditory-vocal mirroring in neurons for learned vocal communication. Nature. 2008;451:305–310. doi: 10.1038/nature06492.
- 46. Mooney R. Neural mechanisms for learned birdsong. Learn Mem. 2009;16:655–669. doi: 10.1101/lm.1065209.
- 47. Lenneberg EH. Understanding language without ability to speak: A case report. J Abnorm Soc Psychol. 1962;65:419–425. doi: 10.1037/h0041906.
- 48. Christen HJ, et al. Foix-Chavany-Marie (anterior operculum) syndrome in childhood: A reappraisal of Worster-Drought syndrome. Dev Med Child Neurol. 2000;42:122–132. doi: 10.1017/s0012162200000232.
- 49. Liégeois FJ, Morgan AT. Neural bases of childhood speech disorders: Lateralization and plasticity for speech functions during development. Neurosci Biobehav Rev. 2012;36:439–458. doi: 10.1016/j.neubiorev.2011.07.011.
- 50. Bishop DVM, Robson J. Unimpaired short-term memory and rhyme judgment in congenitally speechless individuals: Implications for the notion of articulatory coding. Q J Exp Psychol. 1989;41:123–140.
- 51. Baddeley AD, Wilson B. Phonological coding and short-term memory in patients without speech. J Mem Lang. 1985;24:490–502.
- 52. Waters GS, Rochon E, Caplan D. The role of high-level speech planning in rehearsal: Evidence from patients with apraxia of speech. J Mem Lang. 1992;31:54–73.
- 53. Cowan N. Evolving conceptions of memory storage, selective attention, and their mutual constraints within the human information-processing system. Psychol Bull. 1988;104:163–191. doi: 10.1037/0033-2909.104.2.163.
- 54. Cowan N. What are the differences between long-term, short-term, and working memory? Prog Brain Res. 2008;169:323–338. doi: 10.1016/S0079-6123(07)00020-9.
- 55. Trainor LJ, Wu L, Tsang CD. Long-term memory for music: Infants remember tempo and timbre. Dev Sci. 2004;7:289–296. doi: 10.1111/j.1467-7687.2004.00348.x.
- 56. Spence MJ. Young infants’ long-term auditory memory: Evidence for changes in preference as a function of delay. Dev Psychobiol. 1996;29:685–695. doi: 10.1002/(SICI)1098-2302(199612)29:8<685::AID-DEV4>3.0.CO;2-P.
- 57. Imada T, et al. Infant speech perception activates Broca’s area: A developmental magnetoencephalography study. Neuroreport. 2006;17:957–962. doi: 10.1097/01.wnr.0000223387.51704.89.
- 58. Dehaene-Lambertz G, et al. Functional organization of perisylvian activation during presentation of sentences in preverbal infants. Proc Natl Acad Sci USA. 2006;103:14240–14245. doi: 10.1073/pnas.0606302103.
- 59. Bristow D, et al. Hearing faces: How the infant brain matches the face it sees with the speech it hears. J Cogn Neurosci. 2009;21:905–921. doi: 10.1162/jocn.2009.21076.
- 60. Kelly C, et al. Broca’s region: Linking human brain functional connectivity data and non-human primate tracing anatomy studies. Eur J Neurosci. 2010;32:383–398. doi: 10.1111/j.1460-9568.2010.07279.x.
- 61. Seltzer B, Pandya DN. Afferent cortical connections and architectonics of the superior temporal sulcus and surrounding cortex in the rhesus monkey. Brain Res. 1978;149:1–24. doi: 10.1016/0006-8993(78)90584-x.
- 62. Thiebaut de Schotten M, Dell’Acqua F, Valabregue R, Catani M. Monkey to human comparative anatomy of the frontal lobe association tracts. Cortex. 2012;48:82–96. doi: 10.1016/j.cortex.2011.10.001.
- 63. Robertson IH, Ward T, Ridgeway V, Nimmo-Smith I. The Test of Everyday Attention (TEA). Test Reviews. 2001;4:51–55.

