Abstract
Spoken word recognition requires complex, invariant representations. Using a meta-analytic approach incorporating more than 100 functional imaging experiments, we show that preference for complex sounds emerges in the human auditory ventral stream in a hierarchical fashion, consistent with nonhuman primate electrophysiology. Examining speech sounds, we show that activation associated with the processing of short-timescale patterns (i.e., phonemes) is consistently localized to left mid-superior temporal gyrus (STG), whereas activation associated with the integration of phonemes into temporally complex patterns (i.e., words) is consistently localized to left anterior STG. Further, we show that left mid- to anterior STG is reliably implicated in the invariant representation of phonetic forms and that this area also responds preferentially to phonetic sounds over artificial control sounds or environmental sounds. Together, these findings indicate increasing encoding specificity and invariance along the auditory ventral stream for temporally complex speech sounds.
Keywords: functional MRI, meta-analysis, auditory cortex, object recognition, language
Spoken word recognition presents several challenges to the brain. Two key challenges are the assembly of complex auditory representations and the variability of natural speech (SI Appendix, Fig. S1) (1). Representation at the level of primary auditory cortex is precise: fine-grained in scale and local in spectrotemporal space (2, 3). The recognition of complex spectrotemporal forms, like words, in higher areas of auditory cortex requires the transformation of this granular representation into Gestalt-like, object-centered representations. In brief, local features must be bound together to form representations of complex spectrotemporal contours, which are themselves the constituents of auditory “objects” or complex sound patterns (4, 5). Next, representations must be generalized and abstracted. Coding in primary auditory cortex is sensitive even to minor physical transformations. Object-centered coding in higher areas, however, must be invariant (i.e., tolerant of natural stimulus variation) (6). For example, whereas the phonemic structure of a word is fixed, there is considerable variation in physical, spectrotemporal form—attributable to accent, pronunciation, body size, and the like—among utterances of a given word. It has been proposed for visual cortical processing that a feed-forward, hierarchical architecture (7) may be capable of simultaneously solving the problems of complexity and variability (8–12). Here, we examine these ideas in the context of auditory cortex.
In a hierarchical pattern-recognition scheme (8), coding in the earliest cortical field would reflect the tuning and organization of primary auditory cortex (or core) (2, 3, 13). That is, single-neuron receptive fields (more precisely, frequency-response areas) would be tuned to particular center frequencies and would have minimal spectrotemporal complexity (i.e., a single excitatory zone and one or two inhibitory sidebands). Units in higher fields would be increasingly pattern selective and invariant to natural variation. Pattern selectivity and invariance respectively arise from neural computations similar in effect to “logical-AND” and “logical-OR” gates. In the auditory system, neurons whose tuning is combination sensitive (14–21) perform the logical-AND gate–like operation, conjoining structurally simple representations in lower-order units into the increasingly complex representations (i.e., multiple excitatory and inhibitory zones) of higher-order units. In the case of speech sounds, these neurons conjoin representations for adjacent speech formants or, at higher levels, adjacent phonemes. Although the mechanism by which combination sensitivity (CS) is directionally selective in the temporal domain is not fully understood, several candidate mechanisms have been proposed (22–26). As an empirical matter, direction selectivity is clearly present early in auditory cortex (19, 27). It is also observed to operate at time scales (50–250 ms) sufficient for phoneme concatenation, as long as 250 ms in the zebra finch (15) and 100 to 150 ms in the macaque lateral belt (18). Logical-OR gate–like computation, technically proposed to be a soft maximum operation (28–30), is posited to be performed by spectrotemporal-pooling units. These units respond to suprathreshold stimulation from any member of their connected lower-order pool, thus creating a superposition of the connected lower-order representations and abstracting them. With respect to speech, this might involve the pooling of numerous, rigidly tuned representations of different exemplars of a given phoneme into an abstracted representation of the entire pool. Spatial pooling is well documented in visual cortex (7, 31, 32), and there is some evidence for its analog, spectrotemporal pooling, in auditory cortex (33–35), including the observation of complex cells when A1 is developmentally reprogrammed as a surrogate V1 (36). However, a formal equivalence has yet to be demonstrated (37, 38).
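To make the two gate-like operations concrete, the following toy sketch (ours, not drawn from the cited models; all function names and parameter values are illustrative) contrasts a combination-sensitive unit, which responds only to the conjunction of its lower-order inputs, with a soft-max pooling unit, which responds whenever any member of its pool is active:

```python
import numpy as np

def and_unit(feature_a, feature_b):
    """Combination-sensitive ('logical-AND'-like) unit: multiplicative
    conjunction means it stays silent unless BOTH lower-order inputs are
    active together (e.g., two adjacent formants, or two phonemes at a
    fixed temporal offset)."""
    return feature_a * feature_b

def or_unit(pool_responses, beta=4.0):
    """Pooling ('logical-OR'-like) unit: a soft maximum over a pool of
    rigidly tuned lower-order units (e.g., many exemplars of one phoneme).
    As beta grows, this approaches a hard MAX operation."""
    r = np.asarray(pool_responses, dtype=float)
    w = np.exp(beta * r)
    return float(np.sum(r * w) / np.sum(w))

# Selectivity: the AND unit requires the conjunction of its inputs.
print(and_unit(0.9, 0.0))         # one formant alone -> 0.0
print(and_unit(0.9, 0.8))         # conjunction -> 0.72

# Invariance: the OR unit responds if ANY exemplar-tuned unit fires,
# regardless of which one (e.g., which speaker produced the phoneme).
print(or_unit([0.05, 0.9, 0.1]))  # ~0.84, near the pool's maximum
```

Alternating these two operations across layers yields units that are simultaneously more pattern selective (through conjunction) and more tolerant of stimulus variation (through pooling), which is the essence of the hierarchical scheme.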
Auditory cortex's predominant processing pathways, ventral and dorsal (39, 40), appear to be optimized for pattern recognition and action planning, respectively (17, 18, 40–44). Speech-specific models generally concur (45–48), creating a wide consensus that word recognition is performed in the auditory ventral stream (refs. 42, 45, 47–50, but see refs. 51–53). The hierarchical model predicts an increase in neural receptive field size and complexity along the ventral stream. With respect to speech, there is a discontinuity in the processing demands associated with the recognition of elemental phonetic units (i.e., phonemes or something phone-like) and concatenated units (i.e., multisegmental forms, both sublexical forms and word forms). Phoneme recognition requires sensitivity to the arrangement of constellations of spectrotemporal features (i.e., the presence and absence of energy at particular center frequencies and with particular temporal offsets). Word-form recognition requires sensitivity to the temporal arrangement of phonemes. Thus, phoneme recognition requires spectrotemporal CS and operates on low-level acoustic features (SI Appendix, Fig. S1B, second layer), whereas word-form recognition requires only temporal CS (i.e., concatenation of phonemes) and operates on higher-order features that may also be perceptual objects in their own right (SI Appendix, Fig. S1B, top layer). If word-form recognition is implemented hierarchically, we might expect this discontinuity in processing to be mirrored in cortical organization, with concatenative phonetic recognition occurring distal to elemental phonetic recognition.
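Continuing the toy model, the two recognition stages differ in what they conjoin. The hypothetical sketch below (ours; the /k æ t/ target, window length, and detector names are illustrative, with the 250-ms window echoing the direction-selectivity timescales cited earlier) contrasts a spectrotemporal template match with a purely temporal, order-sensitive conjunction:

```python
import numpy as np

def phoneme_detector(patch, template):
    """Elemental stage (spectrotemporal CS): respond to a constellation of
    low-level features -- energy at particular frequencies with particular
    temporal offsets -- via a normalized template match on a spectrogram patch."""
    a, t = patch.ravel(), template.ravel()
    return float(a @ t / (np.linalg.norm(a) * np.linalg.norm(t) + 1e-12))

def word_form_detector(detections, target=("k", "ae", "t"), max_gap_ms=250.0):
    """Concatenative stage (temporal CS only): respond when the target
    phonemes occur in order, each beginning within the integration window
    of its predecessor (a direction-selective, ordered AND)."""
    onset = dict(detections)  # phoneme label -> onset time in ms
    if any(p not in onset for p in target):
        return 0.0
    times = [onset[p] for p in target]
    in_order = all(0.0 < b - a <= max_gap_ms for a, b in zip(times, times[1:]))
    return 1.0 if in_order else 0.0

patch = np.array([[0.0, 1.0], [1.0, 0.0]])       # toy spectrogram patch
print(round(phoneme_detector(patch, patch), 2))  # perfect match -> 1.0

# The word-form stage is selective for temporal arrangement, not mere presence:
print(word_form_detector([("k", 0.0), ("ae", 90.0), ("t", 200.0)]))  # 1.0
print(word_form_detector([("t", 0.0), ("ae", 90.0), ("k", 200.0)]))  # 0.0
```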
Primate electrophysiology identifies CS as occurring as early as core's supragranular layers and in lateral belt (16, 17, 19, 37). In the macaque, selectivity for communication calls—similar in spectrotemporal structure to phonemes or consonant-vowel (CV) syllables—is observed in belt area AL (54) and, to an even greater degree, in a more anterior field, RTp (55). Further, for macaques trained to discriminate human phonemes, categorical coding is present in the single-unit activity of AL neurons as well as in the population activity of area AL (1, 56). Human homologs to these sites putatively lie on or about the anterior-lateral aspect of Heschl's gyrus and in the area immediately posterior to it (13, 57–59). Macaque PET imaging suggests there is also an evolutionary predisposition to left-hemisphere processing for conspecific communication calls (60). Consistent with macaque electrophysiology, human electrocorticography recordings from superior temporal gyrus (STG), in the region immediately posterior to the anterior-lateral aspect of Heschl's gyrus (i.e., mid-STG), show the site to code for phoneme identity at the population level (61). Mid-STG is also the site of peak high-gamma activity in response to CV sounds (62–64). Similarly, human functional imaging studies suggest left mid-STG is involved in processing elemental speech sounds. For instance, in subtractive functional MRI (fMRI) comparisons, after partialing out variance attributable to acoustic factors, Leaver and Rauschecker (2010) showed selectivity in left mid-STG for CV speech sounds as opposed to other natural sounds (5). This implies the presence of a local density of neurons with receptive-field tuning optimized for the recognition of elemental phonetic sounds [i.e., areal specialization (AS)]. Furthermore, the region exhibits fMRI-adaptation phenomena consistent with invariant representation (IR) (65, 66). That is, the response diminishes when the same phonetic content is presented repeatedly, even when a physical attribute of the stimulus unrelated to phonetic content (here, the speaker's voice) is changed (5). Similarly, using speech sound stimuli on the /ga/–/da/ continuum and comparing responses to exemplar pairs that varied only in acoustics with responses to pairs that varied in both acoustics and phonetic content, Joanisse and colleagues (2007) found adaptation specific to phonetic content in left mid-STG, again implying IR (67).
The site downstream of mid-STG, performing phonetic concatenation, should possess neurons that respond to late components of multisegmental sounds (i.e., latencies >60 ms). These units should also be selective for specific phoneme orderings. Nonhuman primate data for regions rostral to A1 confirm that latencies increase rostrally along the ventral stream (34, 55, 68, 69), with the median latency to peak response approaching 100 ms in area RT (34), consistent with the latencies required for phonetic concatenation. In a rare human electrophysiology study, Creutzfeldt and colleagues (1989) report vigorous single-unit responses to words and sentences in mid- to anterior STG (70). This included both feature-tuned units and late-component-tuned units. Although the relative location of feature and late-component units is not reported, and the late-component units do not clearly evince temporal CS, the mixture of response types supports the supposition of temporal combination-sensitive units in human STG. Imaging studies localize processing of multisegmental forms to anterior STG/superior temporal sulcus (STS). This can be seen in peak activation to word forms in electrocorticography (71) and magnetoencephalography (72). fMRI investigations of stimulus complexity, comparing activation to word-form and pure-tone stimuli, report similar localization (47, 73, 74). Invariant tuning for word forms, as inferred from fMRI-adaptation studies, also localizes to anterior STG/STS (75–77). Studies investigating cross-modal repetition effects for auditory and visual stimuli confirm anterior STG/STS localization and, further, show it to be part of unimodal auditory cortex (78, 79). Finally, application of electrical cortical interference to anterior STG disrupts auditory comprehension, producing patient reports of speech as being like “a series of meaningless utterances” (80).
Here, we use a coordinate-based meta-analytic approach [activation likelihood estimation (ALE)] (81) to make an unbiased assessment of the robustness of functional-imaging evidence for the aforementioned speech-recognition model. In short, the method assesses the stereotaxic concordance of reported effects. First, we investigate the strength of evidence for the predicted anatomical dissociation between elemental phonetic recognition (mid-STG) and concatenative phonetic recognition (anterior STG). To assess this, two functional imaging paradigms are meta-analyzed: speech vs. acoustic-control sounds (a proxy for CS, as detailed later) and repetition suppression (RS). For each paradigm, separate analyses are performed for studies of elemental phonetic processing (i.e., phoneme- and CV-length stimuli) and for studies involving concatenative phonetic processing (i.e., word-length stimuli). Although the aforementioned model is principally concerned with word-form recognition, for comparative purposes, we meta-analyze studies of phrase-length stimuli as well. Second, we investigate the strength of evidence for the predicted ventral-stream colocalization of CS and IR phenomena. To assess this, the same paradigms are reanalyzed with two modifications: (i) For IR, a subset of RS studies meeting heightened criteria for fMRI-adaptation designs is included (Methods); (ii) to attain sufficient sample size, analyses are collapsed across stimulus lengths.
We also investigate the strength of evidence for AS, which has been suggested as an organizing principle in higher-order areas of the auditory ventral stream (5, 82–85) and is a well-established organizing principle in the visual system's analogous pattern-recognition pathway (86–89). In the interest of comparing the organizational properties of the auditory ventral stream with those of the visual ventral stream, we assess the colocalization of AS phenomena with CS and IR phenomena. CS and IR are examined as described earlier. AS is examined by meta-analysis of speech vs. nonspeech natural-sound paradigms.
At a deep level, both our AS and CS analyses putatively examine CS-dependent tuning for complex patterns of spectrotemporal energy. Acoustic-control sounds lack the spectrotemporal feature combinations requisite for driving combination-sensitive neurons tuned to speech sounds. For nonspeech natural sounds, the same is true, but there should also exist combination-sensitive neurons tuned to these stimuli, as they have been repeatedly encountered over development. For an effect to be observed in the AS analyses, not only must there be a population of combination-sensitive speech-tuned neurons, but these neurons must also cluster together such that a differential response is observable at the macroscopic scale of fMRI and PET.
Results
Phonetic-length-based analyses of CS studies (i.e., speech sounds vs. acoustic control sounds) were performed twice. In the first analyses, tonal control stimuli were excluded on grounds that they do not sufficiently match the spectrotemporal energy distribution of speech. That is, for a strict test of CS, we required acoustic control stimuli to model low-level properties of speech (i.e., contain spectrotemporal features coarsely similar to speech), not merely to drive primary and secondary auditory cortex. Under this preparation, spatial concordance was greatest in STG/STS across each phonetic length-based analysis (Table 1). Within STG/STS, results were left-biased across peak ALE-statistic value, cluster volume, and the percentage of studies reporting foci within a given cluster, hereafter “cluster concordance.” The predicted differential localization for phoneme- and word-length processing was confirmed, with phoneme-length effects most strongly associated with left mid-STG and word-length effects with left anterior STG (Fig. 1 and SI Appendix, Fig. S2). Phrase-length studies showed a similar leftward processing bias. Further, peak processing for phrase-length stimuli localized to a site anterior and subjacent to that of word-length stimuli, suggesting a processing gradient for phonetic stimuli that progresses from mid-STG to anterior STG and then into STS. Although individual studies report foci for left frontal cortex in each of the length-based cohorts, only in the phrase-length analysis do focus densities reach statistical significance.
Table 1.
Results for phonetic length-based analyses
| Analysis/anatomy | BA | Cluster concordance | Volume, mm³ | Center of mass (x, y, z) | Peak coordinates (x, y, z) | Peak ALE |
| --- | --- | --- | --- | --- | --- | --- |
| CS | | | | | | |
| Phoneme length | | | | | | |
| Left STG | 42/22 | 0.93 | 3,624 | −57, −25, 1 | −58, −20, 2 | 0.028 |
| Right STG/RT | 42/22 | 0.21 | 512 | 56, −11, −2 | 54, −2, 2 | 0.015 |
| Word length | | | | | | |
| Left STG | 42/22 | 0.56 | 2,728 | −57, −17, −1 | −56, −16, −2 | 0.021 |
| Right STG | 22 | 0.13 | 192 | 55, −17, 0 | 56, −16, 0 | 0.014 |
| Phrase length | | | | | | |
| Left STS | 21 | 0.58 | 2,992 | −56, −8, −8 | −56, −8, −8 | 0.038 |
| Left STS | 21 | 0.42 | 1,456 | −52, 7, −16 | −52, 8, −16 | 0.035 |
| Right STS | 21 | 0.32 | 2,264 | 54, −3, −9 | 56, −6, −6 | 0.032 |
| Left STS | 22 | 0.32 | 840 | −54, −35, 1 | −54, −34, 0 | 0.028 |
| Left PreCG | 6 | 0.32 | 664 | −47, −7, 47 | −48, −8, 48 | 0.025 |
| Left IFG | 47 | 0.21 | 456 | −42, 25, −12 | −42, 24, −12 | 0.021 |
| Left IFG | 44 | 0.16 | 200 | −48, 11, 20 | −48, 10, 20 | 0.020 |
| RS | | | | | | |
| Phoneme length | | | | | | |
| Left STG | 42/22 | 0.33 | 640 | −58, −21, 4 | −58, −20, 4 | 0.018 |
| Word length | | | | | | |
| Left STG | 42/22 | 0.50 | 1,408 | −56, −9, −3 | −56, −10, −4 | 0.027 |
| Left STG | 42/22 | 0.19 | 288 | −58, −28, 2 | −58, −28, 2 | 0.017 |
BA, Brodmann area; IFG, inferior frontal gyrus; PreCG, precentral gyrus; RT, rostrotemporal area.
Fig. 1.
Foci meeting inclusion criteria for length-based CS analyses (A–C) and ALE-statistic maps for regions of significant concordance (D–F) (p < 10⁻³, k > 150 mm³). Analyses show a leftward bias and an anterior progression in peak effects, with phoneme-length studies showing greatest concordance in left mid-STG (A and D; n = 14), word-length studies showing greatest concordance in left anterior STG (B and E; n = 16), and phrase-length studies showing greatest concordance in left anterior STS (C and F; n = 19). Sample size is given with respect to the number of contrasts from independent experiments contributing to an analysis.
Second, to increase sample size and enable lexical status-based subanalyses, we included studies that used tonal control stimuli. Under this preparation, the same overall pattern of results was observed, with one exception: the addition of a pair of clusters in left ventral prefrontal cortex for the word-length analysis (SI Appendix, Fig. S3 and Table S1). Next, we further subdivided word-length studies according to lexical status: real word or pseudoword. A divergent pattern of concordance was observed in left STG (Fig. 2 and SI Appendix, Fig. S4 and Table S1). Peak processing for real-word stimuli robustly localized to anterior STG. For pseudoword stimuli, a bimodal distribution was observed, peaking in both mid- and anterior STG and coextensive with the real-word cluster.
Fig. 2.
Foci meeting liberal inclusion criteria for lexically based word-length CS analyses (A and B) and ALE-statistic maps for regions of significant concordance (C and D) (p < 10⁻³, k > 150 mm³). Similar to the CS analyses in Fig. 1, a leftward bias and an anterior progression in peak effects are shown. Pseudoword studies show greatest concordance in left mid- to anterior STG (A and C; n = 13). Notably, the distribution of concordance effects is bimodal, peaking both in mid- (−60, −26, 6) and anterior (−56, −10, 2) STG. Real-word studies show greatest concordance in left anterior STG (B and D; n = 22).
Third, to assess the robustness of the predicted STG stimulus-length processing gradient, length-based analyses were performed on foci from RS studies. For both phoneme- and word-length stimuli, concordant foci were observed to be strictly left-lateralized and exclusively within STG (Table 1). The predicted processing gradient was also observed. Peak concordance for phoneme-length stimuli was seen in mid-STG, whereas peak concordance for word-length stimuli was seen in anterior STG (Fig. 3 and SI Appendix, Fig. S5). For the word-length analysis, a secondary cluster was observed in mid-STG. This may reflect repetition effects concurrently observed for phoneme-level representation or, as the site is somewhat inferior to that of phoneme-length effects, it may be tentative evidence of a secondary processing pathway within the ventral stream (63, 90).
Fig. 3.
Foci meeting inclusion criteria for length-based RS analyses (A and B) and ALE-statistic maps for regions of significant concordance (C and D) (p < 10⁻³, k > 150 mm³). Analyses show left lateralization and an anterior progression in peak effects, with phoneme-length studies showing greatest concordance in left mid-STG (A and C; n = 12) and word-length studies showing greatest concordance in left anterior STG (B and D; n = 16). Too few studies exist for phrase-length analyses (n = 4).
Fourth, to assess colocalization of CS, IR, and AS, we performed length-pooled analyses (Fig. 4, Table 2, and SI Appendix, Fig. S6). Robust CS effects were observed in STG/STS. Again, they were left-biased across peak ALE-statistic value, cluster volume, and cluster concordance. Significant concordance was also found in left frontal cortex. A single result was observed in the IR analysis, localizing to left mid- to anterior STG. This cluster was entirely coextensive with the primary left-STG CS cluster. Finally, analysis of AS foci found concordance in STG/STS. It was also left-biased in peak ALE-statistic value, cluster volume, and cluster concordance. Further, a left-lateralized ventral prefrontal result was observed. The principal left STG/STS cluster was coextensive with the region of overlap between the CS and IR analyses. Within superior temporal cortex, the AS analysis was also generally coextensive with the CS analysis. In left ventral prefrontal cortex, the AS and CS results were not coextensive but were nonetheless similarly localized. Fig. 5 shows exact regions of overlap across length-based and pooled analyses.
Fig. 4.
Foci meeting inclusion criteria for length-pooled analyses (A–C) and ALE-statistic maps for regions of significant concordance (D–F) (p < 10⁻³, k > 150 mm³). Analyses show a leftward bias in the CS (A and D; n = 49) and AS (C and F; n = 15) analyses and left lateralization in the IR (B and E; n = 11) analysis. Foci are color coded by stimulus length: phoneme length, red; word length, green; and phrase length, blue.
Table 2.
Results for aggregate analyses
| Analysis/anatomy | BA | Cluster concordance | Volume, mm³ | Center of mass (x, y, z) | Peak coordinates (x, y, z) | Peak ALE |
| --- | --- | --- | --- | --- | --- | --- |
| CS | | | | | | |
| Left STG | 42/22 | 0.82 | 11,944 | −57, −19, −1 | −58, −18, 0 | 0.056 |
| Right STG | 42/22 | 0.47 | 6,624 | 55, −10, −3 | 56, −6, −6 | 0.045 |
| Left STS | 21 | 0.18 | 1,608 | −51, 8, −14 | −50, 8, −14 | 0.039 |
| Left PreCG | 6 | 0.12 | 736 | −47, −7, 48 | −48, −8, 48 | 0.031 |
| Left IFG | 44 | 0.10 | 744 | −45, 12, 21 | −46, 12, 20 | 0.025 |
| Left IFG | 47 | 0.08 | 240 | −42, 25, −12 | −42, 24, −12 | 0.022 |
| Left IFG | 45 | 0.04 | 200 | −50, 21, 12 | −50, 22, 12 | 0.020 |
| IR* | | | | | | |
| Left STG | 22/21 | 0.45 | 1,200 | −58, −16, −1 | −56, −14, −2 | 0.020 |
| AS | | | | | | |
| Left STG | 42/22 | 0.87 | 3,976 | −58, −22, 2 | −58, −24, 2 | 0.031 |
| Right STG | 42/22 | 0.53 | 2,032 | 51, −23, 2 | 54, −16, 0 | 0.026 |
| Left IFG | 47/45 | 0.13 | 368 | −45, 17, 3 | −44, 18, 2 | 0.018 |
*Broader inclusion criteria for the IR analysis (SI Appendix, Table S3) yield equivalent results with the following qualifications: cluster volume 1,008 mm³ and cluster concordance 0.33.
Fig. 5.
Flat-map presentation of ALE cluster overlap for (A) the CS analyses shown in Fig. 1, (B) the word-length lexical status analyses shown in Fig. 2, (C) the RS analyses shown in Fig. 3, and (D) the length-pooled analyses shown in Fig. 4. For orientation, prominent landmarks are shown on the left hemisphere of A, including the circular sulcus (CirS), central sulcus (CS), STG, and STS.
Discussion
Meta-analysis of speech processing shows a left-hemisphere optimization for speech and an anterior-directed processing gradient. Two novel findings are presented. First, a dissociation is observed for the processing of phonemes, words, and phrases: elemental phonetic processing is most strongly associated with mid-STG; auditory word-form processing is most strongly associated with anterior STG; and phrasal processing is most strongly associated with anterior STS. Second, evidence for CS, IR, and AS colocalizes in mid- to anterior STG. Each finding supports the presence of an anterior-directed ventral-stream pattern-recognition pathway. This is in agreement with Leaver and Rauschecker (2010), who tested colocalization of AS and IR in a single sample using phoneme-length stimuli (5). Recent meta-analyses that considered related themes affirm aspects of the present work. In a study that collapsed across phoneme and pseudoword processing, Turkeltaub and Coslett (2010) localized sublexical processing to mid-STG (91). This is consistent with our more specific localization of elemental phonetic processing. Samson and colleagues (2011), examining preferential tuning for speech over music, report peak concordance in left anterior STG/STS (92), consistent with our more general areal-specialization analysis. Finally, our results support Binder and colleagues’ (2000) anterior-directed, hierarchical account of word recognition (47) and Cohen and colleagues’ (2004) hypothesis of an auditory word-form area in left anterior STG (78).
Classically, auditory word-form recognition was thought to localize to posterior STG/STS (93). This perspective may have been biased by the spatial distribution of middle cerebral artery accidents: the artery's diameter decreases along the Sylvian fissure, possibly increasing the prevalence of posterior infarcts. Current methods in aphasia research are better controlled and more precise; they implicate mid- and anterior temporal regions in speech comprehension, including anterior STG (94, 95). Although evidence for an anterior STG/STS localization of auditory word-form processing has been present in the functional imaging literature since its inception (96–99), perspectives advancing this view have been controversial, and the localization is still not uniformly accepted. We find strong agreement among word-processing experiments, both within and across paradigms, each supporting relocation of auditory word-form recognition to anterior STG. By considering phoneme- and phrase-processing experiments, we show the identified anterior-STG word-form recognition site to be situated between sites robustly associated with phoneme and phrase processing, which comports with hierarchical processing and further supports an anterior-STG localization for auditory word-form recognition.
It is important to note that some authors define “posterior” STG as the region posterior to the anterior-lateral aspect of Heschl's gyrus or to the central sulcus. These definitions include the region we discuss as “mid-STG,” the area lateral to Heschl's gyrus. We differentiate mid- from posterior STG on the basis of proximity to primary auditory cortex and the putative course of the ventral stream. As human core auditory fields lie along or about Heschl's gyrus (13, 57–59, 100), the ventral stream's course can be inferred to traverse portions of planum temporale. Specifically, the ventral stream is associated with macaque areas RTp and AL (54–56), which lie anterior and lateral to A1 (13). As human A1 lies on or about the medial aspect of Heschl's gyrus, with core running along its extent (57, 100), a processing cascade emanating from core areas, progressing both laterally, away from core itself, and anteriorly, away from A1, will necessarily traverse the anterior-lateral portion of planum temporale. Further, this implies that mid-STG is the initial STG waypoint of the ventral stream.
Nominal issues aside, support for a posterior localization could be attributed to a constellation of effects pertaining to aspects of speech or phonology that localize to posterior STG/STS (69), for instance: speech production (101–108), phonological/articulatory working memory (109, 110), reading (111–113) [putatively attributable to orthography-to-phonology translation (114–116)], and aspects of audiovisual language processing (117–122). Although these findings relate to aspects of speech and phonology, they do so in terms of multisensory processing and sensorimotor integration and are not the key paradigms indicated by computational theory for demonstrating the presence of pattern recognition networks (8–12, 123). Those paradigms (CS and adaptation), systematically meta-analyzed here, find anterior localization.
The segregation of phoneme and word-form processing along STG implies a growing encoding specificity for complex phonetic forms in higher-order ventral-stream areas. More specifically, it suggests the presence of a hierarchical network performing phonetic concatenation at a site anatomically distinct from, and downstream of, the site performing elemental phonetic recognition. Alternatively, the phonetic-length effect could be attributed to a semantic confound: semantic content increases from phonemes to word forms. In an elegant experiment, Thierry and colleagues (2003) report evidence against this (82). After controlling for acoustics, they show that left anterior STG responds more to speech than to semantically matched environmental sounds. Similarly, Belin and colleagues (2000, 2002), after controlling for acoustics, show that left anterior STG is not merely responding to the vocal quality of phonetic sounds; rather, it responds preferentially to the phonetic quality of vocal sounds (83, 84).
Additional comment on the localization and laterality of auditory word and pseudoword processing, as well as on processing gradients in STG/STS, is provided in SI Appendix, Discussion.
The auditory ventral stream is proposed to use CS to conjoin lower-order representations and thereby to synthesize complex representations. As the tuning of higher-order combination-sensitive units is contingent upon sensory experience (124, 125), phrases and sentences would not generally be processed as Gestalt-like objects. Although we have analyzed studies involving phrase- and sentence-level processing, their inclusion is for context and because word-form recognition is a constituent part of sentence processing. In some instances, however, phrases are processed as objects (126). This status is occasionally recognized in orthography (e.g., “nonetheless”). Such phrases ought to be recognized by the ventral-stream network. This, however, would be the exception, not the rule. Hypothetically, the opposite may also occur: a word form's length might exceed the network's integrative capacity (e.g., “antidisestablishmentarianism”). We speculate the network is capable of concatenating sequences of at least five to eight phonemes: five to six phonemes is the modal length of English word forms and seven- to eight-phoneme-long word forms comprise nearly one fourth of English words (SI Appendix, Fig. S7 and Discussion). This estimate is also consistent with the time constant of echoic memory (∼2 s). (Notably, there is a similar issue concerning the processing of text in the visual system's ventral stream, where, for longer words, fovea-width representations must be “temporally” conjoined across microsaccades.) Although some phrases may be recognized in the word-form recognition network, the majority of STS activation associated with phrase-length stimuli (Fig. 1F) is likely related to aspects of syntax and semantics. This observation enables us to subdivide the intelligibility network, broadly defined by Scott and colleagues (2000) (127). The first two stages involve elemental and concatenative phonetic recognition, extending from mid-STG to anterior STG and, possibly, into subjacent STS. Higher-order syntactic and semantic processing is conducted throughout STS and continues into prefrontal cortex (128–133).
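The phoneme-length statistics cited above can be approximated from any machine-readable pronouncing dictionary. Below is a minimal sketch using NLTK's CMU Pronouncing Dictionary (an assumption on our part: the SI analysis does not necessarily use this lexicon, and type-based counts ignore word frequency):

```python
from collections import Counter

import nltk
from nltk.corpus import cmudict

nltk.download("cmudict", quiet=True)  # one-time corpus fetch

# Phonemes per word type, using each word's first listed pronunciation.
lengths = Counter(len(prons[0]) for prons in cmudict.dict().values())
total = sum(lengths.values())

modal = lengths.most_common(1)[0][0]
print("modal length:", modal, "phonemes")
print("5-6 phonemes:", f"{(lengths[5] + lengths[6]) / total:.1%}")
print("7-8 phonemes:", f"{(lengths[7] + lengths[8]) / total:.1%}")
```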
A qualification to the propositions advanced here for word-form recognition is that this account pertains to perceptually fluent speech recognition (e.g., native language conversational discourse). Both left ventral and dorsal networks likely mediate nonfluent speech recognition (e.g., when processing neologisms or recently acquired words in a second language). Whereas ventral networks are implicated in pattern recognition, dorsal networks are implicated in forward- and inverse-model computation (42, 44), including sensorimotor integration (42, 45, 48, 134). This supports a role for left dorsal networks in mapping auditory representations onto the somatomotor frame of reference (135–139), yielding articulator-encoded speech. This ventral–dorsal dissociation is illustrated in an experiment by Buchsbaum and colleagues (2005) (110). Using a verbal working memory task, they demonstrated the time course of left anterior STG/STS activation to be consistent with strictly auditory encoding: activation was locked to auditory stimulation and it was not sustained throughout the late phase of item rehearsal. In contrast, they observed the activation time course in the dorsal stream to be modality independent and to coincide with late-phase rehearsal (i.e., it was associated with verbal rehearsal independent of input modality, auditory or visual). Importantly, late-phase rehearsal can be demonstrated behaviorally, by articulatory suppression, to be mediated by subvocalization (i.e., articulatory rehearsal in the phonological loop) (140).
There are some notable differences between auditory and visual word recognition. Spoken language was intensely selected for during evolution (141), whereas reading is a recent cultural innovation (111). The age of acquisition of phoneme representation is in the first year of life (124), whereas it is typically in the third year for letters. A similar developmental lag is present with respect to acquisition of the visual lexicon. Differences aside, word recognition in each modality requires similar processing, including the concatenation of elemental forms, phonemes or letters, into sublexical forms and word forms. If the analogy between auditory and visual ventral streams is correct, our results predict a similar anatomical dissociation for elemental and concatenative representation in the visual ventral stream. This prediction is also made by models of text processing (10). Although we are aware of no study that has investigated letter and word recognition in a single sample, support for the dissociation is present in the literature. The visual word-form area, the putative site of visual word-form recognition (142), is located in the left fusiform gyrus of inferior temporal cortex (IT) (143). Consistent with expectation, the average site of peak activation to single letters in IT (144–150) is more proximal to V1, by approximately 13 mm. A similar anatomical dissociation can be seen in paradigms probing IR. Ordinarily, nonhuman primate IT neurons exhibit a degree of mirror-symmetric invariant tuning (151). Letter recognition, however, requires nonmirror IR (e.g., to distinguish “b” from “d”). When assessing identity-specific RS (i.e., repetition effects specific to non–mirror-inverted repetitions), letter and word effects differentially localize: effects for word stimuli localize to the visual word-form area (152), whereas effects for single-letter stimuli localize to the lateral occipital complex (153), a site closer to V1. Thus, the anatomical dissociation observed in auditory cortex for phonemes and words appears to reflect a general hierarchical processing architecture also present in other sensory cortices.
In conclusion, our analyses show the human functional imaging literature to support a hierarchical model of object recognition in auditory cortex, consistent with nonhuman primate electrophysiology. Specifically, our results support a left-biased, two-stage model of auditory word-form recognition with analysis of phonemes occurring in mid-STG and word recognition occurring in anterior STG. A third stage extends the model to phrase-level processing in STS. Mechanistically, left mid- to anterior STG exhibits core qualities of a pattern recognition network, including CS, IR, and AS.
Methods
To identify prospective studies for inclusion, a systematic search of the PubMed database was performed for variations of the query, “(phonetics OR ‘speech sounds’ OR phoneme OR ‘auditory word’) AND (MRI OR fMRI OR PET).” This yielded more than 550 records (as of February 2011). These studies were screened for compliance with formal inclusion criteria: (i) the publication of stereotaxic coordinates for group-wise fMRI or PET results in a peer-reviewed journal and (ii) report of a contrast of interest (as detailed later). Exclusion criteria were the use of pediatric or clinical samples. Inclusion/exclusion criteria admitted 115 studies. For studies reporting multiple suitable contrasts per sample, to avoid sampling bias, a single contrast was selected. For CS analyses, contrasts of interest compared activation to speech stimuli (i.e., phonemes/syllables, words/pseudowords, and phrases/sentences/pseudoword sentences) with activation to matched, nonnaturalistic acoustic control stimuli (i.e., various tonal, noise, and complex artificial nonspeech stimuli). A total of 84 eligible contrasts were identified, representing 1,211 subjects and 541 foci. For RS analyses, contrasts compared activation to repeated and nonrepeated speech stimuli. A total of 31 eligible contrasts were identified, representing 471 subjects and 145 foci. For IR analyses, a subset of the RS cohort was selected that used designs in which “repeated” stimuli also varied acoustically but not phonetically (e.g., two different utterances of the same word). The RS cohort was used for phonetic length-based analyses as the more restrictive criteria for IR yielded insufficient sample sizes (as detailed later). For AS analyses, contrasts compared activation to speech stimuli and to other naturalistic stimuli (e.g., animal calls, music, tool sounds). A total of 17 eligible contrasts were identified, representing 239 subjects and 100 foci. All retained contrasts were binned for phonetic length-based analyses according to the estimated mean number of phonemes in their stimuli: (i) “phoneme length,” one or two phonemes, (ii) “word length,” three to 10 phonemes, and (iii) “phrase length,” more than 10 phonemes. SI Appendix, Tables S2–S4, identify the contrasts included in each analysis.
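As a concrete restatement, the length-binning rule reduces to a small helper function; a sketch (thresholds taken from the criteria above, function name ours):

```python
def length_bin(mean_phoneme_count: float) -> str:
    """Assign a contrast to a length-based cohort by the estimated mean
    number of phonemes in its stimuli (criteria from the text)."""
    if mean_phoneme_count <= 2:
        return "phoneme length"   # phonemes, CV syllables
    if mean_phoneme_count <= 10:
        return "word length"      # words and pseudowords
    return "phrase length"        # phrases and sentences

assert length_bin(2) == "phoneme length"
assert length_bin(6) == "word length"
assert length_bin(14) == "phrase length"
```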
The minimum sample size for meta-analyses was 10 independent contrasts. Foci reported in Montreal Neurological Institute coordinates were transformed into Talairach coordinates according to the ICBM2TAL transformation (154). Concordance of foci was assessed by the method of ALE (81) in a random-effects implementation (155) that controls for within-experiment effects (156). Under ALE, foci are treated as Gaussian probability distributions, which reflect localization uncertainty. Pooled Gaussian focus maps were tested against a null distribution reflecting a random spatial association between different experiments. Correction for multiple comparisons was obtained through estimation of the false discovery rate (157). Two significance criteria were used: the minimum p value was set at 10⁻³, and the minimum cluster extent was set at 150 mm³. Analyses were conducted in GingerALE (Research Imaging Institute), AFNI (National Institute of Mental Health), and MATLAB (MathWorks). For visualization, Caret (Washington University in St. Louis) was used to project foci and ALE clusters from volumetric space onto the cortical surface of the Population-Average, Landmark- and Surface-based atlas (158). Readers should note that this procedure can introduce slight localization artifacts (e.g., projection may distribute one volumetric cluster discontinuously over two adjacent gyri).
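For readers unfamiliar with ALE, the core computation can be sketched as follows: each focus is modeled as a 3D Gaussian, each experiment contributes a modeled-activation map, and the ALE statistic at a voxel is the probability that at least one experiment activates it. This toy version (ours; fixed kernel width, no empirically derived uncertainty, no permutation null) is a schematic of the cited implementation, not a substitute for GingerALE:

```python
import numpy as np

FWHM_MM = 10.0  # illustrative kernel width; real ALE derives it per study

def modeled_activation(foci_mm, grid_mm):
    """Modeled-activation (MA) map for one experiment: each focus becomes
    a 3D Gaussian encoding localization uncertainty; foci within the same
    experiment are combined by voxelwise max, limiting the influence of
    any single experiment (within-experiment control)."""
    sigma = FWHM_MM / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    ma = np.zeros(grid_mm.shape[:-1])
    for focus in foci_mm:
        d2 = np.sum((grid_mm - focus) ** 2, axis=-1)
        ma = np.maximum(ma, np.exp(-d2 / (2.0 * sigma ** 2)))
    return ma

def ale_map(experiments, grid_mm):
    """ALE statistic: voxelwise probability that at least one experiment
    reports activation, i.e., the union 1 - prod_i(1 - MA_i)."""
    p_none = np.ones(grid_mm.shape[:-1])
    for foci in experiments:
        p_none *= 1.0 - modeled_activation(foci, grid_mm)
    return 1.0 - p_none

# Toy grid: a 2-mm lattice over a left-temporal slab (Talairach mm).
xs = np.arange(-70.0, -40.0, 2.0)
ys = np.arange(-40.0, 10.0, 2.0)
zs = np.arange(-6.0, 8.0, 2.0)
grid = np.stack(np.meshgrid(xs, ys, zs, indexing="ij"), axis=-1)

# Two experiments reporting nearby left mid-STG foci yield high ALE there;
# the real analysis then thresholds against a random-association null
# (p < 10^-3, FDR-corrected, cluster extent > 150 mm^3).
experiments = [[np.array([-58.0, -20.0, 2.0])],
               [np.array([-56.0, -16.0, 2.0])]]
print(ale_map(experiments, grid).max())
```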
Acknowledgments
We thank Max Riesenhuber, Marc Ettlinger, and two anonymous reviewers for comments helpful to the development of this manuscript. This work was supported by National Science Foundation Grants BCS-0519127 and OISE-0730255 (to J.P.R.) and National Institute on Deafness and Other Communication Disorders Grant 1RC1DC010720 (to J.P.R.).
Footnotes
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
See Author Summary on page 2709 (volume 109, number 8).
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1113427109/-/DCSupplemental.
References
- 1.Steinschneider M. Unlocking the role of the superior temporal gyrus for speech sound categorization. J Neurophysiol. 2011;105:2631–2633. doi: 10.1152/jn.00238.2011. [DOI] [PubMed] [Google Scholar]
- 2.Brugge JF, Merzenich MM. Responses of neurons in auditory cortex of the macaque monkey to monaural and binaural stimulation. J Neurophysiol. 1973;36:1138–1158. doi: 10.1152/jn.1973.36.6.1138. [DOI] [PubMed] [Google Scholar]
- 3.Bitterman Y, Mukamel R, Malach R, Fried I, Nelken I. Ultra-fine frequency tuning revealed in single neurons of human auditory cortex. Nature. 2008;451:197–201. doi: 10.1038/nature06476. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Griffiths TD, Warren JD. What is an auditory object? Nat Rev Neurosci. 2004;5:887–892. doi: 10.1038/nrn1538. [DOI] [PubMed] [Google Scholar]
- 5.Leaver AM, Rauschecker JP. Cortical representation of natural complex sounds: effects of acoustic features and auditory object category. J Neurosci. 2010;30:7604–7612. doi: 10.1523/JNEUROSCI.0296-10.2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Luce P, McLennan C. Spoken word recognition: The challenge of variation. In: Pisoni D, Remez R, editors. Handbook of Speech Perception. Malden, MA: Blackwell; 2005. pp. 591–609. [Google Scholar]
- 7.Hubel DH, Wiesel TN. Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. J Physiol. 1962;160:106–154. doi: 10.1113/jphysiol.1962.sp006837. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Riesenhuber M, Poggio TA. Neural mechanisms of object recognition. Curr Opin Neurobiol. 2002;12:162–168. doi: 10.1016/s0959-4388(02)00304-5. [DOI] [PubMed] [Google Scholar]
- 9.Husain FT, Tagamets M-A, Fromm SJ, Braun AR, Horwitz B. Relating neuronal dynamics for auditory object processing to neuroimaging activity: A computational modeling and an fMRI study. Neuroimage. 2004;21:1701–1720. doi: 10.1016/j.neuroimage.2003.11.012. [DOI] [PubMed] [Google Scholar]
- 10.Dehaene S, Cohen L, Sigman M, Vinckier F. The neural code for written words: a proposal. Trends Cogn Sci. 2005;9:335–341. doi: 10.1016/j.tics.2005.05.004. [DOI] [PubMed] [Google Scholar]
- 11.Hoffman KL, Logothetis NK. Cortical mechanisms of sensory learning and object recognition. Philos Trans R Soc Lond B Biol Sci. 2009;364:321–329. doi: 10.1098/rstb.2008.0271. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Larson E, Billimoria CP, Sen K. A biologically plausible computational model for auditory object recognition. J Neurophysiol. 2009;101:323–331. doi: 10.1152/jn.90664.2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Hackett TA. Information flow in the auditory cortical network. Hear Res. 2011;271:133–146. doi: 10.1016/j.heares.2010.01.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Suga N, O'Neill WE, Manabe T. Cortical neurons sensitive to combinations of information-bearing elements of biosonar signals in the mustache bat. Science. 1978;200:778–781. doi: 10.1126/science.644320. [DOI] [PubMed] [Google Scholar]
- 15.Margoliash D, Fortune ES. Temporal and harmonic combination-sensitive neurons in the zebra finch's HVc. J Neurosci. 1992;12:4309–4326. doi: 10.1523/JNEUROSCI.12-11-04309.1992. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Rauschecker JP, Tian B, Hauser M. Processing of complex sounds in the macaque nonprimary auditory cortex. Science. 1995;268:111–114. doi: 10.1126/science.7701330. [DOI] [PubMed] [Google Scholar]
- 17.Rauschecker JP. Processing of complex sounds in the auditory cortex of cat, monkey, and man. Acta Otolaryngol Suppl. 1997;532:34–38. doi: 10.3109/00016489709126142. [DOI] [PubMed] [Google Scholar]
- 18.Rauschecker JP. Parallel processing in the auditory cortex of primates. Audiol Neurootol. 1998;3:86–103. doi: 10.1159/000013784. [DOI] [PubMed] [Google Scholar]
- 19.Sadagopan S, Wang X. Nonlinear spectrotemporal interactions underlying selectivity for complex sounds in auditory cortex. J Neurosci. 2009;29:11192–11202. doi: 10.1523/JNEUROSCI.1286-09.2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Medvedev AV, Chiao F, Kanwal JS. Modeling complex tone perception: grouping harmonics with combination-sensitive neurons. Biol Cybern. 2002;86:497–505. doi: 10.1007/s00422-002-0316-3. [DOI] [PubMed] [Google Scholar]
- 21.Willmore BDB, King AJ. Auditory cortex: representation through sparsification? Curr Biol. 2009;19:1123–1125. doi: 10.1016/j.cub.2009.11.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Voytenko SV, Galazyuk AV. Intracellular recording reveals temporal integration in inferior colliculus neurons of awake bats. J Neurophysiol. 2007;97:1368–1378. doi: 10.1152/jn.00976.2006. [DOI] [PubMed] [Google Scholar]
- 23.Peterson DC, Voytenko S, Gans D, Galazyuk A, Wenstrup J. Intracellular recordings from combination-sensitive neurons in the inferior colliculus. J Neurophysiol. 2008;100:629–645. doi: 10.1152/jn.90390.2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Ye CQ, Poo MM, Dan Y, Zhang XH. Synaptic mechanisms of direction selectivity in primary auditory cortex. J Neurosci. 2010;30:1861–1868. doi: 10.1523/JNEUROSCI.3088-09.2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Rao RP, Sejnowski TJ. Predictive sequence learning in recurrent neocortical circuits. In: Solla SA, Leen TK, Muller KR, editors. Advances in Neural Information Processing Systems, Vol 12. Cambridge: MIT Press; 2000. [Google Scholar]
- 26.Carr CE, Konishi M. Axonal delay lines for time measurement in the owl's brainstem. Proc Natl Acad Sci USA. 1988;85:8311–8315. doi: 10.1073/pnas.85.21.8311. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Tian B, Rauschecker JP. Processing of frequency-modulated sounds in the lateral auditory belt cortex of the rhesus monkey. J Neurophysiol. 2004;92:2993–3013. doi: 10.1152/jn.00472.2003. [DOI] [PubMed] [Google Scholar]
- 28.Fukushima K. Neocognitron: A self organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol Cybern. 1980;36:193–202. doi: 10.1007/BF00344251. [DOI] [PubMed] [Google Scholar]
- 29.Riesenhuber M, Poggio TA. Hierarchical models of object recognition in cortex. Nat Neurosci. 1999;2:1019–1025. doi: 10.1038/14819. [DOI] [PubMed] [Google Scholar]
- 30.Kouh M, Poggio TA. A canonical neural circuit for cortical nonlinear operations. Neural Comput. 2008;20:1427–1451. doi: 10.1162/neco.2008.02-07-466. [DOI] [PubMed] [Google Scholar]
- 31.Lampl I, Ferster D, Poggio T, Riesenhuber M. Intracellular measurements of spatial integration and the MAX operation in complex cells of the cat primary visual cortex. J Neurophysiol. 2004;92:2704–2713. doi: 10.1152/jn.00060.2004. [DOI] [PubMed] [Google Scholar]
- 32.Finn IM, Ferster D. Computational diversity in complex cells of cat primary visual cortex. J Neurosci. 2007;27:9638–9648. doi: 10.1523/JNEUROSCI.2119-07.2007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Bendor D, Wang X. Differential neural coding of acoustic flutter within primate auditory cortex. Nat Neurosci. 2007;10:763–771. doi: 10.1038/nn1888. [DOI] [PubMed] [Google Scholar]
- 34.Bendor D, Wang X. Neural response properties of primary, rostral, and rostrotemporal core fields in the auditory cortex of marmoset monkeys. J Neurophysiol. 2008;100:888–906. doi: 10.1152/jn.00884.2007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Atencio CA, Sharpee TO, Schreiner CE. Cooperative nonlinearities in auditory cortical neurons. Neuron. 2008;58:956–966. doi: 10.1016/j.neuron.2008.04.026. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Roe AW, Pallas SL, Kwon YH, Sur M. Visual projections routed to the auditory pathway in ferrets: receptive fields of visual neurons in primary auditory cortex. J Neurosci. 1992;12:3651–3664. doi: 10.1523/JNEUROSCI.12-09-03651.1992. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Atencio CA, Sharpee TO, Schreiner CE. Hierarchical computation in the canonical auditory cortical circuit. Proc Natl Acad Sci USA. 2009;106:21894–21899. doi: 10.1073/pnas.0908383106. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Ahmed B, Garcia-Lazaro JA, Schnupp JWH. Response linearity in primary auditory cortex of the ferret. J Physiol. 2006;572:763–773. doi: 10.1113/jphysiol.2005.104380. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Rauschecker JP, Tian B. Mechanisms and streams for processing of “what” and “where” in auditory cortex. Proc Natl Acad Sci USA. 2000;97:11800–11806. doi: 10.1073/pnas.97.22.11800. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Romanski LM, et al. Dual streams of auditory afferents target multiple domains in the primate prefrontal cortex. Nat Neurosci. 1999;2:1131–1136. doi: 10.1038/16056. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Kaas JH, Hackett TA. ‘What’ and ‘where’ processing in auditory cortex. Nat Neurosci. 1999;2:1045–1047. doi: 10.1038/15967. [DOI] [PubMed] [Google Scholar]
- 42.Rauschecker JP, Scott SK. Maps and streams in the auditory cortex: nonhuman primates illuminate human speech processing. Nat Neurosci. 2009;12:718–724. doi: 10.1038/nn.2331. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Romanski LM, Averbeck BB. The primate cortical auditory system and neural representation of conspecific vocalizations. Annu Rev Neurosci. 2009;32:315–346. doi: 10.1146/annurev.neuro.051508.135431. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Rauschecker JP. An expanded role for the dorsal auditory pathway in sensorimotor control and integration. Hear Res. 2011;271:16–25. doi: 10.1016/j.heares.2010.09.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Hickok G, Poeppel D. The cortical organization of speech processing. Nat Rev Neurosci. 2007;8:393–402. doi: 10.1038/nrn2113. [DOI] [PubMed] [Google Scholar]
- 46.Scott SK, Wise RJS. The functional neuroanatomy of prelexical processing in speech perception. Cognition. 2004;92:13–45. doi: 10.1016/j.cognition.2002.12.002. [DOI] [PubMed] [Google Scholar]
- 47.Binder JR, et al. Human temporal lobe activation by speech and nonspeech sounds. Cereb Cortex. 2000;10:512–528. doi: 10.1093/cercor/10.5.512. [DOI] [PubMed] [Google Scholar]
- 48.Wise RJ, et al. Separate neural subsystems within ‘Wernicke's area’. Brain. 2001;124:83–95. doi: 10.1093/brain/124.1.83. [DOI] [PubMed] [Google Scholar]
- 49.Patterson RD, Johnsrude IS. Functional imaging of the auditory processing applied to speech sounds. Philos Trans R Soc Lond B Biol Sci. 2008;363:1023–1035. doi: 10.1098/rstb.2007.2157. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Weiller C, Bormann T, Saur D, Musso M, Rijntjes M. How the ventral pathway got lost: and what its recovery might mean. Brain Lang. 2011;118:29–39. doi: 10.1016/j.bandl.2011.01.005. [DOI] [PubMed] [Google Scholar]
- 51.Whalen DH, et al. Differentiation of speech and nonspeech processing within primary auditory cortex. J Acoust Soc Am. 2006;119:575–581. doi: 10.1121/1.2139627. [DOI] [PubMed] [Google Scholar]
- 52.Nelken I. Processing of complex sounds in the auditory system. Curr Opin Neurobiol. 2008;18:413–417. doi: 10.1016/j.conb.2008.08.014. [DOI] [PubMed] [Google Scholar]
- 53.Recanzone GH, Cohen YE. Serial and parallel processing in the primate auditory cortex revisited. Behav Brain Res. 2010;206:1–7. doi: 10.1016/j.bbr.2009.08.015. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Tian B, Reser D, Durham A, Kustov A, Rauschecker JP. Functional specialization in rhesus monkey auditory cortex. Science. 2001;292:290–293. doi: 10.1126/science.1058911. [DOI] [PubMed] [Google Scholar]
- 55.Kikuchi Y, Horwitz B, Mishkin M. Hierarchical auditory processing directed rostrally along the monkey's supratemporal plane. J Neurosci. 2010;30:13021–13030. doi: 10.1523/JNEUROSCI.2267-10.2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Tsunada J, Lee JH, Cohen YE. Representation of speech categories in the primate auditory cortex. J Neurophysiol. 2011;105:2634–2646. doi: 10.1152/jn.00037.2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Galaburda AM, Sanides F. Cytoarchitectonic organization of the human auditory cortex. J Comp Neurol. 1980;190:597–610. doi: 10.1002/cne.901900312. [DOI] [PubMed] [Google Scholar]
- 58.Chevillet M, Riesenhuber M, Rauschecker JP. Functional correlates of the anterolateral processing hierarchy in human auditory cortex. J Neurosci. 2011;31:9345–9352. doi: 10.1523/JNEUROSCI.1448-11.2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Glasser MF, Van Essen DC. Mapping human cortical areas in vivo based on myelin content as revealed by T1- and T2-weighted MRI. J Neurosci. 2011;31:11597–11616. doi: 10.1523/JNEUROSCI.2180-11.2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Poremba A, et al. Species-specific calls evoke asymmetric activity in the monkey's temporal poles. Nature. 2004;427:448–451. doi: 10.1038/nature02268. [DOI] [PubMed] [Google Scholar]
- 61.Chang EF, et al. Categorical speech representation in human superior temporal gyrus. Nat Neurosci. 2010;13:1428–1432. doi: 10.1038/nn.2641. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62. Chang EF, et al. Cortical spatio-temporal dynamics underlying phonological target detection in humans. J Cogn Neurosci. 2011;23:1437–1446. doi: 10.1162/jocn.2010.21466.
- 63. Steinschneider M, et al. Intracranial study of speech-elicited activity on the human posterolateral superior temporal gyrus. Cereb Cortex. 2011;21:2332–2347. doi: 10.1093/cercor/bhr014.
- 64. Edwards E, et al. Comparison of time-frequency responses and the event-related potential to auditory speech stimuli in human cortex. J Neurophysiol. 2009;102:377–386. doi: 10.1152/jn.90954.2008.
- 65. Miller EK, Li L, Desimone R. A neural mechanism for working and recognition memory in inferior temporal cortex. Science. 1991;254:1377–1379. doi: 10.1126/science.1962197.
- 66. Grill-Spector K, Malach R. fMR-adaptation: A tool for studying the functional properties of human cortical neurons. Acta Psychol (Amst). 2001;107:293–321. doi: 10.1016/s0001-6918(01)00019-1.
- 67. Joanisse MF, Zevin JD, McCandliss BD. Brain mechanisms implicated in the preattentive categorization of speech sounds revealed using fMRI and a short-interval habituation trial paradigm. Cereb Cortex. 2007;17:2084–2093. doi: 10.1093/cercor/bhl124.
- 68. Scott BH, Malone BJ, Semple MN. Transformation of temporal processing across auditory cortex of awake macaques. J Neurophysiol. 2011;105:712–730. doi: 10.1152/jn.01120.2009.
- 69. Kusmierek P, Rauschecker JP. Functional specialization of medial auditory belt cortex in the alert rhesus monkey. J Neurophysiol. 2009;102:1606–1622. doi: 10.1152/jn.00167.2009.
- 70. Creutzfeldt O, Ojemann G, Lettich E. Neuronal activity in the human lateral temporal lobe. I. Responses to speech. Exp Brain Res. 1989;77:451–475. doi: 10.1007/BF00249600.
- 71. Pei X, et al. Spatiotemporal dynamics of electrocorticographic high gamma activity during overt and covert word repetition. Neuroimage. 2011;54:2960–2972. doi: 10.1016/j.neuroimage.2010.10.029.
- 72. Marinkovic K, et al. Spatiotemporal dynamics of modality-specific and supramodal word processing. Neuron. 2003;38:487–497. doi: 10.1016/s0896-6273(03)00197-1.
- 73. Binder JR, Frost JA, Hammeke TA, Rao SM, Cox RW. Function of the left planum temporale in auditory and linguistic processing. Brain. 1996;119:1239–1247. doi: 10.1093/brain/119.4.1239.
- 74. Binder JR, et al. Human brain language areas identified by functional magnetic resonance imaging. J Neurosci. 1997;17:353–362. doi: 10.1523/JNEUROSCI.17-01-00353.1997.
- 75. Dehaene-Lambertz G, et al. Functional segregation of cortical language areas by sentence repetition. Hum Brain Mapp. 2006;27:360–371. doi: 10.1002/hbm.20250.
- 76. Sammler D, et al. The relationship of lyrics and tunes in the processing of unfamiliar songs: A functional magnetic resonance adaptation study. J Neurosci. 2010;30:3572–3578. doi: 10.1523/JNEUROSCI.2751-09.2010.
- 77. Hara NF, Nakamura K, Kuroki C, Takayama Y, Ogawa S. Functional neuroanatomy of speech processing within the temporal cortex. Neuroreport. 2007;18:1603–1607. doi: 10.1097/WNR.0b013e3282f03f39.
- 78. Cohen L, Jobert A, Le Bihan D, Dehaene S. Distinct unimodal and multimodal regions for word processing in the left temporal cortex. Neuroimage. 2004;23:1256–1270. doi: 10.1016/j.neuroimage.2004.07.052.
- 79. Buchsbaum BR, D'Esposito M. Repetition suppression and reactivation in auditory-verbal short-term recognition memory. Cereb Cortex. 2009;19:1474–1485. doi: 10.1093/cercor/bhn186.
- 80. Matsumoto R, et al. Left anterior temporal cortex actively engages in speech perception: A direct cortical stimulation study. Neuropsychologia. 2011;49:1350–1354. doi: 10.1016/j.neuropsychologia.2011.01.023.
- 81. Turkeltaub PE, Eden GF, Jones KM, Zeffiro TA. Meta-analysis of the functional neuroanatomy of single-word reading: Method and validation. Neuroimage. 2002;16:765–780. doi: 10.1006/nimg.2002.1131.
- 82. Thierry G, Giraud AL, Price CJ. Hemispheric dissociation in access to the human semantic system. Neuron. 2003;38:499–506. doi: 10.1016/s0896-6273(03)00199-5.
- 83. Belin P, Zatorre RJ, Lafaille P, Ahad P, Pike B. Voice-selective areas in human auditory cortex. Nature. 2000;403:309–312. doi: 10.1038/35002078.
- 84. Belin P, Zatorre RJ, Ahad P. Human temporal-lobe response to vocal sounds. Brain Res Cogn Brain Res. 2002;13:17–26. doi: 10.1016/s0926-6410(01)00084-2.
- 85. Petkov CI, et al. A voice region in the monkey brain. Nat Neurosci. 2008;11:367–374. doi: 10.1038/nn2043.
- 86. Desimone R, Albright TD, Gross CG, Bruce C. Stimulus-selective properties of inferior temporal neurons in the macaque. J Neurosci. 1984;4:2051–2062. doi: 10.1523/JNEUROSCI.04-08-02051.1984.
- 87. Gaillard R, et al. Direct intracranial, fMRI, and lesion evidence for the causal role of left inferotemporal cortex in reading. Neuron. 2006;50:191–204. doi: 10.1016/j.neuron.2006.03.031.
- 88. Tsao DY, Freiwald WA, Tootell RBH, Livingstone MS. A cortical region consisting entirely of face-selective cells. Science. 2006;311:670–674. doi: 10.1126/science.1119983.
- 89. Kanwisher N, Yovel G. The fusiform face area: A cortical region specialized for the perception of faces. Philos Trans R Soc Lond B Biol Sci. 2006;361:2109–2128. doi: 10.1098/rstb.2006.1934.
- 90. Edwards E, et al. Spatiotemporal imaging of cortical activation during verb generation and picture naming. Neuroimage. 2010;50:291–301. doi: 10.1016/j.neuroimage.2009.12.035.
- 91. Turkeltaub PE, Coslett HB. Localization of sublexical speech perception components. Brain Lang. 2010;114:1–15. doi: 10.1016/j.bandl.2010.03.008.
- 92. Samson F, Zeffiro TA, Toussaint A, Belin P. Stimulus complexity and categorical effects in human auditory cortex: An activation likelihood estimation meta-analysis. Front Psychol. 2011;1:241. doi: 10.3389/fpsyg.2010.00241.
- 93. Geschwind N. The organization of language and the brain. Science. 1970;170:940–944. doi: 10.1126/science.170.3961.940.
- 94. Bates E, et al. Voxel-based lesion-symptom mapping. Nat Neurosci. 2003;6:448–450. doi: 10.1038/nn1050.
- 95. Dronkers NF, Wilkins DP, Van Valin RD Jr, Redfern BB, Jaeger JJ. Lesion analysis of the brain areas involved in language comprehension. Cognition. 2004;92:145–177. doi: 10.1016/j.cognition.2003.11.002.
- 96. Mazziotta JC, Phelps ME, Carson RE, Kuhl DE. Tomographic mapping of human cerebral metabolism: Auditory stimulation. Neurology. 1982;32:921–937. doi: 10.1212/wnl.32.9.921.
- 97. Petersen SE, Fox PT, Posner MI, Mintun M, Raichle ME. Positron emission tomographic studies of the cortical anatomy of single-word processing. Nature. 1988;331:585–589. doi: 10.1038/331585a0.
- 98. Wise RJS, et al. Distribution of cortical neural networks involved in word comprehension and word retrieval. Brain. 1991;114:1803–1817. doi: 10.1093/brain/114.4.1803.
- 99. Démonet JF, et al. The anatomy of phonological and semantic processing in normal subjects. Brain. 1992;115:1753–1768. doi: 10.1093/brain/115.6.1753.
- 100. Rademacher J, et al. Probabilistic mapping and volume measurement of human primary auditory cortex. Neuroimage. 2001;13:669–683. doi: 10.1006/nimg.2000.0714.
- 101. Hamberger MJ, Seidel WT, Goodman RR, Perrine K, McKhann GM. Temporal lobe stimulation reveals anatomic distinction between auditory naming processes. Neurology. 2003;60:1478–1483. doi: 10.1212/01.wnl.0000061489.25675.3e.
- 102. Hashimoto Y, Sakai KL. Brain activations during conscious self-monitoring of speech production with delayed auditory feedback: An fMRI study. Hum Brain Mapp. 2003;20:22–28. doi: 10.1002/hbm.10119.
- 103. Warren JE, Wise RJS, Warren JD. Sounds do-able: Auditory-motor transformations and the posterior temporal plane. Trends Neurosci. 2005;28:636–643. doi: 10.1016/j.tins.2005.09.010.
- 104. Guenther FH. Cortical interactions underlying the production of speech sounds. J Commun Disord. 2006;39:350–365. doi: 10.1016/j.jcomdis.2006.06.013.
- 105. Tourville JA, Reilly KJ, Guenther FH. Neural mechanisms underlying auditory feedback control of speech. Neuroimage. 2008;39:1429–1443. doi: 10.1016/j.neuroimage.2007.09.054.
- 106. Towle VL, et al. ECoG gamma activity during a language task: Differentiating expressive and receptive speech areas. Brain. 2008;131:2013–2027. doi: 10.1093/brain/awn147.
- 107. Takaso H, Eisner F, Wise RJS, Scott SK. The effect of delayed auditory feedback on activity in the temporal lobe while speaking: A positron emission tomography study. J Speech Lang Hear Res. 2010;53:226–236. doi: 10.1044/1092-4388(2009/09-0009).
- 108. Zheng ZZ, Munhall KG, Johnsrude IS. Functional overlap between regions involved in speech perception and in monitoring one's own voice during speech production. J Cogn Neurosci. 2010;22:1770–1781. doi: 10.1162/jocn.2009.21324.
- 109. Buchsbaum BR, Padmanabhan A, Berman KF. The neural substrates of recognition memory for verbal information: Spanning the divide between short- and long-term memory. J Cogn Neurosci. 2011;23:978–991. doi: 10.1162/jocn.2010.21496.
- 110. Buchsbaum BR, Olsen RK, Koch P, Berman KF. Human dorsal and ventral auditory streams subserve rehearsal-based and echoic processes during verbal working memory. Neuron. 2005;48:687–697. doi: 10.1016/j.neuron.2005.09.029.
- 111. Vinckier F, et al. Hierarchical coding of letter strings in the ventral stream: Dissecting the inner organization of the visual word-form system. Neuron. 2007;55:143–156. doi: 10.1016/j.neuron.2007.05.031.
- 112. Dehaene S, et al. How learning to read changes the cortical networks for vision and language. Science. 2010;330:1359–1364. doi: 10.1126/science.1194140.
- 113. Pallier C, Devauchelle A-D, Dehaene S. Cortical representation of the constituent structure of sentences. Proc Natl Acad Sci USA. 2011;108:2522–2527. doi: 10.1073/pnas.1018711108.
- 114. Graves WW, Desai R, Humphries C, Seidenberg MS, Binder JR. Neural systems for reading aloud: A multiparametric approach. Cereb Cortex. 2010;20:1799–1815. doi: 10.1093/cercor/bhp245.
- 115. Jobard G, Crivello F, Tzourio-Mazoyer N. Evaluation of the dual route theory of reading: A metanalysis of 35 neuroimaging studies. Neuroimage. 2003;20:693–712. doi: 10.1016/S1053-8119(03)00343-4.
- 116. Turkeltaub PE, Gareau L, Flowers DL, Zeffiro TA, Eden GF. Development of neural mechanisms for reading. Nat Neurosci. 2003;6:767–773. doi: 10.1038/nn1065.
- 117. Hamberger MJ, Goodman RR, Perrine K, Tamny T. Anatomic dissociation of auditory and visual naming in the lateral temporal cortex. Neurology. 2001;56:56–61. doi: 10.1212/wnl.56.1.56.
- 118. Hamberger MJ, McClelland S III, McKhann GM II, Williams AC, Goodman RR. Distribution of auditory and visual naming sites in nonlesional temporal lobe epilepsy patients and patients with space-occupying temporal lobe lesions. Epilepsia. 2007;48:531–538. doi: 10.1111/j.1528-1167.2006.00955.x.
- 119. Blau V, van Atteveldt N, Formisano E, Goebel R, Blomert L. Task-irrelevant visual letters interact with the processing of speech sounds in heteromodal and unimodal cortex. Eur J Neurosci. 2008;28:500–509. doi: 10.1111/j.1460-9568.2008.06350.x.
- 120. van Atteveldt NM, Blau VC, Blomert L, Goebel R. fMR-adaptation indicates selectivity to audiovisual content congruency in distributed clusters in human superior temporal cortex. BMC Neurosci. 2010;11:11. doi: 10.1186/1471-2202-11-11.
- 121. Beauchamp MS, Nath AR, Pasalar S. fMRI-guided transcranial magnetic stimulation reveals that the superior temporal sulcus is a cortical locus of the McGurk effect. J Neurosci. 2010;30:2414–2417. doi: 10.1523/JNEUROSCI.4865-09.2010.
- 122. Nath AR, Beauchamp MS. Dynamic changes in superior temporal sulcus connectivity during perception of noisy audiovisual speech. J Neurosci. 2011;31:1704–1714. doi: 10.1523/JNEUROSCI.4853-10.2011.
- 123. Ison MJ, Quiroga RQ. Selectivity and invariance for visual object perception. Front Biosci. 2008;13:4889–4903. doi: 10.2741/3048.
- 124. Kuhl PK. Early language acquisition: Cracking the speech code. Nat Rev Neurosci. 2004;5:831–843. doi: 10.1038/nrn1533.
- 125. Glezer LS, Jiang X, Riesenhuber M. Evidence for highly selective neuronal tuning to whole words in the “visual word form area”. Neuron. 2009;62:199–204. doi: 10.1016/j.neuron.2009.03.017.
- 126. Cappelle B, Shtyrov Y, Pulvermüller F. Heating up or cooling up the brain? MEG evidence that phrasal verbs are lexical units. Brain Lang. 2010;115:189–201. doi: 10.1016/j.bandl.2010.09.004.
- 127. Scott SK, Blank CC, Rosen S, Wise RJS. Identification of a pathway for intelligible speech in the left temporal lobe. Brain. 2000;123:2400–2406. doi: 10.1093/brain/123.12.2400.
- 128. Binder JR, Desai RH, Graves WW, Conant LL. Where is the semantic system? A critical review and meta-analysis of 120 functional neuroimaging studies. Cereb Cortex. 2009;19:2767–2796. doi: 10.1093/cercor/bhp055.
- 129. Rogalsky C, Hickok G. The role of Broca's area in sentence comprehension. J Cogn Neurosci. 2011;23:1664–1680. doi: 10.1162/jocn.2010.21530.
- 130. Obleser J, Meyer L, Friederici AD. Dynamic assignment of neural resources in auditory comprehension of complex sentences. Neuroimage. 2011;56:2310–2320. doi: 10.1016/j.neuroimage.2011.03.035.
- 131. Humphries C, Binder JR, Medler DA, Liebenthal E. Syntactic and semantic modulation of neural activity during auditory sentence comprehension. J Cogn Neurosci. 2006;18:665–679. doi: 10.1162/jocn.2006.18.4.665.
- 132. Tyler LK, Marslen-Wilson W. Fronto-temporal brain systems supporting spoken language comprehension. Philos Trans R Soc Lond B Biol Sci. 2008;363:1037–1054. doi: 10.1098/rstb.2007.2158.
- 133. Friederici AD, Kotz SA, Scott SK, Obleser J. Disentangling syntax and intelligibility in auditory language comprehension. Hum Brain Mapp. 2010;31:448–457. doi: 10.1002/hbm.20878.
- 134. Guenther FH. A neural network model of speech acquisition and motor equivalent speech production. Biol Cybern. 1994;72:43–53. doi: 10.1007/BF00206237.
- 135. Cohen YE, Andersen RA. A common reference frame for movement plans in the posterior parietal cortex. Nat Rev Neurosci. 2002;3:553–562. doi: 10.1038/nrn873.
- 136. Hackett TA, et al. Sources of somatosensory input to the caudal belt areas of auditory cortex. Perception. 2007;36:1419–1430. doi: 10.1068/p5841.
- 137. Smiley JF, et al. Multisensory convergence in auditory cortex, I. Cortical connections of the caudal superior temporal plane in macaque monkeys. J Comp Neurol. 2007;502:894–923. doi: 10.1002/cne.21325.
- 138. Hackett TA, et al. Multisensory convergence in auditory cortex, II. Thalamocortical connections of the caudal superior temporal plane. J Comp Neurol. 2007;502:924–952. doi: 10.1002/cne.21326.
- 139. Dhanjal NS, Handunnetthi L, Patel MC, Wise RJS. Perceptual systems controlling speech production. J Neurosci. 2008;28:9969–9975. doi: 10.1523/JNEUROSCI.2607-08.2008.
- 140. Baddeley A. Working memory: Looking back and looking forward. Nat Rev Neurosci. 2003;4:829–839. doi: 10.1038/nrn1201.
- 141. Fitch WT. The evolution of speech: A comparative review. Trends Cogn Sci. 2000;4:258–267. doi: 10.1016/s1364-6613(00)01494-7.
- 142. McCandliss BD, Cohen L, Dehaene S. The visual word form area: Expertise for reading in the fusiform gyrus. Trends Cogn Sci. 2003;7:293–299. doi: 10.1016/s1364-6613(03)00134-7.
- 143. Wandell BA, Rauschecker AM, Yeatman JD. Learning to see words. Annu Rev Psychol. 2012;63:31–53. doi: 10.1146/annurev-psych-120710-100434.
- 144. Turkeltaub PE, Flowers DL, Lyon LG, Eden GF. Development of ventral stream representations for single letters. Ann N Y Acad Sci. 2008;1145:13–29. doi: 10.1196/annals.1416.026.
- 145. Joseph JE, Cerullo MA, Farley AB, Steinmetz NA, Mier CR. fMRI correlates of cortical specialization and generalization for letter processing. Neuroimage. 2006;32:806–820. doi: 10.1016/j.neuroimage.2006.04.175.
- 146. Pernet C, Celsis P, Démonet J-F. Selective response to letter categorization within the left fusiform gyrus. Neuroimage. 2005;28:738–744. doi: 10.1016/j.neuroimage.2005.06.046.
- 147. Callan AM, Callan DE, Masaki S. When meaningless symbols become letters: Neural activity change in learning new phonograms. Neuroimage. 2005;28:553–562. doi: 10.1016/j.neuroimage.2005.06.031.
- 148. Longcamp M, Anton J-L, Roth M, Velay J-L. Premotor activations in response to visually presented single letters depend on the hand used to write: A study on left-handers. Neuropsychologia. 2005;43:1801–1809. doi: 10.1016/j.neuropsychologia.2005.01.020.
- 149. Flowers DL, et al. Attention to single letters activates left extrastriate cortex. Neuroimage. 2004;21:829–839. doi: 10.1016/j.neuroimage.2003.10.002.
- 150. Longcamp M, Anton J-L, Roth M, Velay J-L. Visual presentation of single letters activates a premotor area involved in writing. Neuroimage. 2003;19:1492–1500. doi: 10.1016/s1053-8119(03)00088-0.
- 151. Logothetis NK, Pauls J. Psychophysical and physiological evidence for viewer-centered object representations in the primate. Cereb Cortex. 1995;5:270–288. doi: 10.1093/cercor/5.3.270.
- 152. Dehaene S, et al. Why do children make mirror errors in reading? Neural correlates of mirror invariance in the visual word form area. Neuroimage. 2010;49:1837–1848. doi: 10.1016/j.neuroimage.2009.09.024.
- 153. Pegado F, Nakamura K, Cohen L, Dehaene S. Breaking the symmetry: Mirror discrimination for single letters but not for pictures in the visual word form area. Neuroimage. 2011;55:742–749. doi: 10.1016/j.neuroimage.2010.11.043.
- 154. Lancaster JL, et al. Bias between MNI and Talairach coordinates analyzed using the ICBM-152 brain template. Hum Brain Mapp. 2007;28:1194–1205. doi: 10.1002/hbm.20345.
- 155. Eickhoff SB, et al. Coordinate-based activation likelihood estimation meta-analysis of neuroimaging data: A random-effects approach based on empirical estimates of spatial uncertainty. Hum Brain Mapp. 2009;30:2907–2926. doi: 10.1002/hbm.20718.
- 156. Turkeltaub PE, et al. Minimizing within-experiment and within-group effects in activation likelihood estimation meta-analyses. Hum Brain Mapp. 2012;33:1–13. doi: 10.1002/hbm.21186.
- 157. Genovese CR, Lazar NA, Nichols T. Thresholding of statistical maps in functional neuroimaging using the false discovery rate. Neuroimage. 2002;15:870–878. doi: 10.1006/nimg.2001.1037.
- 158. Van Essen DC. A Population-Average, Landmark- and Surface-based (PALS) atlas of human cerebral cortex. Neuroimage. 2005;28:635–662. doi: 10.1016/j.neuroimage.2005.06.058.