Abstract
Words, grammar, and phonology are linguistically distinct, yet their neural substrates are difficult to distinguish in macroscopic brain regions. We investigated whether they can be separated in time and space at the circuit level using intra-cranial electrophysiology (ICE), namely by recording local field potentials (LFP) from populations of neurons using depth electrodes implanted in language-related brain regions while people read words verbatim or grammatically inflected them (present/past, singular/plural). Neighboring probes within Broca’s area revealed distinct neuronal activity for lexical (~200 ms), grammatical (~320 ms), and phonological (~450 ms) processing, identically for nouns and verbs, in a region activated in the same patients and task in functional magnetic resonance imaging (fMRI). This suggests that a linguistic processing sequence predicted on computational grounds is implemented in the brain in fine-grained spatiotemporally patterned activity.
Within cognitive neuroscience, language is understood far less well than sensation, memory, or motor control, because language has no animal homologues, and methods appropriate to humans (functional magnetic resonance imaging (fMRI), studies of brain-damaged patients, and scalp-recorded potentials) are far coarser in space or time than the underlying causal events in neural circuitry. Moreover, language involves several kinds of abstract information (lexical, grammatical, phonological) that are difficult to manipulate independently. This has left a gap in understanding between the computational structure of language suggested by linguistics and the neural circuitry that implements language processing. We narrow this gap using a technique with high spatial, temporal, and physiological resolution, and a task that distinguishes three components of linguistic computation.
According to linguistic analyses, the ability to identify words, combine them grammatically, and articulate their sounds involves several kinds of representations, with logical dependencies among them (1, 2). For example, to pronounce a verb in a sentence, one must determine the appropriate tense given the intended meaning and syntactic context (e.g., “walk”, “walks”, “walked”, “walking”). One must identify the particular verb, which specifies whether to use a regular (e.g., “walked”) or irregular (e.g., “went”) form. In addition, one must unpack the phonological content of the verb and suffix to implement three additional computations: phonological adjustments in the sequence of phonemes (e.g., inserting a vowel between verb and suffix in “patted,” but not in “walked”), phonetic adjustments in the pronunciation of the phonemes (such as the difference between the “d” in “walked” and “jogged”), and conversion of the phoneme sequence into articulatory motor commands.
This logical decomposition does not entail that each kind of representation corresponds to a distinct stage or circuit in the brain. In many neural-network models, the selection of tense, discrimination of regular from irregular inflection, and formulation of the phonetic output are computed in parallel and in one time-step within a single distributed network (3, 4). Others contain loops and feedback connections, propagate probabilistic constraints, and iteratively settle into a globally stable state, with no fixed sequence of operations (5). Even stage models may incorporate cascades where partial information from one stage begins to feed the next before its computation is complete (6). Nonetheless the most comprehensive model of speech production, developed by Levelt, Roelofs, & Meyer (LRM), maximizes parsimony and falsifiability by implementing linguistic operations as discrete ordered stages, eschewing feedback, loops, parallelism, or cascades (7). They posit stages for lexical retrieval (which they associate with the left middle temporal gyrus at 150–225 ms after stimulus presentation), grammatical encoding (locus and duration unknown), phonological retrieval (posterior temporal lobe, 200–400 ms), phonological and phonetic processing (Broca’s area, 400–600 ms), self-monitoring (superior temporal lobe, beginning at 275–400 ms but highly variable in duration), and articulation (motor cortex) (8, 9).
Current evidence, however, leaves considerable uncertainty about the localization and timing of these components, especially grammatical processing. Although clinical studies report double dissociations in which a patient is more impaired in grammar than phonology or vice versa (10), in most studies both abilities are linked to similar regions in the left inferior prefrontal cortex, particularly Broca’s area (11). Though Broca’s area itself has been identified as the seat of phonology, grammar, and even specific grammatical operations (12, 13, 14), lesion and neuroimaging studies have tied it to a broad variety of linguistic and nonlinguistic processes (15). This uncertainty may be a consequence of the coarseness of current measurements. It remains possible that grammatical and other linguistic processes are carried out distinctly, even sequentially, in the microcircuitry of the brain, but techniques that sum over seconds and centimeters necessarily blur them.
In a rare procedure, electrodes are implanted in the brains of patients with epilepsy for clinical evaluation. Recordings of intra-cranial electrophysiology (ICE) from unaffected brain tissue during periods of normal activity can provide millisecond resolution in time with millimeter resolution in space. We recorded local field potentials (LFP) from multi-contact depth electrodes in three right-handed patients (age 38–51; above-average language and cognitive skills) whose electrodes were located in and around Broca’s area while they read words verbatim or converted them to an inflected form (past/present, singular/plural) (Figs. 1 and 2) (16). The task engages inflectional morphology, which is like syntax in combining meaningful elements according to grammatical rules, but the units are shorter and semantically simpler, making fewer demands on working memory and conceptual integration, thus allowing greater experimental control. We applied the high resolution of ICE to a task that distinguishes three linguistic processes to investigate the spatiotemporal patterning of word production in the brain.
In each trial, participants saw either the instruction “Repeat word” (the “Read” condition), or a cue that dictated an inflected form (“Every day they ____”; “Yesterday they ____”; “That is a ____”; “Those are the ____”). Next they saw a target word and produced the appropriate form silently (Fig. 1A) (16). The 240 target words were presented in uninflected form in the phrase “a [noun]” or “to [verb]” (17) (Fig. 1B). Half the targets were regular (e.g. “link”/“linked”) and half irregular (e.g. “think”/“thought”), to ensure that participants had to access the word rather than automatically appending the regular suffix (18).
The Null-Inflect (N) condition requires an inflected form of the verb (present tense) or noun (singular), yet these forms are not overtly marked and thus require the same output to be pronounced as in the Read (R) condition. The difference between these conditions thus implicates the process of inflection. In contrast, the Overt-Inflect (O) condition (past-tense verb or plural noun) requires that a suffix be added (regular) or the form changed (irregular). It thus differs from the Null-Inflect condition in requiring computation of a different phonological output (Fig. 1B; the label ‘phonological’ subsumes phonological, phonetic, and articulatory processes). The design was fully crossed, with trials presented in pseudorandom order.
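The subtraction logic of the three conditions can be made explicit in a short sketch. The Python below is illustrative only and is not the analysis code used in this study; the names `DEMANDS` and `predicted_contrasts` are hypothetical, and the process labels simply restate the design described above (here “phonological” means computing an output form that differs from the target word).

```python
from itertools import combinations

# Processes each condition is assumed to engage, restating the task design.
# Hypothetical names; "phonological" = producing an output form that differs
# from the target word (suffix added or stem changed).
DEMANDS = {
    "Read": {"lexical"},
    "Null-Inflect": {"lexical", "inflection"},
    "Overt-Inflect": {"lexical", "inflection", "phonological"},
}


def predicted_contrasts(process):
    """For each pair of conditions, report whether a neural component that
    indexes `process` would be expected to differ between the two."""
    result = {}
    for a, b in combinations(DEMANDS, 2):
        result[(a, b)] = (process in DEMANDS[a]) != (process in DEMANDS[b])
    return result


for process in ("lexical", "inflection", "phonological"):
    print(process, predicted_contrasts(process))
```

Under these assumptions, a component indexing inflection should separate Read from both inflecting conditions but not the inflecting conditions from each other, whereas a component indexing phonological programming should separate Overt-Inflect from the other two.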
To assess whether these patients’ language systems were organized normally, and to correlate LFP with fMRI, we performed fMRI in two of the patients before their electrodes were placed. Their activation patterns were indeed similar to those of 18 healthy controls (Fig. 2A–C) (for other fMRI results see 19). Most of the 168 bipolar channels from which we recorded (across patients) were in fMRI-active regions (Fig. 2A–G). LFP that was significantly correlated with the task (p < .001, corrected; see 16) was recorded in about half (86/168) of the channels (19 channels in Patient A, 37 in B, and 30 in C). Of these channels, 49 (57%) were within Broca’s area or the anterior temporal lobes (16 in A, 19 in B, 14 in C). Of the 49 channels, 26 were within Broca’s area, and the majority (20/26) yielded a strong triphasic (3-component) LFP waveform (9 in Patient A, 8 in B, 3 in C). The mean peaks occurred ~200, ~320, and ~450 ms after the target word onset (Fig. 2A), and this timing was consistent across patients (Fig. 4A and B; fig. S1, fig. S4, fig. S5).
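The channel counts in this paragraph can be cross-checked with simple arithmetic. The sketch below only re-adds the per-patient numbers quoted in the text; the variable names are hypothetical and no data beyond those counts are assumed.

```python
# Per-patient channel counts exactly as quoted in the text (Patients A, B, C).
task_correlated = {"A": 19, "B": 37, "C": 30}
broca_or_anterior_temporal = {"A": 16, "B": 19, "C": 14}
triphasic_in_broca = {"A": 9, "B": 8, "C": 3}

total_recorded = 168   # bipolar channels recorded across patients
broca_channels = 26    # task-correlated channels within Broca's area

n_task = sum(task_correlated.values())
n_region = sum(broca_or_anterior_temporal.values())
n_triphasic = sum(triphasic_in_broca.values())

print(f"task-correlated: {n_task}/{total_recorded} ({n_task / total_recorded:.0%})")
print(f"Broca's or anterior temporal: {n_region}/{n_task} ({n_region / n_task:.0%})")
print(f"triphasic within Broca's: {n_triphasic}/{broca_channels} ({n_triphasic / broca_channels:.0%})")
```

The per-patient counts sum to the reported totals (86, 49, and 20), and 49 of 86 rounds to the reported 57%.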
The three LFP components showed signatures of distinct linguistic processing stages (Fig. 2A–C). The ~200 ms component appears to reflect lexical identification. The timing converges with when word-specific activity has previously been recorded in the visual word form area (VWFA) (20, 21, but see 22), and when the VWFA has been shown to become phase-locked with Broca’s area (23). Furthermore, the magnitude of the component varied with word frequency, which indexes lexical access (24). Specifically, rare words (frequency 1–4) yielded a significantly higher amplitude (t(204)=3.32, p < .001) than common words (frequency 9–12) (Fig. 2A bottom; 25). Word frequency is inversely correlated with word length, but the present effect is not a consequence of length: we found no difference at ~200 ms between short (2–4 character) and long (6–11 character) words (Fig. 2A), nor a difference between one-morpheme and two-morpheme responses (26). Later components were not affected by frequency. Finally, consistent with the fact that lexical identification is required by all three inflectional conditions, the ~200 ms component did not vary across them. Primary lexical access is generally associated with temporal cortex rather than Broca’s area (8), so this component may index delivery of word identity information into Broca’s area for subsequent processing, consistent with anatomic and physiological evidence that the two areas are integrated (23, 27). Although word-evoked activity in this latency range has previously been localized to Broca’s area with LFP (28) and MEG (29), it has not been demonstrated to be modulated by lexical frequency.
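For concreteness, the frequency score defined in note 25 can be sketched as follows. This is an illustrative reading, not the authors' code: the wording of the note is ambiguous as to whether one is added after rounding the log or inside it, so both readings are offered behind a hypothetical flag. The function names, the example counts, and the "intermediate" label are assumptions; only the rare (1–4) and common (9–12) ranges come from the text.

```python
import math


def frequency_score(form_counts, add_one_inside_log=False):
    """Rounded natural log of the summed frequencies of a word's
    inflectional forms, plus one (two possible readings of note 25)."""
    total = sum(form_counts)
    if add_one_inside_log:
        return round(math.log(total + 1))   # reading 2: round(ln(total + 1))
    if total <= 0:
        raise ValueError("reading 1 needs a positive combined frequency")
    return round(math.log(total)) + 1       # reading 1: round(ln(total)) + 1


def frequency_bin(score):
    """Bins from the text: rare = 1-4, common = 9-12; the rest is unlabeled
    in the source and called 'intermediate' here for illustration."""
    if 1 <= score <= 4:
        return "rare"
    if 9 <= score <= 12:
        return "common"
    return "intermediate"


# Hypothetical corpus counts for the inflectional forms of one word.
counts = [12, 5, 3]
score = frequency_score(counts)
print(score, frequency_bin(score))
```

The two readings differ only for words whose log frequency falls near a rounding boundary, so the choice would matter mainly at the edges of the rare and common bins.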
The subsequent two LFP components showed activity patterns predicted for grammatical and phonological processing, respectively (Fig. 2B and C). In the ~320 ms component (Fig. 2B) the Overt-Inflect and Null-Inflect conditions significantly differed from the Read condition, but not from each other. Thus, the ~320 ms component is modulated by the demands of inflection (required by Overt-Inflect and Null-Inflect but not Read), but not by the demands of phonological programming (required in Overt-Inflect but not in Null-Inflect or Read; recall Fig. 1C). In contrast, in a component appearing at ~450 ms, Overt-Inflect did differ from the Null-Inflect and Read conditions, which did not differ from each other (Fig. 2C). This contrasting pattern indicates that the ~450 ms component reflects phonological, phonetic, and articulatory programming, independently confirmed by its sensitivity to the number of syllables (Fig. 4C). Both components were recorded from Broca’s area in all patients (fig. S1), and specifically in Patient A (Fig. 1) from the inferior frontal gyrus pars triangularis deep in the inferior frontal sulcus. The ~320 ms component was recorded near the fundus; the ~450 ms component 5 mm more lateral along the sulcus within a sub-gyral fold that faced the fundus (Fig. 3I, fig. S1a). This region is often considered part of area 45 (but see 30).
The pattern of sign inversions across neighboring bipolar channels in space (Fig. 2A top) indicates that the generators of the LFP components were local (fig. S3), and the differences in inversions across components in time indicate that their generators were not identical (Fig. 3I and J). Thus the overall LFP pattern suggests a fine-grained spatiotemporal progression of lexical, grammatical, and phonological processing within Broca’s area during word production.
The triphasic pattern in all patients was found exclusively in Broca’s area (Fig. 4A and B). Outside Broca’s area other patterns prevailed: for example, temporal lobe sites showed a slow and late monophasic component at 500–600 ms (Fig. 4A bottom; fig. S4f and g) (31), possibly reflecting self-monitoring (7, 8). The condition differences for each component were also consistent across patients, replicating the temporal isolation of grammatical (~320 ms) from phonological (~450 ms) processing (fig. S1). The word-frequency effect on the ~200 ms component was significant in Patients A and B and marginal (p=0.06) in Patient C (fig. S2). The ~200, ~320, and ~450 ms components were consistent in their timing across patients, though the keypress reaction times, which require the self-monitoring process, varied among patients and conditions (fig. S6).
Although nouns and verbs differ linguistically and neurobiologically (32, 33), the neuronal activity they evoked was similar (Fig. 4B). Furthermore, the patterning across inflectional conditions was the same for nouns and verbs (34). These parallels suggest that words from different lexical classes feed a common process for inflection.
Further evidence that the LFP patterns reflect inflectional computation is that they are triggered by presentation of the target word, not the cue, even though the cues contain more visual and linguistic elements (Fig. 4D) (35). Furthermore, activity evoked by the cue showed little sensitivity to the inflectional conditions.
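The amplitude comparison summarized in note 35 (mean rectified LFP in the 150–650 ms window, target-word epoch versus cue epoch) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names and toy waveforms are hypothetical, and the note does not say whether the across-channel percentage was computed from pooled means or from per-channel ratios.

```python
def mean_rectified(samples, times_ms, start_ms=150, end_ms=650):
    """Mean absolute amplitude of the samples whose time stamps fall
    within [start_ms, end_ms] (inclusive)."""
    window = [abs(v) for v, t in zip(samples, times_ms) if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("no samples fall inside the requested window")
    return sum(window) / len(window)


def percent_greater(target_amp, cue_amp):
    """How much larger the target-epoch amplitude is than the cue-epoch
    amplitude, as a percentage of the cue-epoch amplitude."""
    return 100.0 * (target_amp - cue_amp) / cue_amp


# Toy single-channel waveforms sampled every 50 ms (hypothetical values).
times_ms = list(range(0, 800, 50))
cue_lfp = [0.5 if i % 2 else -0.5 for i in range(len(times_ms))]
target_lfp = [1.0 if i % 2 else -1.0 for i in range(len(times_ms))]

cue_amp = mean_rectified(cue_lfp, times_ms)
target_amp = mean_rectified(target_lfp, times_ms)
print(percent_greater(target_amp, cue_amp))
```

Rectifying before averaging keeps positive and negative deflections from cancelling, which is why the measure reflects overall response size rather than waveform polarity.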
The LFP patterns are consistent with the computational nature of the task, and with independent estimates of the timing of its subprocesses. Inflectional processing cannot occur before the word is identified (especially as to whether it is regular or irregular), and phonological, phonetic, and articulatory processing cannot be computed before the phonemes of the inflected form have been determined. Word identification has been shown to occur at 170–250 ms (8, 29, 36), consistent with the ~200 ms component, and syllabification and other phonological processes at 400–600 ms, consistent with the phonological component at 400–500 ms (8). In naming tasks, speech onset occurs at around 600 ms (8), which is consistent with the self-monitoring behavioral responses we recorded (fig. S6). Self-monitoring has been localized to the temporal lobe (8), where we recorded LFPs in the post-response latency range that may correspond to previously described scalp ERPs (37). Working backwards from 600 ms, we note that motor neuron commands occur 50–100 ms prior to speech, placing them just after the phonological component we found to peak at 400–500 ms (38). In sum, the location, behavioral correlates, and timing of the components of neuronal activity in Broca’s area suggest that they embody, respectively, lexical identification (~200 ms), grammatical inflection (~320 ms), and phonological processing (~450 ms), in the production of nouns and verbs alike.
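The backward timing argument can be laid out as a small calculation. The sketch below uses only the approximate latencies quoted in the text; the variable names are hypothetical, and the values are treated as point estimates purely for illustration.

```python
# Approximate component peaks reported in the text (ms after target onset).
peaks_ms = {"lexical": 200, "grammatical": 320, "phonological": 450}

speech_onset_ms = 600        # approximate speech onset in naming tasks
motor_lead_ms = (50, 100)    # motor commands precede speech by 50-100 ms

# Working backwards from speech onset gives the motor-command window.
motor_window_ms = (
    speech_onset_ms - motor_lead_ms[1],
    speech_onset_ms - motor_lead_ms[0],
)

# The three components occur in the order the task logic requires, and the
# phonological peak precedes the earliest estimated motor command.
ordered = sorted(peaks_ms, key=peaks_ms.get)
print(ordered, motor_window_ms)
print(peaks_ms["phonological"] < motor_window_ms[0])
```

On these figures the motor-command window falls at 500–550 ms, just after the phonological component, which is the ordering the argument requires.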
Although the language processing stream as a whole surely exhibits parallelism, feedback, and interactivity, the current results support parsimony-based models such as LRM (7) in which one portion of this stream consists of spatiotemporally distinct processes corresponding to levels of linguistic computation. Among the processes identified by these higher-resolution data is grammatical computation, which has been elusive in previous, coarser-grained investigations. As such the results are also consistent with recent proposals that Broca’s area is not dedicated to a single kind of linguistic representation but is differentiated into adjacent but distinct circuits that process phonological, grammatical, and lexical information (37, 39, 40, 41).
Acknowledgments
Supported by NIH grants NS18741 (E.H.), NS44623 (E.H.), HD18381 (S.P.), T32-MH070328 (N.T.S.), NCRR P41-RR14075; and the Mental Illness and Neuroscience Discovery (MIND) Institute (N.T.S.), Sackler Scholars Programme in Psychobiology (N.T.S.) and Harvard Mind/Brain/Behavior Initiative (N.T.S.). We heartily thank the patients. We also thank Efstathios Papavassiliou and Julian Wu for access to their patients; Suresh Narayanan, Nima Dehghani, Matthew T. Wheeler, Frank Kampmann and Larry Gruber for assistance with intracranial electrophysiological data; Rajeev Raizada for manuscript suggestions; Nicole M. Sahin; and two anonymous reviewers whose suggestions and encouragement greatly improved this paper.
References
- 1.Pinker S. The Language Instinct. HarperCollins; 1994.
- 2.Pinker S. Science. 1991;253:530. doi: 10.1126/science.1857983.
- 3.Plunkett K, Marchman V. Cognition. 1991;38:43. doi: 10.1016/0010-0277(91)90022-v.
- 4.MacWhinney B, Leinbach J. Cognition. 1991;40:121. doi: 10.1016/0010-0277(91)90048-9.
- 5.Joanisse MF, Seidenberg MS. Proc Natl Acad Sci USA. 1999;96:7592. doi: 10.1073/pnas.96.13.7592.
- 6.McClelland JL. Psychol Rev. 1979;86:287.
- 7.Levelt WJM, Roelofs A, Meyer AS. Behav Brain Sci. 1999;22:1. doi: 10.1017/s0140525x99001776.
- 8.Indefrey P, Levelt WJM. Cognition. 2004;92:101. doi: 10.1016/j.cognition.2002.06.001.
- 9.Janssen DP, Roelofs A, Levelt WJM. Lang & Cogn Processes. 2002;17:209.
- 10.Dronkers N. Nature. 1996;384:159. doi: 10.1038/384159a0.
- 11.We use “Broca’s area” to denote the left inferior frontal gyrus, pars opercularis and pars triangularis (classically, Brodmann areas 44 and 45, but see 30).
- 12.Broca P. Bulletin de la Société Anatomique. 1861;6:330.
- 13.Zurif E, Caramazza A, Myerson R. Neuropsychologia. 1972;10:405. doi: 10.1016/0028-3932(72)90003-6.
- 14.Grodzinsky Y. Behav Brain Sci. 2000;23:1. doi: 10.1017/s0140525x00002399.
- 15.Kaan E, Swaab TY. Trends Cogn Sci. 2002;6:350. doi: 10.1016/s1364-6613(02)01947-2.
- 16.Materials and methods are available as supporting material on Science Online.
- 17.The context words (a, to) prevented participants from simply concatenating the cue and target (a strategy that would succeed in two-thirds of the trials), and helped equalize difficulty across conditions.
- 18.Differences in the signal for regular and irregular verbs are not analyzed here (for discussion see 19).
- 19.Sahin NT, Pinker S, Halgren E. Cortex. 2006;42:540. doi: 10.1016/s0010-9452(08)70394-0.
- 20.Cohen L, Dehaene S. NeuroImage. 2004;22:466. doi: 10.1016/j.neuroimage.2003.12.049.
- 21.Nobre AC, Allison T, McCarthy G. Nature. 1994;372:260. doi: 10.1038/372260a0.
- 22.Price CJ, Devlin JT. NeuroImage. 2003;19:473. doi: 10.1016/s1053-8119(03)00084-3.
- 23.Sahin NT, et al. NeuroImage. 2007;36:S74.
- 24.Hauk O, Pulvermuller F. Clin Neurophysiol. 2004;115:1090. doi: 10.1016/j.clinph.2003.12.020.
- 25.Frequency score was the rounded natural log of the combined frequencies of all inflectional forms of a word, plus one.
- 26.These factors were largely independent: word length correlated little with morpheme count (0.267) or frequency (-0.347).
- 27.Friederici AD. Trends Cogn Sci. 2009;13:175. doi: 10.1016/j.tics.2009.01.001.
- 28.Halgren E, et al. J Physiol (Paris) 1994;88:51. doi: 10.1016/0928-4257(94)90093-0.
- 29.Marinkovic K, et al. Neuron. 2003;38:487. doi: 10.1016/s0896-6273(03)00197-1.
- 30.Amunts K, et al. J Comp Neurol. 1999;412:319. doi: 10.1002/(sici)1096-9861(19990920)412:2<319::aid-cne10>3.0.co;2-7.
- 31.This component may approximate the P600 component often recorded from the scalp (Friederici AD. Clin Neurosci. 1997;4:64.), but comparisons are difficult because the P600 is generally elicited by errors, in comprehension rather than production experiments.
- 32.Caramazza A, Hillis AE. Nature. 1991;349:788. doi: 10.1038/349788a0.
- 33.Shapiro K, Caramazza A. Trends Cogn Sci. 2003;7:201. doi: 10.1016/s1364-6613(03)00060-3.
- 34.Except that, for nouns, the Overt-Read comparison at ~320 and the Overt-Null comparison at ~450 ms only approached significance (p=0.08 and 0.06, respectively; one-tailed t-test).
- 35.We measured the average amplitude of the rectified all-conditions LFP in Broca’s area channels in all patients, in the 150–650 ms interval, embracing our components of interest. The response epoch had a higher amplitude than the cue epoch in most (20/26) channels, and across all channels was 99% greater. [Patient A yielded a higher amplitude in the response epoch in 7 of 10 channels, on average 71.7% higher; Patient B in 7 of 10 channels (+33.6% on average); and Patient C in 6 of 6 channels (+191.6% on average)].
- 36.Gaillard R, et al. Neuron. 2006;50:191. doi: 10.1016/j.neuron.2006.03.031.
- 37.Friederici AD. Trends Cogn Sci. 2002;6 doi: 10.1016/s1364-6613(00)01839-8.
- 38.LFP components reported here vary by amplitude but not latency or duration; evidently the processes they index are consistently timed, and other processes (e.g., assembly and enactment of the articulatory plan (8)) produce the differences in response latency.
- 39.Hagoort P. Trends Cogn Sci. 2005;9:416. doi: 10.1016/j.tics.2005.07.004.
- 40.Bornkessel I, Schlesewsky M. Psychol Rev. 2006;113 doi: 10.1037/0033-295X.113.4.787.
- 41.However, the fine-grained, within-gyrus localization reported here cannot easily be mapped onto the more macroscopic divisions suggested by these authors.