Abstract
Is sentence structure processed by the same neural and cognitive resources that are recruited for processing word meanings, or do structure and meaning rely on distinct resources? Linguistic theorizing and much behavioral evidence suggest tight integration between lexico-semantic and syntactic representations and processing. However, most current proposals of the neural architecture of language continue to postulate a distinction between the two. One of the earlier and most cited pieces of neuroimaging evidence in favor of this dissociation comes from a paper by Dapretto & Bookheimer (1999). Using a sentence-meaning judgment task, Dapretto & Bookheimer observed two distinct peaks within the left inferior frontal gyrus (LIFG): one more active during a lexico-semantic manipulation, and the other – during a syntactic manipulation. Although the paper is highly cited, no attempt has been made, to our knowledge, to replicate the original finding. We report an fMRI study that attempts to do so. Using a combination of whole-brain, group-level ROI, and participant-specific functional ROI approaches, we fail to replicate the original dissociation. In particular, whereas parts of LIFG respond reliably more strongly during lexico-semantic than syntactic processing, no part of LIFG (including in the region defined around the peak reported by Dapretto & Bookheimer) shows the opposite pattern. We speculate that the original result was a false positive, possibly driven by a small subset of participants or items that biased a fixed-effects analysis with low power.
Introduction
Sentence comprehension requires us to retrieve the word meanings from the mental lexicon (lexico-semantic processing), and infer how they relate to one another within the sentence (syntactic processing) – i.e., recover their dependency structure using a combination of lexico-semantic constraints, word order, and/or functional morphology (e.g., Dryer et al., 2002; Gibson et al., 2013). Together, individual word meanings and the way they combine determine the propositional content of the sentence (i.e., who is doing what to whom). Whether these two components of sentence comprehension rely on distinct pools of cognitive and neural resources has been long debated (e.g., Dick et al., 2001).
In order to search for a potential dissociation between lexico-semantic and syntactic processing, cognitive neuroscientists have tested whether some brain regions respond selectively, or at least preferentially, to one or the other. To this end, several manipulations contrasting the two kinds of processes have been used, including a) linguistically degraded materials like lists of unconnected words that require lexical-level understanding but not putting words together into complex representations, vs. “Jabberwocky” sentences that contain a coarse-level representation of the dependency structure but not lexical meanings (e.g., Friederici et al., 2000; Humphries et al., 2006; Fedorenko et al., 2010; see Bautista & Wilson, for a related approach), b) violations of lexico-semantic vs. syntactic expectations (e.g., Embick et al., 2000; Kuperberg et al., 2003; Cooke et al., 2006; Friederici et al., 2010; Herrmann et al., 2012), and c) adaptation to lexico-semantic content vs. syntactic structure (e.g., Noppeney & Price, 2004; Santi & Grodzinsky, 2010; Menenti et al., 2012; Segaert et al., 2012). These numerous studies have produced a complicated empirical picture filled with contradictions (e.g., see Fedorenko et al., 2018, for a discussion). Nevertheless, the dominant view among cognitive neuroscientists studying language remains that lexico-semantic and syntactic processing rely on distinct pools of resources (e.g., Friederici, 2012; Baggio and Hagoort, 2011; Tyler et al., 2011; Duffau et al., 2014; Ullman, 2016; cf. Fedorenko et al., 2012a; Blank et al., 2016; Wilson & Bautista, 2016).
One of the most cited studies that has argued for a dissociation between lexico-semantic and syntactic processing was conducted by Dapretto and Bookheimer and published in Neuron in 1999. The study used an original manipulation where participants made meaning judgments on pairs of sentences, which differed either in one word (replaced by a synonym, resulting in the same meaning, or by a non-synonym, leading to a change in meaning) or in the structure of the sentence (e.g., an Active/Passive alternation that either kept the thematic roles the same or switched them; see sample items in Methods). The key result was a double dissociation between the Semantics and Syntax conditions observed in the left inferior frontal gyrus (LIFG), i.e., two nearby peaks revealed by the Semantics > Syntax and Syntax > Semantics contrasts, respectively. This result, the authors argued, provided “unequivocal evidence that these functions [lexico-semantic and syntactic processing; SBMF] are […] subserved by distinct cortical areas”.
Dapretto & Bookheimer’s study has been cited 689 times (as of November 12, 2018; https://scholar.google.com/citations?view_op=view_citation&h1=en&user=fQ-cmN8AAAAJ&citation_for_view=fQ-cmN8AAAAJ:d1gkVwhDp10C), and the pattern of citations over the years (Fig. 1) suggests that it is still being used by researchers as evidence for distinct brain regions supporting lexico-semantic vs. syntactic processing.
And yet, it appears that no replication of this study has ever been published, either by one of the original authors’ labs or by any other research group. Given a) the study’s impact on the field, combined with b) recent studies that have argued for overlap between lexico-semantic and syntactic processing across the fronto-temporal language network (e.g., Fedorenko et al., 2012a; Blank et al., 2016; Wilson & Bautista, 2016; Fedorenko et al., 2018), and c) current emphasis on reproducibility in the fields of psychology (e.g., Ioannidis, 2005; Simmons et al., 2011; Button et al., 2013; Ioannidis et al., 2014) and cognitive neuroscience (e.g., Poldrack et al., 2017), we here attempted a conceptual replication of Dapretto & Bookheimer’s findings.
Methods
Participants
Fifteen individuals (age 20-30 (25.3 ±4.1), 5 females), native speakers of English, participated for payment. Fourteen of the fifteen participants were right-handed (as determined by the Edinburgh handedness inventory; Oldfield, 1971), but all fifteen showed typical, left-lateralized, language activations (as assessed with an independent language “localizer” task conducted in the same session; Fedorenko et al., 2010). All participants had normal hearing and vision, and no history of neurological illness or language impairment. Participants gave written informed consent in accordance with the requirements of MIT’s Committee on the Use of Humans as Experimental Subjects (COUHES).
Design, materials, and procedure
Each participant completed the critical task, as well as one or more additional tasks for unrelated studies. The entire scanning session lasted approximately two hours.
Design and materials:
The basic design was the same as in Dapretto & Bookheimer’s study. Participants were presented with pairs of sentences and asked to decide whether they meant roughly the same thing. The critical manipulation was whether the sentences in the pair differed in one of the words (the Semantics condition) or in the structure / word order (the Syntax condition). In particular, in the Semantics condition, one of the words in the first sentence was replaced by a synonym in the second sentence (roughly preserving the meaning) or by a word with a different meaning (leading to different meanings), as in (1a). In the Syntax condition, the sentences were either syntactic alternations with the same meaning, or the structure / word order was changed leading to a different meaning, as in (1b).
(1a).
Same: | Anna invited the composer. / Anna invited the songwriter. |
Different: | Anna invited the composer. / Anna invited the translator. |
(1b).
Same: | Anna invited the composer. / The composer was invited by Anna. |
Different: | Anna invited the composer. / The composer invited Anna. |
The materials consisted of 80 items (sentence pairs). Forty items used the Active / Passive constructions (as in Dapretto & Bookheimer’s study), and forty – the Double Object (DO) / Prepositional Phrase Object (PP) constructions. Each item had four versions, as in (1a–b), for a total of 320 trials. The full set of materials is available at https://osf.io/wtv9f/.
The 320 trials were divided into four experimental lists (80 trials each, 40 trials per condition) following a Latin Square Design so that each list contained only one version of any given item. Each participant saw the materials from just one experimental list, and each list was seen by 3-4 participants.
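The Latin Square assignment described above can be sketched as follows. This is only an illustration under our own assumptions, not the script used to build the lists: the function name `build_lists`, the integer item indices, and the rotation rule are hypothetical, and the actual lists are the ones available in the OSF repository.

```python
# Illustrative sketch of a Latin Square assignment (hypothetical names).
# Each item has four versions: two conditions crossed with Same/Different meaning.
VERSIONS = [
    ("Semantics", "Same"),
    ("Semantics", "Different"),
    ("Syntax", "Same"),
    ("Syntax", "Different"),
]


def build_lists(n_items=80, versions=VERSIONS):
    """Return one trial list per version; each list contains every item once."""
    n_lists = len(versions)
    lists = [[] for _ in range(n_lists)]
    for item in range(n_items):
        for list_idx in range(n_lists):
            # Rotate the version with the item index so that every list
            # receives each version equally often.
            condition, meaning = versions[(item + list_idx) % n_lists]
            lists[list_idx].append((item, condition, meaning))
    return lists


lists = build_lists()
for idx, trials in enumerate(lists):
    n_semantics = sum(1 for _, condition, _ in trials if condition == "Semantics")
    print(idx, len(trials), n_semantics)
```

With 80 items and four versions, this rotation yields four lists of 80 trials, 40 per condition, with no item repeated within a list.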
A number of features varied and were balanced across the materials. First, the construction was always the same across the two sentences in a pair in the Semantics condition (balanced between active and passive for the Active / Passive trials, and between double object and prepositional phrase object for the DO / PP trials). In the Syntax condition, the construction was always different in the Same-meaning trials because this is how the propositional meaning was preserved (again, balanced between active and passive for the Active / Passive trials, and between double object and prepositional phrase object for the DO / PP trials). For the Different-meaning trials, the construction could either be the same or different, as follows:
(2a).
Same construction: | Anna invited the composer. / The composer invited Anna. |
Different constructions: | Elizabeth disliked the proprietor. / Elizabeth was disliked by the proprietor. |
(2b).
Same construction: | Amanda lent the cook some money. / The cook lent Amanda some money. |
Different constructions: | Brenda read the expert a passage. / The expert read a passage to Brenda. |
For trials where the constructions differed between the two sentences in a pair, we balanced whether the first sentence was active vs. passive (for the Active / Passive trials), or whether it was DO vs. PP (for the DO / PP trials).
Second, all sentences (in both Active / Passive and DO / PP constructions) contained one occupation noun and one name. Whether the first noun in the first sentence in a pair was an occupation or a name was balanced across items.
And third, for the Semantics condition, we varied how exactly the words in the second sentence in a pair differed from the words in the first. (This does not apply to the Syntax condition trials, where the content words are identical across the two sentences within each pair.) In particular, for the Active / Passive trials, either the occupation noun or the verb could be replaced (by a synonym or a word with a different meaning); and for the DO/PP trials, either the occupation noun or the direct object (inanimate) noun could be replaced.
Procedure:
An event-related design was used. Each event (trial) consisted of an initial 300 ms fixation, 2,000 ms presentation of the first sentence (presented all at once), 200 ms inter-sentence interval, 2,000 ms presentation of the second sentence, and a 1,500 ms window for participants to respond (by pressing one of two buttons on a button box), for a total of 6 s. The 80 trials in a list were divided into two runs, with each run consisting of 40 trials and additional 120 s of inter-trial fixation, for a total run duration of 360 s (6 min). Each participant performed two runs. The optseq2 algorithm (Dale, 1999) was used to create condition orderings and to distribute fixation among the trials so as to optimize our ability to de-convolve neural responses to each condition. Eight orders were created, and order varied across runs and participants.
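As a quick check on the timing arithmetic above, the trial and run durations can be recomputed from their components. The variable names are ours; the durations are those given in the text.

```python
# Durations in milliseconds, as described in the Procedure.
TRIAL_PHASES_MS = {
    "fixation": 300,
    "sentence_1": 2000,
    "inter_sentence_interval": 200,
    "sentence_2": 2000,
    "response_window": 1500,
}
TRIALS_PER_RUN = 40
EXTRA_FIXATION_S = 120  # inter-trial fixation distributed across the run

# One trial is the sum of its phases; one run is 40 trials plus fixation.
trial_s = sum(TRIAL_PHASES_MS.values()) / 1000
run_s = TRIALS_PER_RUN * trial_s + EXTRA_FIXATION_S
print(trial_s, run_s)
```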
A summary of the key differences between the current study and Dapretto & Bookheimer’s (1999) study:
Table 1 provides a summary of the key differences between the studies. The key improvements in the design of the current study concern power, both with respect to the number of participants tested (almost twice as many) and the amount of data collected for each participant (five times as many trials per condition). The original study used a blocked design, which is generally more powerful. However, given that the original study used only one block per condition, the current study is likely to be more powerful in spite of the use of an event-related design (with 40 events per condition) (e.g., Nee, 2019). (See Analyses below for a formal power calculation.)
Table 1:
Feature | Current study | Dapretto & Bookheimer’s study |
---|---|---|
Participants | n=15 | n=8 |
fMRI design | Event-related | Blocked |
Materials | 40 pairs of sentences per condition, 40 events per condition | 8 pairs of sentences per condition, 1 block per condition |
Constructions used | Active/Passive alternation, Double Object/Prepositional Phrase Object alternation | Active/Passive alternation, Locative Prepositional Phrase alternation |
Presentation | Visual | Auditory |
Acquisition device | A 3 Tesla Siemens Trio scanner with a 32-channel head coil | A 3 Tesla GE scanner; no coil information provided |
Acquisition parameters | TR = 2000ms, TE = 30ms, matrix size = 96×96 with 200mm field of view | TR = 2500ms, TE = 45ms, matrix size = 64×64 with 200mm field of view |
Preprocessing software | SPM5 | SPM96 |
Preprocessing | 4mm FWHM Gaussian smoothing kernel; high-pass filtering | 6mm FWHM Gaussian smoothing kernel; no high-pass filtering |
Statistical modeling | Random effects analysis | Fixed effects analysis? (not fully clear from the description) |
We also deviated in one of the constructions used. Although we adopted the Active / Passive alternation, we replaced the Locative Prepositional Phrase alternation (e.g., The pool is behind the gate. / Behind the gate is the pool.) with a more commonly used Double Object / Prepositional Phrase Object (DO / PP) alternation (e.g., Allen et al., 2012; Gibson et al., 2013). The reason we chose not to use the Locative alternation from the original study is that fronted locative prepositional phrases (locative inversion) are rare in natural language (e.g., Gibson et al., 2013). Finally, we opted for the use of the visual presentation (cf. auditory presentation used by Dapretto & Bookheimer). Abundant prior evidence suggests that high-level language processing brain regions, including those in the frontal lobe, are robust to presentation modality (e.g., Buchweitz et al., 2009; Fedorenko et al., 2010, 2016; Braze et al., 2011; Bemis & Pylkkanen, 2012; Vagharchakian et al., 2012; Scott et al., 2016). Thus, the use of a different modality is not expected to matter.
Given these differences between the original study and the current one, this replication is not a direct replication, but a conceptual one, albeit a close one. Conceptual replications have been argued to be as important, if not more important in some cases, for establishing robust cumulative science (e.g., Schmidt, 2009).
fMRI data acquisition and preprocessing
Structural and functional data were collected on the whole-body 3 Tesla Siemens Trio scanner with a 32-channel head coil at the Athinoula A. Martinos Imaging Center at the McGovern Institute for Brain Research at MIT. T1-weighted structural images were collected in 128 axial slices with 1 mm isotropic voxels (TR = 2530 ms, TE = 3.48 ms).
Functional, blood oxygenation level dependent (BOLD) data were acquired using an EPI sequence (with a 90° flip angle and using GRAPPA with an acceleration factor of 2), with the following acquisition parameters: thirty-one 4 mm thick near-axial slices, acquired in an interleaved order with a 10% distance factor; 2.1 mm × 2.1 mm in-plane resolution; field of view of 200 mm in the phase encoding anterior to posterior (A > P) direction; matrix size of 96×96; TR of 2000 ms; and TE of 30 ms. Prospective acquisition correction (Thesen, Heid, Mueller, & Schad, 2000) was used to adjust the positions of the gradients based on the participant’s motion one TR back. The first 10 s of each run were excluded to allow for steady-state magnetization.
MRI data were analyzed using SPM5 (using default parameters, unless specified otherwise) and supporting, custom MATLAB scripts. (The use of an older version of the SPM software should make the preprocessing and analysis more similar to those used by Dapretto & Bookheimer, who used SPM96.) Each participant’s data were motion corrected and then normalized into a common brain space (the Montreal Neurological Institute, MNI, Brain Template) and resampled into 2 mm isotropic voxels. The data were then smoothed with a 4 mm FWHM Gaussian filter and high-pass filtered (at 200 s). The critical task’s effects were estimated using a General Linear Model (GLM) in which each experimental condition was modeled with a boxcar function (corresponding to an event) convolved with the canonical hemodynamic response function (HRF).
Analyses
What counts as a replication in brain imaging studies is still debated (e.g., Hong et al., 2018). In an effort to be comprehensive, we performed three analyses to assess whether the dissociation reported by Dapretto & Bookheimer (1999) between syntactic and lexico-semantic processing holds in the current dataset.
First, we performed a traditional random-effects analysis (e.g., Holmes & Friston, 1998), where individual activation maps are overlaid in the common space, and a t-test is performed across participants in each voxel for each relevant contrast. In particular, following Dapretto & Bookheimer, we examined group-level effects for the following four contrasts: i) Semantics > Fixation, ii) Syntax > Fixation, iii) Semantics > Syntax, and iv) Syntax > Semantics.
Second, we performed a more targeted analysis of the activation peaks that emerged in Dapretto & Bookheimer’s study for the direct contrasts of the Semantics and Syntax conditions: {−48, 20, −4} in Talairach space ({−48.5, 20.8, −3.6} in MNI space) for the Semantics > Syntax contrast, and {−44, 22, 10} in Talairach space ({−44.4, 23.2, 9.7} in MNI space) for the Syntax > Semantics contrast. To do so, we defined spherical regions of interest (ROIs) of two different sizes (radius = 10mm and 5mm; available for download from https://osf.io/wtv9f/) around those activation peaks and extracted responses to the Semantics and Syntax conditions (also broken down by construction). We then performed one-tailed t-tests to evaluate whether the previously reported effects replicate in the current dataset.
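To make the ROI definition concrete, the following minimal sketch lists the voxels falling inside a sphere around a peak. It assumes an axis-aligned grid of 2 mm isotropic voxels whose centers lie at integer multiples of the voxel size, which is a simplification for illustration; the function name `sphere_voxels` is hypothetical, and the ROIs actually used are the ones shared on OSF.

```python
import math


def sphere_voxels(center_mm, radius_mm, voxel_mm=2.0):
    """Return centers (in mm) of grid voxels within radius_mm of center_mm.

    Assumes voxel centers sit at integer multiples of voxel_mm along each
    axis (an illustrative simplification, not the authors' image grid).
    """
    steps = int(math.ceil(radius_mm / voxel_mm)) + 1
    # Grid index of the voxel closest to the requested center.
    base = [round(c / voxel_mm) for c in center_mm]
    voxels = []
    for i in range(base[0] - steps, base[0] + steps + 1):
        for j in range(base[1] - steps, base[1] + steps + 1):
            for k in range(base[2] - steps, base[2] + steps + 1):
                point = (i * voxel_mm, j * voxel_mm, k * voxel_mm)
                if math.dist(point, center_mm) <= radius_mm:
                    voxels.append(point)
    return voxels


# MNI coordinates of the Semantics > Syntax peak given in the text.
semantics_peak_mni = (-48.5, 20.8, -3.6)
roi_10mm = sphere_voxels(semantics_peak_mni, 10.0)
roi_5mm = sphere_voxels(semantics_peak_mni, 5.0)
print(len(roi_10mm), len(roi_5mm))
```

The 5 mm sphere is a strict subset of the 10 mm sphere, so comparing the two sizes tests whether the result depends on how tightly the ROI is drawn around the reported peak.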
And finally, we gave the data the strongest chance to reveal a dissociation if such is present, using an individual-participant functional localization approach, which has been shown to benefit from higher sensitivity and functional resolution compared to group-based analyses (e.g., Saxe et al., 2006; Thirion, 2007; Nieto-Castañon & Fedorenko, 2012; see also the power calculation below). In particular, we searched, in each participant individually, for the most Semantics-preferring voxels (i.e., showing the strongest effect for the Semantics > Syntax contrast), and for the most Syntax-preferring voxels (i.e., showing the strongest effect for the Syntax > Semantics contrast) in the left inferior frontal gyrus (LIFG). To constrain the search, we used anatomical masks (Tzourio-Mazoyer et al., 2002) for the three sub-regions of LIFG – pars orbitalis (LIFGorb), pars triangularis (LIFGtri), and pars opercularis (LIFGop). To define individual functional regions of interest (fROIs), we divided the data in half, and using one half of the data we sorted the voxels within each mask by the t-value for the relevant contrast (i.e., Semantics > Syntax or Syntax > Semantics). We then chose the top 10% of voxels as the fROI. Thus, in each participant, we defined 6 fROIs: i) a Semantics-preferring fROI in LIFGorb, ii) a Semantics-preferring fROI in LIFGtri, iii) a Semantics-preferring fROI in LIFGop, iv) a Syntax-preferring fROI in LIFGorb, v) a Syntax-preferring fROI in LIFGtri, and vi) a Syntax-preferring fROI in LIFGop. We then extracted the responses to the Semantics and Syntax conditions (including broken down by construction) from the other half of the data and tested their difference using one-tailed t-tests. This analysis helps circumvent the high inter-individual variability that characterizes the human frontal lobes (e.g., Amunts et al., 1999; Tomaiuolo et al., 1999; Juch et al., 2005; Fedorenko et al., 2012b). 
Thus, even if the individual activation peaks for the Semantics > Syntax and Syntax > Semantics contrast are spatially variable enough so that group-level analyses (both whole-brain random-effects analysis, and ROI-based analysis) fail to detect them, this analysis would recover these effects if they hold across participants anywhere within the LIFG. The individual activation maps for the Semantics > Fixation and Syntax > Fixation contrasts are available for download at https://osf.io/wtv9f/.
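The split-half logic of this analysis can be sketched as follows. This is a toy illustration with made-up numbers for a single participant and a single 20-voxel mask; the function names `define_froi` and `froi_response` are hypothetical and do not correspond to the custom MATLAB scripts used for the actual analysis.

```python
def define_froi(t_values_half1, top_fraction=0.10):
    """Return indices of the top `top_fraction` of voxels, ranked by t-value."""
    n_select = max(1, round(len(t_values_half1) * top_fraction))
    ranked = sorted(
        range(len(t_values_half1)),
        key=lambda voxel: t_values_half1[voxel],
        reverse=True,
    )
    return ranked[:n_select]


def froi_response(froi, estimates_half2):
    """Average the held-out response estimates over the fROI voxels."""
    return sum(estimates_half2[voxel] for voxel in froi) / len(froi)


# Toy data: Semantics > Syntax t-values from the first half of the data.
t_sem_gt_syn_half1 = [0.2, 3.1, -0.5, 1.7, 2.6, 0.0, -1.2, 0.9, 4.0, 1.1,
                      0.3, -0.8, 2.2, 0.6, 1.9, -0.1, 0.4, 3.5, 0.7, 1.3]
# Toy response estimates from the second (independent) half of the data.
semantics_half2 = [0.5 + 0.05 * v for v in range(20)]
syntax_half2 = [0.4 + 0.05 * v for v in range(20)]

# Voxels are selected on one half and measured on the other half.
froi = define_froi(t_sem_gt_syn_half1)
semantics_response = froi_response(froi, semantics_half2)
syntax_response = froi_response(froi, syntax_half2)
print(froi, semantics_response, syntax_response)
```

Because voxel selection and response estimation use different halves of the data, a difference in the held-out half cannot be produced by the selection step itself.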
To formally estimate power for this analysis, we first need to evaluate the expected effect size for the contrast between semantic and syntactic conditions in participant-specific fROIs. We do this in several steps. First, we note that, based on a sample of n=352 participants (unpublished data from the Fedorenko lab), the average effect size for a robust contrast, between Sentences and Nonword-lists, is d = 1.41. Next, we note that in the left IFG, subtler contrasts, between different kinds of sentences, have been observed to elicit effect sizes that are about 60% of the Sentences > Nonwords effects (e.g., Blank et al., 2016). To err on the conservative side, we estimate the effect size in the current experiment to instead be 50% of the Sentences > Nonwords effect, i.e., d = 0.7. This estimate is consistent with two other observations. First, a separate experiment in our lab (Fedorenko et al., 2018) estimated the difference between the reading of sentences with semantic (wrong word) vs. morpho-syntactic (wrong inflection) violations to be of similar magnitude (d = 0.64). Second, the effect sizes within participant-specific fROIs that we report for the current experiment are 1.25 (LIFGorb), 0.95 (LIFGtri), and 0.64 (LIFGop). (The effect sizes reported in the Results are smaller because they were computed based on an independent-samples formula, which some claim is a more appropriate way to estimate effect sizes even for dependent-samples designs; in contrast, all estimates in the current paragraph are based on a dependent-samples formula, which is the estimate that is plugged into power calculations for dependent-samples designs.) For an estimated effect size of d = 0.64-0.7, at a significance level of α = 0.05, the power for our experiment is 75-82%.
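The order of magnitude of this power estimate can be checked with a small Monte Carlo simulation. This sketch is not the calculation used for the estimate above, and it rests on assumptions of our own: a one-tailed paired t-test (in line with the one-tailed tests described for the fROI analysis), n = 15, normally distributed paired differences with unit standard deviation, and the tabled critical value t(14) = 1.761 for a one-tailed test at the 0.05 level. The function name `simulated_power` is hypothetical.

```python
import math
import random

# Tabled critical value for df = 14, one-tailed test at the 0.05 level.
T_CRIT_DF14_ONE_TAILED = 1.761


def simulated_power(d, n=15, t_crit=T_CRIT_DF14_ONE_TAILED,
                    n_sims=20000, seed=0):
    """Estimate the power of a one-tailed paired t-test by simulation.

    Paired differences are drawn from a normal distribution with mean d and
    SD 1, so d is the dependent-samples effect size.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        diffs = [rng.gauss(d, 1.0) for _ in range(n)]
        mean = sum(diffs) / n
        var = sum((x - mean) ** 2 for x in diffs) / (n - 1)
        t = mean / math.sqrt(var / n)
        if t > t_crit:
            hits += 1
    return hits / n_sims


power_low = simulated_power(0.64)
power_high = simulated_power(0.70)
print(round(power_low, 2), round(power_high, 2))
```

Under these assumptions the simulated power should land close to the 75-82% range; a two-tailed test would give lower values.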
Results
Behavioral results.
Dapretto & Bookheimer (1999) collected behavioral data from the scanned participants in a separate behavioral study (conducted at least 6 months after the fMRI session), and found that the two conditions were comparable in difficulty. We observe similarly comparable accuracies and reaction times across conditions in our study (Fig. 2), although we collected the behavioral data during the scanning session. In particular, the accuracies for both conditions were close to 90% and not significantly different (Semantics condition: 87.3% ±7.7 (M±SD); Syntax condition: 88.2% ±6.7; paired-samples t(14)=0.30, p>0.76; Cohen’s d=0.11, based on a conservative independent-samples test). Similarly, the RTs did not differ either when considering all trials (Semantics condition: 0.55 s ±0.13; Syntax condition: 0.53 s ±0.12; paired-samples t(14)=−0.39, p=0.7; |d|<0.08), or when considering correctly answered trials only (Semantics condition: 0.52 s ±0.14; Syntax condition: 0.54 s ±0.13; paired-samples t(14)=0.78, p>0.44; d=0.11). Comparable behavioral performance suggests that whatever differences might be observed in neural responses between the two conditions would not be attributable to differences in cognitive effort.
fMRI results.
1. Traditional random-effects analysis.
Figure 3 shows whole-brain activation maps for each condition relative to the fixation baseline across the two studies. Visual examination of the maps suggests broad similarity between studies (see also Table 2 for a list of the activation peaks for each contrast in the current study), with, critically, robust responses detected for both contrasts in the left inferior frontal cortex.
Table 2:
Comparison | Syntax Condition | Semantics Condition | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
Region (Brodmann Area) | x | y | z | t | x | y | z | t | ||
Inferior Parietal Lobule | (BA 1) | L | −46 | −40 | −58 | 6.06 | ||||
Postcentral Gyrus | (BA 1) | L | −44 | −24 | 54 | 6.30 | ||||
Middle Frontal Gyrus | (BA 6) | L | −24 | −10 | 52 | 7.91 | ||||
Superior Frontal Gyrus | (BA 6) | L | −10 | 8 | 66 | 6.46 | ||||
Middle Frontal Gyrus | (BA 6) | L | −34 | 8 | 46 | 6.43 | ||||
Middle Frontal Gyrus | (BA 6) | L | −40 | 0 | 54 | 5.43 | ||||
Inferior Parietal Lobule | (BA 7) | L | −38 | −44 | 52 | 6.34 | ||||
Inferior Frontal Gyrus | (BA 8) | L | −42 | 6 | 32 | 5.69 | ||||
Medial Frontal Gyrus | (BA 8) | L | −8 | 22 | 48 | 7.91 | ||||
Middle Frontal Gyrus | (BA 8) | L | −46 | 14 | 46 | 7.57 | ||||
Middle Frontal Gyrus | (BA 9) | R | 52 | 26 | 30 | 6.85 | ||||
Insula | (BA 13) | L | −34 | 24 | 2 | 9.20 | ||||
Sub-Gyral | (BA 13) | R | 32 | 22 | 8 | 6.21 | ||||
Declive | (BA 19) | R | 16 | −64 | −30 | 8.26 | ||||
Middle Temporal Gyrus | (BA 21) | L | −54 | −46 | 2 | 5.80 | −56 | −34 | −4 | 8.86 |
Middle Temporal Gyrus | (BA 21) | L | −50 | −26 | −8 | 7.05 | −50 | −24 | −8 | 12.15 |
Middle Temporal Gyrus | (BA 21) | R | −50 | −34 | 0 | 8.29 | 62 | −40 | 0 | 6.81 |
Temporal Lobe | (BA 21) | R | 50 | −28 | −10 | 5.82 | ||||
Fusiform Gyrus | (BA 37) | L | −36 | −42 | −24 | 7.97 | ||||
Fusiform Gyrus | (BA 37) | R | 38 | −48 | −16 | 6.44 | ||||
Inferior Parietal Lobe | (BA 39) | L | −32 | −58 | 48 | 7.38 | ||||
Sub-Gyral | (BA 39) | L | −30 | −52 | 42 | 7.60 | ||||
Superior Temporal Gyrus | (BA 39) | L | −48 | −54 | 8 | 7.25 | ||||
Inferior Frontal Gyrus | (BA 44) | L | −50 | 12 | 26 | 6.61 | ||||
Inferior Frontal Gyrus | (BA 44) | R | 46 | 14 | 26 | 5.81 | ||||
Inferior Frontal Gyrus | (BA 47) | L | −34 | 26 | −2 | 8.90 | ||||
Caudate | (BA 48) | R | 10 | 8 | 6 | 6.51 | ||||
Thalamus | (BA 50) | L | −12 | −12 | 10 | 7.61 | ||||
Culmen | undefined | R | 2 | −54 | −22 | 6.72 | ||||
Sub-Gyral | undefined | R | 40 | −36 | 4 | 6.72 | ||||
Extra-Nuclear | undefined | R | 28 | 22 | 0 | 7.63 | ||||
Lentiform Nucleus | undefined | L | −16 | −2 | 8 | 5.96 | ||||
Superior Frontal Gyrus | undefined | L | 0 | 6 | 56 | 5.81 | ||||
Superior Frontal Gyrus | undefined | L | 0 | 20 | 56 | 5.49 | ||||
undefined | undefined | L | −8 | −58 | −34 | 5.90 | ||||
Comparison | Syntax vs. Semantics | Semantics vs. Syntax | ||||||||
Region (Brodmann Area) | x | y | z | t | x | y | z | t | ||
Precuneus | (BA 7) | R | 10 | −68 | 46 | 3.56 | ||||
Medial Frontal Gyrus | (BA 9) | L | −4 | 50 | 48 | 6.09 | ||||
Superior Frontal Gyrus | (BA 9) | L | −12 | 56 | 32 | 7.53 | ||||
Medial Frontal Gyrus | (BA 10) | L | −8 | 54 | 22 | 5.31 | ||||
Inferior Frontal Gyrus | (BA 47) | L | −46 | 38 | −12 | 6.10 |
Figure 4 shows the whole-brain activation map for the Semantics > Syntax contrast in Dapretto & Bookheimer’s study and our study (centering the crosshair on the same stereotactic location). The Syntax > Semantics contrast did not reveal any significant peaks at a threshold of either p < 0.0001 or p < 0.001. The group-level (as well as individual) maps for all four contrasts (including versions of the maps smoothed with a larger, 8mm, smoothing kernel) are available at https://osf.io/wtv9f/.
2. Activation-peak-based group-level ROI analysis.
Figure 5 shows mean responses to the Semantics and Syntax conditions, also broken down by construction (Active / Passive vs. DO / PP), in each of the two activation peaks reported in Dapretto & Bookheimer’s study. The peak reported as showing a Semantics > Syntax effect by Dapretto & Bookheimer showed a similar effect in our study (Semantics condition = 1.12±0.68 (M±SD), Syntax condition = 0.95±0.75, paired-samples t(14)=1.97, p=0.03; Cohen’s d=0.23 based on an independent-samples test). However, the peak originally reported for the Syntax > Semantics effect also exhibited stronger activation in the Semantics condition (Semantics condition = 0.82±0.51, Syntax condition = 0.63±0.48, t(14)=2.39, two-tailed p=0.03; d=0.37). Reducing the size of the ROI from a 10mm sphere to a 5mm sphere did not affect these results (Figure 5B), which plausibly reflect the low sensitivity of group-level ROIs (e.g., Nieto-Castañon & Fedorenko, 2012). Further, this pattern of results was descriptively similar across the two constructions, with significant Semantics > Syntax effects in both ROIs for the Active / Passive alternation (which was shared between the current study and the original study), and non-significant effects in the same direction for the DO / PP alternation. These results argue against the idea that the failure to replicate the dissociation is due to the changes in the materials.
3. Individual-subject functional ROI analysis.
Figure 6 shows responses to the critical conditions in individually defined functional ROIs. Here, half of the functional data was used to select the most Semantics- vs. Syntax-preferring voxels, in each participant separately and within each of the three sub-divisions of the LIFG. Then, responses in these voxels were independently estimated using the other half of the data. This analysis revealed reliable Semantics > Syntax effects in the Semantics > Syntax fROIs (i.e., fROIs consisting of the most Semantics-preferring voxels) within LIFGorb (Semantics condition = 0.79±0.49, Syntax condition = 0.37±0.43, t(14)=4.85, p=10⁻⁴; d = 0.88), LIFGtri (Semantics condition = 1.16±0.73, Syntax condition = 0.63±0.45, t(14)=3.70, p=0.001; d = 0.83), and LIFGop (Semantics condition = 0.94±0.64, Syntax condition = 0.69±0.55, t(14)=2.47, p=0.014; d = 0.40). These effects suggest that the LIFG may contain areas that show robustly and replicably (across runs) greater engagement during the Semantics condition than the Syntax condition (although we note that the effect in LIFGop would not survive correction for multiple comparisons). Furthermore, these results seem stable across the two constructions: reliable for the Active/Passive alternation in all three fROIs (LIFGorb: t(14)=4.69, p=0.0002, d=0.62; LIFGtri: t(14)=3.46, p=0.002, d=0.66; LIFGop: t(14)=2.18, p=0.02, d=0.34), and for the DO/PP alternation in the LIFGorb (t(14)=3.92, p=0.0008, d=1.07) and LIFGtri (t(14)=2.95, p=0.005, d=0.91), but not LIFGop (t(14)=1.63, p=0.06, d=0.39). In contrast, the analysis of Syntax > Semantics fROIs (i.e., fROIs consisting of the most Syntax-preferring voxels) did not reveal any replicable Syntax > Semantics effects in any of the three sub-divisions of the LIFG (ps>0.31). In fact, within the LIFGorb, the responses still showed a numerically stronger response to the Semantics than Syntax condition.
This is striking (given that we specifically searched for the most Syntax-preferring voxels) and suggests that no voxels within LIFG respond robustly and replicably (across runs) more strongly during the Syntax condition than the Semantics condition, at least in this paradigm. Given that subject-specific fROI analyses are characterized by high sensitivity (e.g., Nieto-Castañon & Fedorenko, 2012), these results increase our confidence that the original Dapretto & Bookheimer finding was a false positive.
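The two kinds of effect size mentioned in this paper (independent-samples and dependent-samples) can be illustrated with the LIFGorb values reported above. The sketch below uses textbook formulas that we assume for illustration: Cohen's d with the SD pooled over two equally sized groups, and the dependent-samples d obtained by dividing the paired t statistic by the square root of n. The exact independent-samples formula used for the reported values is not specified in the text, so the first function need not reproduce them exactly, especially when fed rounded summary statistics.

```python
import math


def cohens_d_independent(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d using the SD pooled over two equally sized groups."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


def cohens_d_dependent(t_value, n):
    """Dependent-samples effect size recovered from a paired t statistic."""
    return t_value / math.sqrt(n)


# LIFGorb, Semantics-preferring fROI (values as reported in the text).
d_independent = cohens_d_independent(0.79, 0.49, 0.37, 0.43)
d_dependent = cohens_d_dependent(4.85, 15)
print(round(d_independent, 2), round(d_dependent, 2))
```

The dependent-samples value is larger because it scales the mean difference by the variability of the paired differences, which is small when the two conditions covary strongly across participants.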
Discussion
To summarize, in a classic fMRI study, Dapretto & Bookheimer (1999) reported a dissociation between semantic and syntactic processing within the left inferior frontal gyrus. We here reported an fMRI study designed to conceptually replicate this early finding. We used the same two-condition design, but substantially expanded the set of experimental materials (five-fold), and included almost twice as many participants in order to increase statistical power. Although the group-level whole-brain maps contrasting each condition to a low-level fixation baseline revealed broad similarity between the two studies (and between the two conditions), the direct contrasts of the Semantics and Syntax conditions did not replicate the originally reported dissociation. In particular, we found a number of reliable activation peaks for the Semantics > Syntax contrast, including within the LIFG, but the Syntax > Semantics contrast did not produce any reliable peaks within the LIFG. In line with this whole-brain analysis, we found a similar pattern in group-level ROIs defined around the original Semantics > Syntax and Syntax > Semantics activation peaks from Dapretto & Bookheimer’s study: the Semantics condition elicited a reliably greater response in both the Semantics-peak ROIs and the Syntax-peak ROIs. Finally, in an individual-participant functional localization analysis, which circumvents inter-individual anatomical and functional variability (rampant in the left frontal lobe, e.g., Amunts et al., 1999; Tomaiuolo et al., 1999; Juch et al., 2005; Fedorenko et al., 2012b), we were able to detect reliably greater responses to the Semantics than Syntax condition within the orbital and triangular sub-divisions of the LIFG. However, nowhere within the LIFG were there regions that responded reliably more strongly during the processing of the Syntax condition compared to the Semantics condition.
Thus, the dissociation originally reported by Dapretto & Bookheimer does not appear to be robust to replication.
What can explain the non-replication of the original finding? The first, and perhaps most plausible, contributor is the fact that Dapretto & Bookheimer appear to have relied on an analysis that treated participants as fixed effects rather than random effects. In a fixed-effects analysis, individual participants are not viewed as being randomly drawn from the population. Consequently, the results cannot be generalized beyond the sample tested, and the effects could potentially be driven by a small subset of participants (or even a single participant). The seminal publication about this significant limitation in many early brain-imaging studies had only come out a year earlier (Holmes & Friston, 1998), and thus it is possible that the authors still relied on a fixed-effects analysis (it is difficult to determine this with certainty from the description provided in their Methods section).
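To make the contrast between the two analysis types concrete, the following is a minimal toy sketch, not a reconstruction of either study's analysis pipeline. All numbers are hypothetical: eight simulated participants, 40 trial-level contrast values each, with only one participant showing a true effect. The function names (`make_participant`, `fixed_effects_t`, `random_effects_t`) are ours and chosen for illustration only.

```python
import math
import statistics


def make_participant(mean_effect, n_trials=40):
    """Deterministic toy data: trial values alternate +1/-1 around the mean."""
    return [mean_effect + (1.0 if i % 2 == 0 else -1.0) for i in range(n_trials)]


def fixed_effects_t(data):
    """Pool all trials and use only within-participant (residual) variance."""
    all_values = [v for trials in data for v in trials]
    grand_mean = statistics.mean(all_values)
    rss = 0.0  # residual sum of squares around each participant's own mean
    df = 0     # residual degrees of freedom
    for trials in data:
        m = statistics.mean(trials)
        rss += sum((v - m) ** 2 for v in trials)
        df += len(trials) - 1
    standard_error = math.sqrt((rss / df) / len(all_values))
    return grand_mean / standard_error


def random_effects_t(data):
    """One summary value per participant; variance is between participants."""
    participant_means = [statistics.mean(trials) for trials in data]
    n = len(participant_means)
    standard_error = statistics.stdev(participant_means) / math.sqrt(n)
    return statistics.mean(participant_means) / standard_error


# Hypothetical sample: seven participants with no effect, one with a large effect.
data = [make_participant(0.0) for _ in range(7)] + [make_participant(4.0)]

t_fixed = fixed_effects_t(data)
t_random = random_effects_t(data)
print(f"fixed-effects t = {t_fixed:.2f}, random-effects t = {t_random:.2f}")
```

In this toy sample, the fixed-effects statistic is large (roughly 8.8) even though seven of the eight participants show no effect at all, whereas the random-effects statistic is 1.0 on 7 degrees of freedom, well below the conventional two-tailed .05 threshold of about 2.36. The single outlying participant drives the fixed-effects result, which is the scenario described above.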
Second, the original study used a small number of experimental items (8 per condition). This number is very low by the standards of language research, especially in cognitive neuroscience studies, where at least 20-30 items per condition are typically used to ensure generalizability. In the current study we used 40 unique trials per condition, and observed highly similar patterns across two distinct constructions. Thus, it is possible that in the original study, one or two of the items were driving the effects (see e.g., Bedny et al., 2007, for discussion).
It is also worth noting that Dapretto & Bookheimer, as is not uncommon in the fMRI literature, did not report the magnitudes of response to the Semantics and Syntax conditions. Thus, the effect size cannot be determined, only its significance (see Chen et al., 2017, for a discussion of this issue in fMRI research). More specifically, the significant peaks reported by Dapretto & Bookheimer are consistent with either of the hypothetical patterns shown in Figure 7. We suspect that the original result was more consistent with the possibility shown in the right panel of Figure 7, i.e., with small effect sizes. And small effects, especially observed in underpowered studies, are less likely to be real (e.g., Gelman & Carlin, 2014; Simonsohn, 2015; Open Science Collaboration, 2015).
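The point that a test statistic alone does not reveal the effect size can be illustrated with a short sketch. The per-participant differences below are invented for illustration (they are not data from either study), and `paired_t` is a hypothetical helper computing a one-sample t statistic on condition differences.

```python
import math
import statistics


def paired_t(differences):
    """One-sample t statistic on per-participant condition differences."""
    n = len(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return statistics.mean(differences) / standard_error


# Hypothetical per-participant differences between two conditions (arbitrary units).
large_effect = [1.0, 3.0, 2.0, 2.5, 1.5, 2.0, 3.0, 1.0]
# The same pattern shrunk ten-fold: a much smaller effect with proportionally less noise.
small_effect = [d / 10 for d in large_effect]

t_large = paired_t(large_effect)
t_small = paired_t(small_effect)
print(f"mean difference: {statistics.mean(large_effect):.2f} vs {statistics.mean(small_effect):.2f}")
print(f"t statistic:     {t_large:.2f} vs {t_small:.2f}")
```

The two hypothetical patterns differ ten-fold in the mean difference between conditions yet yield the same t statistic, so a report of significant peaks without response magnitudes cannot distinguish between them.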
To conclude, although the question of whether distinct pools of cognitive resources and cortical regions support lexico-semantic and syntactic processing is likely to keep generating controversy and further research (see also Fedorenko et al., 2018), we here found that at least one study that is commonly cited as evidence for this dissociation does not appear to replicate in an experiment with a similar design, materials, and greater statistical power. It may be important to ask, as researchers have recently done in the field of psychology (e.g., Open Science Collaboration, 2015), what proportion of fMRI studies are robust to replication (see Hong et al., 2019, for a discussion).
Highlights for Siegelman et al.
Is sentence structure processed by the same brain regions that are recruited for processing word meanings?
We try to replicate distinct peaks in LIFG for semantic vs. syntactic processing reported by Dapretto & Bookheimer (1999).
Parts of LIFG respond reliably more strongly during semantic than syntactic processing.
No part of LIFG (including in the region defined around the peak reported by D&B) shows the opposite pattern.
The original D&B finding is likely a false positive driven by a subset of participants/items in a fixed-effects analysis.
Acknowledgments
E.F. was supported by the NIH awards R00-HD057522, R01-DC016607, and R01-DC016950. We also acknowledge the Athinoula A. Martinos Imaging Center at the McGovern Institute for Brain Research, MIT. For technical support during scanning, the authors thank Atsushi Takahashi and Steve Shannon.
References:
- Allen K, Pereira F, Botvinick M, Goldberg AE. Distinguishing grammatical constructions with fMRI pattern analysis. Brain and Language. 2012;123(3):174–82.
- Amunts K, Schleicher A, Burgel U, Mohlberg H, Uylings HBM, Zilles K. Broca’s region revisited: Cytoarchitecture and intersubject variability. Journal of Comparative Neurology. 1999;412(2):319–41.
- Bates E, Goodman JC. On the inseparability of grammar and the lexicon: Evidence from acquisition, aphasia and real-time processing. Language and Cognitive Processes. 1997;12(5–6):507–84.
- Bautista A, Wilson SM. Neural responses to grammatically and lexically degraded speech. Language, Cognition and Neuroscience. 2016;31(4):567–74.
- Bedny M, Aguirre GK, Thompson-Schill SL. Item analysis in functional magnetic resonance imaging. NeuroImage. 2007;35(3):1093–102.
- Bemis DK, Pylkkänen L. Basic linguistic composition recruits the left anterior temporal lobe and left angular gyrus during both listening and reading. Cerebral Cortex. 2012;23(8):1859–73.
- Blank I, Balewski Z, Mahowald K, Fedorenko E. Syntactic processing is distributed across the language system. NeuroImage. 2016;127:307–23.
- Braze D, Mencl WE, Tabor W, Pugh KR, Constable RT, Fulbright RK, Magnuson JS, Van Dyke JA, Shankweiler DP. Unification of sentence processing via ear and eye: An fMRI study. Cortex. 2011;47(4):416–31.
- Buchweitz A, Mason RA, Tomitch L, Just MA. Brain activation for reading and listening comprehension: An fMRI study of modality effects and individual differences in language comprehension. Psychology & Neuroscience. 2009;2(2):111–23.
- Button KS, Ioannidis JP, Mokrysz C, Nosek BA, Flint J, Robinson ES, Munafo MR. Power failure: why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience. 2013;14(5):365–76.
- Chen G, Taylor PA, Cox RW. Is the statistic value all we should care about in neuroimaging? NeuroImage. 2017;147:952–9.
- Cooke A, Grossman M, DeVita C, Gonzalez-Atavales J, Moore P, Chen W, Gee J, Detre J. Large-scale neural network for sentence processing. Brain and Language. 2006;96(1):14–36.
- Dale AM. Optimal experimental design for event-related fMRI. Human Brain Mapping. 1999;8(2-3):109–14.
- Dapretto M, Bookheimer SY. Form and content: Dissociating syntax and semantics in sentence comprehension. Neuron. 1999;24(2):427–32.
- Embick D, Marantz A, Miyashita Y, O’Neil W, Sakai KL. A syntactic specialization for Broca’s area. Proceedings of the National Academy of Sciences of the United States of America. 2000;97(11):6150–4.
- Fedorenko E, Mineroff Z, Siegelman M, Blank I. Word meanings and sentence structure recruit the same set of fronto-temporal regions during comprehension. bioRxiv. 2018:477851.
- Fedorenko E, Duncan J, Kanwisher N. Language-selective and domain-general regions lie side by side within Broca’s area. Current Biology. 2012b;22(21):2059–62.
- Fedorenko E, Hsieh PJ, Nieto-Castanon A, Whitfield-Gabrieli S, Kanwisher N. New method for fMRI investigations of language: Defining ROIs functionally in individual subjects. Journal of Neurophysiology. 2010;104(2):1177–94.
- Fedorenko E, Nieto-Castanon A, Kanwisher N. Lexical and syntactic representations in the brain: An fMRI investigation with multi-voxel pattern analyses. Neuropsychologia. 2012a;50(4):499–513.
- Fedorenko E, Scott TL, Brunner P, Coon WG, Pritchett B, Schalk G, Kanwisher N. Neural correlate of the construction of sentence meaning. Proceedings of the National Academy of Sciences. 2016;113(41):E6256–62.
- Friederici AD, Kotz SA, Scott SK, Obleser J. Disentangling syntax and intelligibility in auditory language comprehension. Human Brain Mapping. 2010;31(3):448–57.
- Friederici AD, Meyer M, von Cramon DY. Auditory language comprehension: An event-related fMRI study on the processing of syntactic and lexical information. Brain and Language. 2000;74(2):289–300.
- Gelman A, Carlin J. Beyond power calculations: Assessing Type S (Sign) and Type M (Magnitude) errors. Perspectives on Psychological Science. 2014;9(6):641–51.
- Gibson E, Bergen L, Piantadosi ST. Rational integration of noisy evidence and prior semantic expectations in sentence interpretation. Proceedings of the National Academy of Sciences of the United States of America. 2013;110(20):8051–6.
- Hagoort P, Brown C, Groothusen J. The syntactic positive shift (SPS) as an ERP measure of syntactic processing. Language and Cognitive Processes. 1993;8(4):439–83.
- Herrmann B, Obleser J, Kalberlah C, Haynes JD, Friederici AD. Dissociable neural imprints of perception and grammar in auditory functional imaging. Human Brain Mapping. 2012;33(3):584–95.
- Holmes A, Friston K. Generalisability, random effects and population inference. NeuroImage. 1998;7(4):S754.
- Hong Y, Yoo Y, Wager T, Woo CW. False-positive neuroimaging: Undisclosed flexibility in testing spatial hypotheses allows presenting anything as a replicated finding. bioRxiv. 2019:514521.
- Humphries C, Binder JR, Medler DA, Liebenthal E. Syntactic and semantic modulation of neural activity during auditory sentence comprehension. Journal of Cognitive Neuroscience. 2006;18(4):665–79.
- Ioannidis JPA. Why most published research findings are false. PLOS Medicine. 2005;2(8):696–701.
- Ioannidis JPA, Munafo MR, Fusar-Poli P, Nosek BA, David SP. Publication and other reporting biases in cognitive sciences: detection, prevalence, and prevention. Trends in Cognitive Sciences. 2014;18(5):235–41.
- Juch H, Zimine I, Seghier ML, Lazeyras F, Fasel JHD. Anatomical variability of the lateral frontal lobe surface: implication for intersubject variability in language neuroimaging. NeuroImage. 2005;24(2):504–14.
- Kuperberg GR, Holcomb PJ, Sitnikova T, Greve D, Dale AM, Caplan D. Distinct patterns of neural modulation during the processing of conceptual and syntactic anomalies. Journal of Cognitive Neuroscience. 2003;15(2):272–93.
- Kutas M, Federmeier KD. Thirty years and counting: Finding meaning in the N400 component of the event-related brain potential (ERP). Annual Review of Psychology. 2011;62:621–47.
- Kutas M, Hillyard SA. Event-related brain potentials to semantically inappropriate and surprisingly large words. Biological Psychology. 1980;11(2):99–116.
- Menenti L, Gierhan SME, Segaert K, Hagoort P. Shared language: Overlap and segregation of the neuronal infrastructure for speaking and listening revealed by functional MRI. Psychological Science. 2011;22(9):1173–82.
- Nee DE. fMRI replicability depends upon sufficient individual-level data. Communications Biology. 2019;2(1):130.
- Nieto-Castanon A, Fedorenko E. Subject-specific functional localizers increase sensitivity and functional resolution of multi-subject analyses. NeuroImage. 2012;63(3):1646–69.
- Noppeney U, Price CJ. An fMRI study of syntactic adaptation. Journal of Cognitive Neuroscience. 2004;16(4):702–13.
- Osterhout L, Holcomb PJ. Event-related brain potentials elicited by syntactic anomaly. Journal of Memory and Language. 1992;31(6):785–806.
- Poldrack RA, Gorgolewski KJ. OpenfMRI: Open sharing of task fMRI data. NeuroImage. 2017;144:259–61.
- Santi A, Grodzinsky Y. fMRI adaptation dissociates syntactic complexity dimensions. NeuroImage. 2010;51(4):1285–93.
- Saxe R, Brett M, Kanwisher N. Divide and conquer: a defense of functional localizers. NeuroImage. 2006;30(4):1088–96.
- Schmidt S. Shall we really do it again? The powerful concept of replication is neglected in the social sciences. Review of General Psychology. 2009;13(2):90.
- Scott TL, Gallee J, Fedorenko E. A new fun and robust version of an fMRI localizer for the frontotemporal language system. Cognitive Neuroscience. 2016:1–10.
- Segaert K, Menenti L, Weber K, Petersson KM, Hagoort P. Shared syntax in language production and language comprehension - an fMRI Study. Cerebral Cortex. 2012;22(7):1662–70.
- Simmons JP, Nelson LD, Simonsohn U. False-positive psychology: undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science. 2011;22(11):1359–66.
- Simonsohn U. Small telescopes: Detectability and the evaluation of replication results. Psychological Science. 2015;26(5):559–69.
- The Open Science Collaboration. Estimating the reproducibility of psychological science. Science. 2015;349(6251):aac4716.
- Thesen S, Heid O, Mueller E, Schad LR. Prospective acquisition correction for head motion with image-based tracking for real-time fMRI. Magnetic Resonance in Medicine. 2000;44(3):457–63.
- Thirion B, Pinel P, Mériaux S, Roche A, Dehaene S, Poline JB. Analysis of a large fMRI cohort: Statistical and methodological issues for group analyses. NeuroImage. 2007;35(1):105–20.
- Tomaiuolo F, MacDonald JD, Caramanos Z, Posner G, Chiavaras M, Evans AC, Petrides M. Morphology, morphometry and probability mapping of the pars opercularis of the inferior frontal gyrus: an in vivo MRI analysis. European Journal of Neuroscience. 1999;11(9):3033–46.
- Tzourio-Mazoyer N, Landeau B, Papathanassiou D, Crivello F, Etard O, Delcroix N, Mazoyer B, Joliot M. Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. NeuroImage. 2002;15(1):273–89.
- Vagharchakian L, Dehaene-Lambertz G, Pallier C, Dehaene S. A temporal bottleneck in the language comprehension network. Journal of Neuroscience. 2012;32(26):9089–102.