Cerebral Cortex (New York, NY). 2023 Apr 1;33(12):7904–7929. doi: 10.1093/cercor/bhad087

The human language system, including its inferior frontal component in “Broca’s area,” does not support music perception

Xuanyi Chen, Josef Affourtit, Rachel Ryskin, Tamar I Regev, Samuel Norman-Haignere, Olessia Jouravlev, Saima Malik-Moraleda, Hope Kean, Rosemary Varley, Evelina Fedorenko
PMCID: PMC10505454  PMID: 37005063

Abstract

Language and music are two human-unique capacities whose relationship remains debated. Some have argued for overlap in processing mechanisms, especially for structure processing. Such claims often concern the inferior frontal component of the language system located within “Broca’s area.” However, others have failed to find overlap. Using a robust individual-subject fMRI approach, we examined the responses of language brain regions to music stimuli, and probed the musical abilities of individuals with severe aphasia. Across 4 experiments, we obtained a clear answer: music perception does not engage the language system, and judgments about music structure are possible even in the presence of severe damage to the language network. In particular, the language regions’ responses to music are generally low, often below the fixation baseline, and never exceed responses elicited by nonmusic auditory conditions, like animal sounds. Furthermore, the language regions are not sensitive to music structure: they show low responses to both intact and structure-scrambled music, and to melodies with vs. without structural violations. Finally, in line with past patient investigations, individuals with aphasia, who cannot judge sentence grammaticality, perform well on melody well-formedness judgments. Thus, the mechanisms that process structure in language do not appear to process music, including music syntax.

Keywords: language, music, syntactic processing, fMRI, domain specificity

Introduction

To interpret language or appreciate music, we must understand how different elements—words in language, notes and chords in music—relate to each other. Parallels between the structural properties of language and music have been drawn for over a century (e.g. Riemann 1877, as cited in Swain 1995; Lindblom and Sundberg 1969; Fay 1971; Boilès 1973; Cooper 1973; Bernstein 1976; Sundberg and Lindblom 1976; Lerdahl and Jackendoff 1977, 1983; Roads and Wieneke 1979; Krumhansl and Keil 1982; Baroni et al. 1983; Swain 1995; cf. Jackendoff 2009; Temperley 2022). However, the question of whether music processing relies on the same mechanisms as those that support language processing continues to spark debate.

The empirical landscape is complex. A large number of studies have argued for overlap in structural processing based on behavioral (e.g. Fedorenko et al. 2009; Slevc et al. 2009; Hoch et al. 2011; Van de Cavey and Hartsuiker 2016; Kunert et al. 2016), ERP (e.g. Janata 1995; Patel et al. 1998; Koelsch et al. 2000), MEG (e.g. Maess et al. 2001), fMRI (e.g. Koelsch et al. 2002; Levitin and Menon 2003; Tillmann et al. 2003; Koelsch 2006; Kunert et al. 2015; Musso et al. 2015), and ECoG (e.g. Sammler et al. 2009, 2013; te Rietmolen et al. 2022) evidence (see Tillmann 2012; Kunert and Slevc 2015; LaCroix et al. 2015, for reviews). However, we would argue that no prior study has compellingly established reliance on shared syntactic processing mechanisms in language and music.

First, evidence from behavioral, ERP, and, to a large extent, MEG studies is indirect because these methods do not allow one to determine unambiguously where neural responses originate (in ERP and MEG, this is because of the “inverse problem”; Tarantola 2005; Baillet 2014).

Second, the bulk of the evidence comes from structure-violation paradigms. In such paradigms, responses to the critical condition—which contains an element that violates the rules of tonal music—are contrasted with responses to the control condition, where stimuli obey the rules of tonal music. (For language, syntactic violations, like violations of number agreement, are often used.) Because structural violations (across domains) constitute unexpected events, a brain region that responds more strongly to the structure-violation condition than the control (no violation) condition may support structure processing in music, but it may also reflect domain-general processes, like attention or error detection/correction (e.g. Bigand et al. 2001; Poulin-Charronnat et al. 2005; Tillmann et al. 2006; Hoch et al. 2011; Perruchet and Poulin-Charronnat 2013) or low-level sensory effects (e.g. Bigand et al. 2014; Collins et al. 2014; cf. Koelsch et al. 2007). In order to argue that a brain region that shows a structure-violation > no violation effect supports structure processing in music, one would need to establish that this brain region (i) is selective for structural violations and does not respond to unexpected nonstructural (but similarly salient) events in music or other domains, and (ii) responds to music stimuli even when no violation is present. This latter point is (surprisingly) not often discussed but is deeply important: if a brain region supports the processing of music structure, it should be engaged whenever music is processed (similar to how language areas respond robustly to well-formed sentences, in addition to showing sensitivity to violated linguistic expectations; e.g. Fedorenko et al. 2020). After all, in order to detect a structural violation, a brain region needs to process the structure of the preceding context, which implies that it should be working whenever a music stimulus is present. No previous study has established both of the properties above—selectivity for structural relative to nonstructural violations and robust responses to music stimuli with no violations—for the brain regions that have been argued to support structure processing in music (and to overlap with regions that support structure processing in language). In fact, some studies that have compared unexpected structural and nonstructural events in music (e.g. a timbre change) have reported similar neural responses in fMRI (e.g. Koelsch et al. 2002; cf. some differences in EEG effects—e.g. Koelsch et al. 2001). Relatedly, and in support of the idea that effects of music structure violations largely reflect domain-general attentional effects, meta-analyses of neural responses to unexpected events across domains (e.g. Corbetta and Shulman 2002; Fouragnan et al. 2018; Corlett et al. 2021) have identified regions that grossly resemble those reported in studies of music structure violations (see Fedorenko and Varley 2016 for discussion).

Third, most prior fMRI (and MEG) investigations have relied on comparisons of group-level activation maps. Such analyses suffer from low functional resolution (e.g. Nieto-Castañón and Fedorenko 2012; Fedorenko 2021), especially in cases where the precise locations of functional regions vary across individuals, as in the association cortex (Fischl et al. 2008; Frost and Goebel 2012; Tahmasebi et al. 2012; Vázquez-Rodríguez et al. 2019). Thus, observing activation overlap at the group level does not unequivocally support shared mechanisms. Indeed, studies that have used individual-subject-level analyses have reported a low or no response to music in the language-responsive regions (Fedorenko et al. 2011; Rogalsky et al. 2011; Deen et al. 2015).

Fourth, the interpretation of some of the observed effects has relied on the so-called “reverse inference” (Poldrack 2006, 2011; Fedorenko 2021), where function is inferred from a coarse anatomical location: for example, some music-structure-related effects observed in or around “Broca’s area” have been interpreted as reflecting the engagement of linguistic-structure-processing mechanisms (e.g. Maess et al. 2001; Koelsch et al. 2002) given the long-standing association between “Broca’s area” and language, including syntactic processing specifically (e.g. Caramazza and Zurif 1976; Friederici et al. 2006). However, this reasoning is not valid: Broca’s area is a heterogeneous region, which houses components of at least two functionally distinct brain networks (Fedorenko et al. 2012a; Fedorenko and Blank 2020): the language-selective network, which responds during language processing, visual or auditory, but does not respond to diverse nonlinguistic stimuli (Fedorenko et al. 2011; Monti et al. 2009, 2012; see Fedorenko and Varley 2016 for a review) and the domain-general executive control or “multiple demand (MD)” network, which responds to any demanding cognitive task and is robustly modulated by task difficulty (Duncan 2010, 2013; Fedorenko et al. 2013; Assem et al. 2020). As a result, here and more generally, functional interpretation based on coarse anatomical localization is not justified.

Fifth, many prior fMRI investigations have not reported the magnitudes of response to the relevant conditions and only examined statistical significance maps for the contrast of interest (e.g. a whole-brain map showing voxels that respond reliably more strongly to melodies with vs. without a structural violation, and to sentences with vs. without a structural violation). Response magnitudes of experimental conditions relative to a low-level baseline and to each other are critical for interpreting a functional profile of a brain region (see e.g. Chen et al. 2017, for discussion). For example, a reliable violation > no violation effect in music (similar arguments apply to language) could be observed when both conditions elicit above-baseline responses, and the violation condition elicits a stronger response (Fig. 1A, left bar graph)—a reasonable profile for a brain region that supports music processing and is sensitive to the target structural manipulation. However, a reliable violation > no violation effect could also be observed when both conditions elicit below-baseline responses, and the violation condition elicits a less negative response (Fig. 1A, middle bar graph), or when both conditions elicit low responses—in the presence of a strong response to stimuli in other domains—and the between-condition difference is small (Fig. 1A, right bar graph; note that with sufficient power even very small effects can be highly reliable, but this does not make them theoretically meaningful; e.g. Cumming 2012; Sullivan and Feinn 2012). The two latter profiles, where a brain region is more active during silence than when listening to music, or when the response is overall low and the effect of interest is minuscule, would be harder to reconcile with a role of this brain region in music processing (see also the second point above).

Fig. 1.

Illustration of the importance of examining the magnitudes of neural response to the experimental conditions rather than only the statistical significance maps for the contrast(s) of interest. A significant violation > no violation effect (A) and overlap between a significant violation > no violation effect in language vs. in music (B) are each compatible with multiple distinct functional profiles, only one of which (on the left in each case) supports the typically proposed interpretation: a region that processes structure in some domain of interest (in A), or a region that processes structure across domains, in both language and music (in B).

Similarly, with respect to the music-language overlap question, a reliable violation > no violation effect for both language and music could be observed in a brain region where sentences and melodies with violations elicit similarly strong responses, and those without violations elicit lower responses (Fig. 1B, left bar graph); but it could also arise in a brain region where sentences with violations elicit a strong response, sentences without violations elicit a lower response, but melodies elicit an overall low response, with the violation condition eliciting a higher response than the no-violation condition (Fig. 1B, right bar graph). Whereas in the first case, it may be reasonable to argue that the brain region in question supports some computation that is necessary to process structure violations in both domains, such interpretation would not be straightforward in the second case. In particular, given the large main effect of language > music, any account of possible computations supported by such a brain region would need to explain this difference instead of simply focusing on the presence of a reliable effect of violation in both domains. In summary, without examining the magnitudes of response, it is not possible to distinguish among many, potentially very different, functional profiles, without which formulating hypotheses about a brain region’s computations is precarious.

Aside from the limitations above, to the best of our knowledge, all prior brain imaging studies have used a single manipulation in one set of materials and one set of participants. To compellingly argue that a brain region supports (some aspects of) structural processing in both language and music, it is important to establish both the robustness of the key effect by replicating it with a new set of experimental materials and/or in a new group of participants, and its generalizability to other contrasts between conditions that engage the hypothesized computation and ones that do not. For example, to argue that a brain region houses a core syntactic mechanism needed to process hierarchical relations and/or recursion in both language and music (e.g. Patel 2003; Fadiga et al. 2009; Roberts 2012; Koelsch et al. 2013; Fitch and Martins 2014), one would need to demonstrate that this region (i) responds robustly to diverse structured linguistic and musical stimuli (which all invoke the hypothesized shared computation), (ii) shows replicable responses across materials and participants, and (iii) is sensitive to more than a single manipulation targeting the hypothesized computations specifically, as needed to rule out paradigm-/task-specific accounts (e.g. structured vs. unstructured stimuli, stimuli with vs. without structural violations, stimuli that are more vs. less structurally complex—e.g. with long-distance vs. local dependencies, adaptation to structure vs. some other aspect of the stimulus, etc.).

Finally, the neuropsychological patient evidence is at odds with the idea of shared mechanisms for processing language and music. If language and music relied on the same syntactic processing mechanism, individuals who are impaired in their processing of linguistic syntax should also exhibit impairments in musical syntax. Although some prior studies report subtle musical deficits in patients with aphasia (Patel et al. 2008a; Sammler et al. 2011), the evidence is equivocal, and many aphasic patients appear to have little or no difficulties with music, including the processing of music structure (Luria et al. 1965; Brust 1980; Marin 1982; Basso and Capitani 1985; Polk and Kertesz 1993; Slevc et al. 2016; Faroqi-Shah et al. 2020; Chiappetta et al. 2022; cf. Omigie and Samson 2014 and Sihvonen et al. 2017 for discussions of evidence that musical training may lead to better outcomes following brain damage/resection). Similarly, children with Specific Language Impairment (now called Developmental Language Disorder)—a developmental disorder that affects several aspects of linguistic and cognitive processing, including syntactic processing (e.g. Bortolini et al. 1998; Bishop and Norbury 2002)—show no impairments in musical processing (Fancourt 2013; cf. Jentschke et al. 2008). In an attempt to reconcile the evidence from acquired and developmental disorders with claims about structure-processing overlap based on behavioral and neural evidence from neurotypical participants, Patel (2003, 2008, 2012; see Slevc and Okada 2015, Patel and Morgan 2017, and Asano et al. 2021 for related proposals) put forward a hypothesis whereby the representations that mediate language and music are stored in distinct brain areas, but the mechanisms that perform online computations on those representations are partially overlapping. We return to this idea in the Discussion.

To bring clarity to this ongoing debate, we conducted 3 fMRI experiments with neurotypical adults, and a behavioral study with individuals with severe aphasia. For the fMRI experiments, we took an approach where we focused on the “language network”—a well-characterized set of left frontal and temporal brain areas that selectively support linguistic processing (e.g. Fedorenko et al. 2011), and asked whether any parts of this network show responses to music and sensitivity to music structure. In each experiment, we used an extensively validated language “localizer” task based on the reading of sentences and nonword sequences (Fedorenko et al. 2010; see Scott et al. 2017 and Malik-Moraleda, Ayyash et al. 2022 for evidence that this localizer is modality independent) to identify language-responsive areas in each participant individually. Importantly, these areas have been shown, across dozens of brain imaging studies, to be robustly sensitive to linguistic syntactic processing demands in diverse manipulations (e.g. Keller et al. 2001; Röder et al. 2002; Friederici 2011; Pallier et al. 2011; Bautista and Wilson 2016, among many others)—including when defined with the same localizer as the one used here (e.g. Fedorenko et al. 2010, 2012a, 2020; Blank et al. 2016; Shain, Blank et al. 2020; Shain et al. 2021, 2022)—and their damage leads to linguistic, including syntactic, deficits (e.g. Caplan et al. 1996; Dick et al. 2001; Wilson and Saygin 2004; Tyler et al. 2011; Wilson et al. 2012; Mesulam et al. 2014; Ding et al. 2020; Matchin and Hickok 2020, among many others). To address the critical research question, we examined the responses of these language areas to music, and their necessity for processing music structure. In experiment 1, we included several types of music stimuli including orchestral music, single-instrument music, synthetic drum music, and synthetic melodies, a minimal comparison between songs and spoken lyrics, and a set of nonmusic auditory control conditions. We additionally examined sensitivity to structure in music across 2 structure-scrambling manipulations. In experiment 2, we further probed sensitivity to structure in music using the most common manipulation, contrasting responses to well-formed melodies vs. melodies containing a note that does not obey the constraints of Western tonal music. And in experiment 3, we examined the ability to discriminate between well-formed melodies and melodies containing a structural violation in 3 profoundly aphasic individuals across 2 tasks. Finally, in experiment 4, we examined the responses of the language regions to yet another set of music stimuli in a new set of participants. Furthermore, these participants were all native speakers of Mandarin, a tonal language, which allowed us to evaluate the hypothesis that language regions may play a greater role in music processing in individuals with higher sensitivity to linguistic pitch (e.g. Deutsch et al. 2006, 2009; Bidelman et al. 2011; Creel et al. 2018; Ngo et al. 2016; Liu et al. 2021).

Materials and methods

Participants

Experiments 1, 2, and 4 (fMRI)

A total of 48 individuals (age 18–51, mean 24.3; 28 females, 20 males) from the Cambridge/Boston, MA, community participated for payment across 3 fMRI experiments (n = 18 in experiment 1; n = 20 in experiment 2; n = 18 in experiment 4; 8 participants overlapped between experiments 1 and 2). Overall, 33 participants were right-handed and 4 left-handed, as determined by the Edinburgh handedness inventory (Oldfield 1971), or self-report (see Willems et al. 2014, for arguments for including left-handers in cognitive neuroscience research); the handedness data for the remaining 11 participants (one in experiment 2 and 10 in experiment 4) were not collected. All but one participant (with no handedness information) in experiment 4 showed typical left-lateralized language activations in the language localizer task described below (as assessed by numbers of voxels falling within the language parcels in the left vs. right hemisphere (LH vs. RH), using the following formula: (LH − RH)/(LH + RH); e.g. Jouravlev et al. 2020; individuals with values of 0.25 or greater were considered to have a left-lateralized language system). For the participant with right-lateralized language activations (with a lateralization value at or below −0.25), we used right-hemisphere language regions for the analyses (see SI3 for versions of the analyses where the LH language regions were used for this participant and where this participant was excluded; the critical results were not affected). Participants in experiments 1 and 2 were native English speakers; participants in experiment 4 were native Mandarin speakers and proficient speakers of English (none had any knowledge of Russian, which was used in the unfamiliar foreign-language condition in experiment 4). Detailed information on the participants’ music background was, unfortunately, not collected, except for ensuring that the participants were not professional musicians. All participants gave informed written consent in accordance with the requirements of MIT’s Committee on the Use of Humans as Experimental Subjects.
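
For concreteness, the lateralization criterion described above can be expressed as a short computation (shown here in Python). The voxel counts in the example and the helper function names are purely illustrative and are not part of the study's analysis code.

    def lateralization_index(lh_voxels, rh_voxels):
        # (LH - RH) / (LH + RH): ranges from -1 (fully right-lateralized)
        # to +1 (fully left-lateralized).
        return (lh_voxels - rh_voxels) / (lh_voxels + rh_voxels)

    def classify_lateralization(index):
        # Thresholds used in the text: 0.25 or greater counts as left-lateralized,
        # -0.25 or lower as right-lateralized.
        if index >= 0.25:
            return "left-lateralized"
        if index <= -0.25:
            return "right-lateralized"
        return "not clearly lateralized"

    # Hypothetical counts: 1,200 supra-threshold voxels in the LH parcels, 400 in the RH parcels.
    print(classify_lateralization(lateralization_index(1200, 400)))  # index = 0.5 -> "left-lateralized"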

Experiment 3 (behavioral)

Individuals with aphasia

Three participants with severe and chronic aphasia were recruited to the study (SA, PR, and PP). All participants gave informed consent in accordance with the requirements of UCL’s Institutional Review Board. Background information on each participant is presented in Table 1. Anatomical scans are shown in Fig. 2A and extensive perisylvian damage in the left hemisphere, encompassing areas where language activity is observed in neurotypical individuals, is illustrated in Fig. 2B.

Table 1.

Background information on the participants with aphasia.

Patient | Sex | Age (years) at testing | Time post-onset (years) at testing | Handedness | Etiology | Premorbid musical experience | Premorbid employment
SA | M | 67 | 21 | R | Subdural empyema | Sang in choir; basic sight-reading ability. No formal training. | Police sergeant
PR | M | 68 | 14 | L | Left hemisphere stroke | Drummer in band; basic sight-reading ability. No formal training. | Retail manager
PP | M | 77 | 10 | R | Left hemisphere stroke | Childhood musical training (5 years). No adult experience. | Minerals trader
Fig. 2.

A) Anatomical scans (T2-weighted for SA, T1-weighted for PR and PP) of the aphasic participants (all scans were performed during the chronic phase, as can be seen from the ventricular enlargement). Note that the right side of the image represents the left side of the brain. B) PR’s (top) and PP’s (bottom) anatomical scans (blue-tinted) shown with the probabilistic activation overlap map for the fronto-temporal language network overlaid (SA’s raw anatomical data were not available). The map was created by overlaying thresholded individual activation maps (red-tinted) for the sentences > nonwords contrast (Fedorenko et al. 2010) in 220 neurotypical participants (none of whom were participants in any experiments in the current study). As the images show, the language network falls largely within the lesioned tissue in the left hemisphere. C) Performance of the control participants and participants with aphasia on 2 measures of linguistic syntax processing (see design, materials, and procedure—experiment 3): the comprehension of spoken reversible sentences (top), and the spoken grammaticality judgments (bottom). The densities show the distribution of proportion correct scores in the control participants and the boxplot shows the quartiles of the control population (the whiskers show 1.5× the interquartile range and points represent outliers). The dots show individual participants (for the individuals with aphasia, the initials indicate the specific participant). Dashed gray lines indicate chance performance.

Control participants

We used Amazon.com’s Mechanical Turk platform to recruit normative samples for the music tasks and a subset of the language tasks that are most critical to linguistic syntactic comprehension. Ample evidence now shows that online experiments yield data that closely mirror the data patterns in experiments conducted in a lab setting (e.g. Crump et al. 2013). Data from participants with IP addresses in the United States who self-reported being native English speakers were included in the analyses. A total of 50 participants performed the critical music task, and the Scale task from the Montreal Battery for the Evaluation of Amusia (MBEA; Peretz et al. 2003), as detailed below. Data from participants who responded incorrectly to the catch trial in the MBEA Scale task (n = 5) were excluded from the analyses, for a final sample of 45 control participants for the music tasks. A separate sample of 50 participants performed the Comprehension of spoken reversible sentences task. Data from one participant who completed fewer than 75% of the questions and another participant who did not report being a native English speaker were excluded for a final sample of 48 control participants. Finally, a third sample of 50 participants performed the Spoken grammaticality judgment task. Data from one participant who did not report being a native English speaker were excluded for a final sample of 49 control participants.

Design, materials, and procedure

Experiments 1, 2, and 4 (fMRI)

Each participant completed a language localizer task (Fedorenko et al. 2010) and one or more of the critical music perception experiments, along with one or more tasks for unrelated studies. The scanning sessions lasted ~2 h.

Language localizer

This task is described in detail in Fedorenko et al. (2010) and subsequent studies from the Fedorenko lab (e.g. Fedorenko et al. 2011, 2020; Blank et al. 2014, 2016; Pritchett et al. 2018; Paunov et al. 2019; Shain, Blank et al. 2020, among others) and is available for download from https://evlab.mit.edu/funcloc/. Briefly, participants read sentences and lists of unconnected, pronounceable nonwords in a blocked design. Stimuli were presented one word/nonword at a time at the rate of 450 ms per word/nonword. Participants read the materials passively and performed a simple button-press task at the end of each trial (included in order to help participants remain alert). Each participant completed 2 ~6-min runs. This localizer task has been extensively validated and shown to be robust to changes in the materials, modality of presentation (visual vs. auditory), and task (e.g. Fedorenko et al. 2010; Fedorenko 2014; Scott et al. 2017; Diachek, Blank, Siegelman et al. 2020; Malik-Moraleda, Ayyash et al. 2022; Lipkin et al. 2022; see the results of experiments 1 and 4 for additional replications of modality robustness). Furthermore, a network that corresponds closely to the localizer contrast (sentences > nonwords) emerges robustly from whole-brain task-free data—voxel fluctuations during rest (e.g. Braga et al. 2020; see Salvo et al. 2021 for a general discussion of how well-validated localizers tend to show tight correspondence with intrinsic networks recovered in a data-driven way). The fact that different regions of the language network show strong correlations in their activity during naturalistic cognition (see also Blank et al. 2014; Paunov et al. 2019; Malik-Moraleda, Ayyash et al. 2022) provides support for the idea that this network constitutes a “natural kind” in the brain (a subset of the brain that is strongly internally integrated and robustly dissociable from the rest of the brain) and thus a meaningful unit of analysis. However, we also examine individual regions of this network, to paint a more complete picture, given that many past claims about language-music overlap have specifically concerned the inferior frontal component of the language network.

Experiment 1

Participants passively listened to diverse stimuli across 18 conditions in a long-event-related design. The materials for this and all other experiments are available at OSF: https://osf.io/68y7c/. All stimuli were 9 s in length. The conditions were selected to probe responses to music, to examine sensitivity to structure scrambling in music, to compare responses to songs vs. spoken lyrics, and to compare responses to music stimuli vs. other auditory stimuli.

The 4 nonvocal music conditions (all Western tonal music) included orchestral music, single-instrument music, synthetic drum music, and synthetic melodies (see SI5 for a summary of the acoustic properties of these and other conditions, as quantified with the MIR toolbox; Lartillot and Toiviainen 2007; Lartillot and Grandjean 2019). The orchestral music condition consisted of 12 stimuli (SI-Table 4a) selected from classical orchestras or jazz bands. The single-instrument music condition consisted of 12 stimuli (SI-Table 4b) that were played on one of the following instruments: cello (n = 1), flute (n = 1), guitar (n = 4), piano (n = 4), sax (n = 1), or violin (n = 1). The synthetic drum music condition consisted of 12 stimuli synthesized using percussion patches from MIDI files taken from freely available online collections. The stimuli were synthesized using the MIDI toolbox for MATLAB (writemidi). The synthetic melodies condition consisted of 12 stimuli transcribed from folk tunes obtained from freely available online collections. Each melody was defined by a sequence of notes with corresponding pitches and durations. Each note was composed of harmonics 1–10 of the fundamental presented in equal amplitude, with no gap in-between notes. Phase discontinuities between notes were avoided by ensuring that the starting phase of the next note was equal to the ending phase of the previous note.
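
As a rough illustration of the note-synthesis procedure just described (equal-amplitude harmonics 1 through 10, with phase continuity across note boundaries), a sketch along the following lines could be used. The sampling rate, the example notes, and the function name are assumptions made for illustration; this is not the lab's actual synthesis code.

    import numpy as np

    def synthesize_melody(notes, sr=44100):
        """notes: list of (fundamental_hz, duration_s) pairs. Returns a 1-D waveform.

        Each note is the sum of harmonics 1-10 of its fundamental, all at equal
        amplitude. The starting phase of each harmonic in a new note is set to the
        ending phase of that harmonic in the previous note, avoiding phase
        discontinuities at note boundaries."""
        segments = []
        phases = np.zeros(10)  # running phase (radians) for harmonics 1-10
        for f0, dur in notes:
            t = np.arange(int(round(dur * sr))) / sr
            note = np.zeros_like(t)
            for h in range(1, 11):
                note += np.sin(2 * np.pi * h * f0 * t + phases[h - 1])
                # advance the running phase by the phase accumulated over this note
                phases[h - 1] = (phases[h - 1] + 2 * np.pi * h * f0 * dur) % (2 * np.pi)
            segments.append(note / 10.0)  # 10 equal-amplitude harmonics, normalized
        return np.concatenate(segments)

    # Hypothetical 3-note fragment (A4, C5, E5) of 0.5-s quarter notes:
    melody = synthesize_melody([(440.0, 0.5), (523.25, 0.5), (659.25, 0.5)])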

The synthetic drum music and the synthetic melodies conditions had scrambled counterparts to probe sensitivity to music structure. This intact > scrambled contrast has been used in some past studies of structure processing in music (e.g. Levitin and Menon 2003) and is conceptually parallel to the sentences > word-list contrast in language, which has been often used to probe sensitivity to combinatorial processing (e.g. Fedorenko et al. 2010). The scrambled drum music condition was created by jittering the inter-note-interval (INI). The amount of jitter was sampled from a uniform distribution (from −0.5 to 0.5 beats). The scrambled INIs were truncated to be no smaller than 5% of the distribution of INIs from the intact drum track. The total distribution of INIs was then scaled up or down to ensure that the total duration remained unchanged. The scrambled melodies condition was created by scrambling both pitch and rhythm information. Pitch information was scrambled by randomly reordering the sequence of pitches and then adding jitter to disrupt the key. The amount of jitter for each note was sampled from a uniform distribution centered on the note’s pitch after shuffling (from −3 to +3 semitones). The duration of each note was also jittered (from −0.2 to 0.2 beats). To ensure the total duration was unaffected by jitter, N/2 positive jitter values were sampled, where N is the number of notes, and then a negative jitter was added with the same magnitude for each of the positive samples, such that the sum of all jitters equaled 0. To ensure the duration of each note remained positive, the smallest jitters were added to the notes with the smallest durations. Specifically, the note durations and sampled jitters were sorted by their magnitude, summed, and then the jittered durations were randomly reordered.
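
The melody-scrambling steps described above (random reordering of pitches with up to ±3 semitones of jitter, plus zero-sum duration jitter paired with note durations by magnitude) can be summarized in a short sketch. The variable names and random-number handling below are illustrative assumptions, not the original stimulus-generation code.

    import numpy as np

    rng = np.random.default_rng(0)

    def scramble_melody(pitches, durations):
        """pitches: MIDI note numbers; durations: note durations in beats."""
        # 1) Scramble pitch: randomly reorder the pitches, then jitter each one by a
        #    value from a uniform distribution over [-3, +3] semitones to disrupt the key.
        scrambled_pitch = rng.permutation(np.asarray(pitches, dtype=float))
        scrambled_pitch += rng.uniform(-3, 3, size=len(scrambled_pitch))

        # 2) Jitter durations with values from [-0.2, +0.2] beats that sum to zero, so
        #    the total duration is unchanged: sample N/2 positive jitters and mirror
        #    each with a negative jitter of the same magnitude.
        n = len(durations)
        half = rng.uniform(0, 0.2, size=n // 2)
        jitters = np.concatenate([half, -half, [0.0] * (n % 2)])

        # 3) Keep every note duration positive by adding the smallest-magnitude jitters
        #    to the shortest notes: sort both by magnitude, add, then randomly reorder.
        order = np.argsort(np.asarray(durations, dtype=float))
        jitters_sorted = jitters[np.argsort(np.abs(jitters))]
        new_durations = np.array(durations, dtype=float)
        new_durations[order] = new_durations[order] + jitters_sorted
        rng.shuffle(new_durations)
        return scrambled_pitch, new_durations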

To allow for a direct comparison between music and linguistic conditions within the same experiment, we included auditory sentences and auditory nonword sequences. The sentence condition consisted of 24 lab-constructed stimuli (half recorded by a male, and half by a female). Each stimulus consisted of a short story (each 3 sentences long) describing common, everyday events. Any given participant heard 12 of the stimuli (6 male, 6 female). The nonword sequence condition consisted of 12 stimuli (recorded by a male).

We also included 2 other linguistic conditions: songs and spoken lyrics. These conditions were included to test whether the addition of a melodic contour to speech (in songs) would increase the responses of the language regions. Such a pattern might be expected of a brain region that responds to both linguistic content and music structure. The songs and the lyrics conditions each consisted of 24 stimuli. We selected songs with a tune that was easy to sing without accompaniment. These materials were recorded by 4 male singers: each recorded between 2 and 11 song-lyrics pairs. The singers were actively performing musicians (e.g. in a cappella groups) but were not professionals. Any given participant heard either the song or the lyrics version of an item for 12 stimuli in each condition.

Finally, to assess the specificity of potential responses to music, we included 3 nonmusic conditions: animal sounds and two kinds of environmental sounds (pitched and unpitched), which all share some low-level acoustic properties with music (see SI-5). The animal sounds condition and the environmental sounds conditions each consisted of 12 stimuli taken from in-lab collections. If individual recordings were shorter than 9 s, then several recordings of the same type of sound were concatenated together (100 ms gap in between). We included the pitch manipulation in order to test for general responsiveness to pitch—a key component of music—in the language regions.

(The remaining 5 conditions were not directly relevant to the current study or redundant with other conditions for our research questions and therefore not included in the analyses. These included 3 distorted speech conditions—lowpass-filtered speech, speech with a flattened pitch contour, and lowpass-filtered speech with a flattened pitch contour—and 2 additional low-level controls for the synthetic melody stimuli. The speech conditions were included to probe sensitivity to linguistic prosody for an unrelated study. The additional synthetic music control conditions were included to allow for a more rigorous interpretation of the intact > scrambled synthetic melodies effect had we observed such an effect. For completeness, on the OSF page, https://osf.io/68y7c/, we provide a data table that includes responses to these 5 conditions.)

For each participant, stimuli were randomly divided into 6 sets (corresponding to runs) with each set containing 2 stimuli from each condition. The order of the conditions for each run was selected from 4 predefined palindromic orders, which were constructed so that conditions targeting similar mental processes (e.g. orchestral music and single-instrument music) were separated by other conditions (e.g. speech or animal sounds). Each run contained 3 10-s fixation periods: at the beginning, in the middle, and at the end. Otherwise, the stimuli were separated by 3-s fixation periods, for a total run duration of 456 s (7 min 36 s). All but 2 of the 18 participants completed all 6 runs (and thus got a total of 12 experimental events per condition); the remaining 2 completed 4 runs (and thus got 8 events per condition).

Because, as noted above, we have previously established that the language localizer is robust to presentation modality, we used the visual localizer to define the language regions. However, in SI-2, we show that the critical results are similar when auditory contrasts (sentences > nonwords in experiment 1, or Mandarin sentences > foreign in experiment 4) are instead used to define the language regions.

Experiment 2

Participants listened to well-formed melodies (adapted and expanded from Fedorenko et al. 2009) and melodies with a structural violation in a long-event-related design and judged the well-formedness of the melodies. As discussed in the Introduction, this type of manipulation is commonly used to probe sensitivity to music structure, including in studies examining language-music overlap (e.g. Patel et al. 1998; Koelsch et al. 2000, 2002; Maess et al. 2001; Tillmann et al. 2003; Fedorenko et al. 2009; Slevc et al. 2009; Kunert et al. 2015; Musso et al. 2015). The melodies were between 11 and 14 notes. The well-formed condition consisted of 90 melodies, which were tonal and ended in a tonic note with an authentic cadence in the implied harmony. All melodies were isochronous, consisting of quarter notes except for the final half note. The first 5 notes established a strong sense of key. Each melody was then altered to create a version with a “sour” note: the pitch of one note (from among the last 4 notes in a melody) was altered up or down by 1 or 2 semitones, so as to result in a non-diatonic note while keeping the melodic contour (the up-down pattern) the same. The structural position of the note that underwent this change varied among the tonic, the fifth, and the major third. The full set of 180 melodies was distributed across 2 lists following a Latin Square design. Any given participant heard stimuli from one list.
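
To make the “sour” note manipulation concrete, the sketch below shows one way a note near the end of a melody could be shifted by 1 or 2 semitones so that it becomes non-diatonic while the up-down contour is preserved. Representing the key by a major scale and the specific helper names are assumptions made for illustration; this is not the code used to construct the stimuli.

    import random

    MAJOR_SCALE = {0, 2, 4, 5, 7, 9, 11}  # in-key pitch classes, relative to the tonic

    def add_sour_note(pitches, tonic, note_index):
        """pitches: MIDI note numbers; tonic: MIDI number (or pitch class) of the tonic.
        Shifts the note at note_index by 1 or 2 semitones so that it becomes
        non-diatonic while the melodic contour (the up-down pattern) is unchanged."""
        original = list(pitches)
        for shift in random.sample([1, -1, 2, -2], 4):
            candidate = list(original)
            candidate[note_index] += shift
            non_diatonic = (candidate[note_index] - tonic) % 12 not in MAJOR_SCALE
            same_contour = all(
                (a < b) == (c < d) and (a > b) == (c > d)
                for (a, b), (c, d) in zip(zip(original, original[1:]),
                                          zip(candidate, candidate[1:]))
            )
            if non_diatonic and same_contour:
                return candidate
        return None  # no valid 1-2 semitone alteration at this position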

For each participant, stimuli were randomly divided into 2 sets (corresponding to runs) with each set containing 45 melodies (22 or 23 per condition). The order of the conditions, and the distribution of inter-trial fixation periods, was determined by the optseq2 algorithm (Dale 1999). The order was selected from among 4 predefined orders, with no more than 4 trials of the same condition in a row. In each trial, participants were presented with a melody for 3 s followed by a question, presented visually on the screen, about the well-formedness of the melody (“Is the melody well-formed?”). To respond, participants had to press 1 of 2 buttons on a button box within 2 s. When participants answered, the question was replaced by a blank screen for the remainder of the 2-s window; if no response was made within the 2-s window, the experiment advanced to the next trial. Responses received within 1 s after the end of the previous trial were still recorded to account for possible slow responses. The screen was blank during the presentation of the melodies. Each run contained 151 s of fixation interleaved among the trials, for a total run duration of 376 s (6 min 16 s). Fourteen of the 20 participants completed both runs, 4 participants completed 1 run, and the 2 remaining participants completed 2 runs but we only included their first run because, due to experimenter error, the second run came from a different experimental list and thus included some of the melodies from the first run in the other condition (the data pattern was qualitatively and quantitatively the same if both runs were included for these participants). Finally, because of a script error, participants only heard the first 12 notes of each melody during the 3 s of stimulus presentation. Therefore, we only analyzed the 80 of the 90 pairs (160 of the 180 total melodies) where the contrastive note appeared within the first 12 notes.

Experiment 4

Participants passively listened to single-instrument music, environmental sounds, sentences in their native language (Mandarin), and sentences in an unfamiliar foreign language (Russian) in a blocked design. All stimuli were 5–5.95 s in length. The conditions were selected to probe responses to music, and to compare responses to music stimuli vs. other auditory stimuli. The critical music condition consisted of 60 stimuli selected from classical pieces by J.S. Bach played on cello, flute, or violin (n = 15 each) and jazz music played on saxophone (n = 15). The environmental sounds condition consisted of 60 stimuli selected from in-lab collections and included both pitched and unpitched stimuli. The foreign language condition consisted of 60 stimuli selected from Russian audiobooks (short stories by Paustovsky and “Fathers and Sons” by Turgenev). The foreign language condition was included because creating a “nonwords” condition (the baseline condition we typically use for defining the language regions; Fedorenko et al. 2010) is challenging in Mandarin given that most words are monosyllabic, thus most syllables carry some meaning. As a result, sequences of syllables are more akin to lists of words. Therefore, we included the unfamiliar foreign language condition, which also works well as a baseline for language processing (Malik-Moraleda, Ayyash et al. 2022). The Mandarin sentence condition consisted of 120 lab-constructed sentences, each recorded by a male and a female native speaker. (The experiment also included 5 conditions that were not relevant to the current study and therefore not included in the analyses. These included 3 conditions probing responses to the participants’ second language (English) and 2 control conditions for Mandarin sentences. For completeness, on the OSF page, https://osf.io/68y7c/, we provide a data table that includes responses to these 5 conditions.)

Stimuli were grouped into blocks with each block consisting of 3 stimuli and lasting 18 s (stimuli were padded with silence to make each trial exactly 6-s long). For each participant, blocks were divided into 10 sets (corresponding to runs), with each set containing 2 blocks from each condition. The order of the conditions for each run was selected from 8 predefined palindromic orders. Each run contained 3 14-s fixation periods: at the beginning, in the middle, and at the end, for a total run duration of 366 s (6 min 6 s). Five participants completed 8 of the 10 runs (and thus got 16 blocks per condition); the remaining 13 completed 6 runs (and thus got 12 blocks per condition). (We had created enough materials for 10 runs, but based on observing robust effects for several key contrasts in the first few participants who completed 6–8 runs, we administered 6–8 runs to the remaining participants.)

Because we have previously found that an English localizer works well in native speakers of diverse languages, including Mandarin, as long as they are proficient in English (Malik-Moraleda, Ayyash et al. 2022), we used the same localizer in experiment 4 as the one used in experiments 1 and 2, for consistency. However, in SI-2 (SI-Fig. 2C and SI-Table 2c), we show that the critical results are similar when the Mandarin sentences > foreign contrast is instead used to define the language regions.

Experiment 3 (behavioral)

Language assessments

Participants with aphasia were assessed for the integrity of lexical processing using word-to-picture matching tasks in both spoken and written modalities (ADA Spoken and Written Word-Picture Matching; Franklin et al. 1992). Productive vocabulary was assessed through picture naming. In the spoken modality, the Boston Naming Test was employed (Kaplan et al. 2001), and in writing, the PALPA Written Picture Naming subtest (Kay et al. 1992). Sentence processing was evaluated in both spoken and written modalities through comprehension (sentence-to-picture matching) of reversible sentences in active and passive voice. In a reversible sentence, the heads of both noun phrases are plausible agents, and therefore, word order, function words, and functional morphology are the only cues to who is doing what to whom. Participants also completed spoken and written grammaticality judgment tasks, where they made a yes/no decision as to the grammaticality of a word string. The task employed a subset of sentences from Linebarger et al. (1983).

All 3 participants exhibited severe language impairments that disrupted both comprehension and production (Table 2). For lexical-semantic tasks, all 3 participants displayed residual comprehension ability for high imageability/picturable vocabulary, although more difficulty was evident on the synonym matching test, which included abstract words. They were all severely anomic in speech and writing. Sentence production was severely impaired with output limited to single words, social speech (expressions, like “How are you?”), and other formulaic expressions (e.g. “and so forth”). Critically, all 3 performed at or close to chance level on spoken and written comprehension of reversible sentences and grammaticality judgments; each patient’s scores were lower than all of the healthy controls (Table 2 and Fig. 2C).

Table 2.

Results of language assessments for participants with aphasia and healthy controls. For each test, we show the number of correctly answered questions out of the total number of questions.

Assessment | SA | PR | PP | Controls
Lexical-semantic assessments
ADA Spoken Word-Picture Matching (chance = 16.5) | 60/66 | 61/66 | 64/66 | N/A
ADA Written Word-Picture Matching (chance = 16.5) | 62/66 | 66/66 | 58/66 | N/A
ADA spoken synonym matching (chance = 80) | 123/160 | 121/160 | 135/160 | N/A
ADA written synonym matching (chance = 80) | 121/160 | 145/160 | 143/160 | N/A
Boston Naming Test (NB: accepting both spoken and written responses) | 4/60 | 4/60 | 11/60 | N/A
PALPA 54 Written Picture Naming | 24/60 | 2/60 | 1/60 | N/A
Syntactic assessments
Comprehension of spoken reversible sentences (chance = 40) | 49/80 | 38/80 | 52/80 | Mean = 79.5/80; SD = 1.03; Min = 74/80; Max = 80/80; n = 48
Comprehension of written reversible sentences (chance = 40) | 42/80 | 49/80 | 51/80 | N/A
Spoken grammaticality judgments (chance = 24) | 33/48 | 34/48 | 35/48 | Mean = 45.5/48; SD = 2.52; Min = 36/48; Max = 48/48; n = 49
Written grammaticality judgments (chance = 24) | 29/48 | 24/48 | 29/48 | N/A
Critical music task

Participants judged the well-formedness of the melodies from experiment 2. Judgments were intended to reflect the detection of the key violation in the sour versions of the melodies. The full set of 180 melodies was distributed across 2 lists following a Latin Square design. All participants heard all 180 melodies. The control participants heard the melodies from one list, followed by the melodies from the other list, with the order of lists counter-balanced across participants. For the participants with aphasia, each list was further divided in half, and each participant was tested across 4 sessions, with 45 melodies per session, to minimize fatigue.

Montreal Battery for the Evaluation of Amusia

To obtain another measure of music competence/sensitivity to music structure, we administered the MBEA (Peretz et al. 2003). The battery consists of 6 tasks that assess musical processing components described by Peretz and Coltheart (2003): 3 target melodic processing, 2 target rhythmic processing, and 1 assesses memory for melodies. Each task consists of 30 experimental trials (and uses the same set of 30 base melodies) and is preceded by practice examples. Some of the tasks additionally include a catch trial, as described below. For the purposes of the current investigation, the critical task is the “Scale” task. Participants are presented with pairs of melodies that they have to judge as identical or not. On half of the trials, one of the melodies is altered by modifying the pitch of one of the tones to be out of scale. Like our critical music task, this task aims to test participants’ ability to represent and use tonal structure in Western music, except that instead of making judgments on each individual melody, participants compare 2 melodies on each trial. This task thus serves as a conceptual replication (Schmidt 2009). One trial contains stimuli designed to be easy, intended as a catch trial to ensure that participants are paying attention. In this trial, the comparison melody has all its pitches set at random. This trial is excluded when computing the scores.

Control participants performed just the Scale task. Participants with aphasia performed all 6 tasks, distributed across 3 testing sessions to minimize fatigue.

fMRI data acquisition, preprocessing, and first-level modeling (for experiments 1, 2, and 4)

 

Data acquisition

Whole-brain structural and functional data were collected on a whole-body 3 Tesla Siemens Trio scanner with a 32-channel head coil at the Athinoula A. Martinos Imaging Center at the McGovern Institute for Brain Research at MIT. T1-weighted structural images were collected in 176 axial slices with 1 mm isotropic voxels (repetition time (TR) = 2,530 ms; echo time (TE) = 3.48 ms). Functional, blood oxygenation level-dependent (BOLD) data were acquired using an EPI sequence with a 90° flip angle and using GRAPPA with an acceleration factor of 2; the following parameters were used: 31 near-axial slices of 4.4 mm thickness acquired in an interleaved order (with 10% distance factor), with an in-plane resolution of 2.1 × 2.1 mm, FoV in the phase encoding (A >> P) direction 200 mm and matrix size 96 × 96 voxels, TR = 2,000 ms and TE = 30 ms. The first 10 s of each run were excluded to allow for steady-state magnetization (see OSF https://osf.io/68y7c/ for the PDF of the scanning protocols). (Note that we opted to use a regular, continuous scanning sequence in spite of investigating responses to auditory conditions. However, effects of scanner noise are unlikely to be detrimental given that all the stimuli are clearly perceptible, as also confirmed by examining responses in the auditory areas.)

Preprocessing

fMRI data were analyzed using SPM12 (release 7487), CONN EvLab module (release 19b), and other custom MATLAB scripts. Each participant’s functional and structural data were converted from DICOM to NIFTI format. All functional scans were coregistered and resampled using B-spline interpolation to the first scan of the first session (Friston et al. 1995). Potential outlier scans were identified from the resulting subject-motion estimates as well as from BOLD signal indicators using default thresholds in the CONN preprocessing pipeline (5 standard deviations above the mean in global BOLD signal change, or framewise displacement values above 0.9 mm; Nieto-Castañón 2020). Functional and structural data were independently normalized into a common space (the Montreal Neurological Institute (MNI) template; IXI549Space) using the SPM12 unified segmentation and normalization procedure (Ashburner and Friston 2005) with a reference functional image computed as the mean functional data after realignment across all timepoints, omitting outlier scans. The output data were resampled to a common bounding box between MNI-space coordinates (−90, −126, −72) and (90, 90, 108), using 2 mm isotropic voxels and fourth-order spline interpolation for the functional data, and 1 mm isotropic voxels and trilinear interpolation for the structural data. Last, the functional data were spatially smoothed by convolution with a 4 mm FWHM Gaussian kernel.

First-level modeling

For both the language localizer task and the critical experiments, effects were estimated using a General Linear Model (GLM) in which each experimental condition was modeled with a boxcar function convolved with the canonical hemodynamic response function (HRF; fixation was modeled implicitly, such that all timepoints that did not correspond to one of the conditions were assumed to correspond to a fixation period). Temporal autocorrelations in the BOLD signal timeseries were accounted for by a combination of high-pass filtering with a 128-s cutoff, and whitening using an AR(0.2) model (first-order autoregressive model linearized around the coefficient a = 0.2) to approximate the observed covariance of the functional data in the context of Restricted Maximum Likelihood estimation. In addition to experimental condition effects, the GLM design included first-order temporal derivatives for each condition (included to model variability in the HRF delays), as well as nuisance regressors to control for the effect of slow linear drifts, subject-motion parameters, and potential outlier scans on the BOLD signal.
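
For readers unfamiliar with this type of modeling, the core of the condition regressors (a boxcar over each event convolved with a canonical HRF and sampled once per TR) can be sketched as follows. The double-gamma HRF shape and the example onsets are simplifying assumptions; this does not reproduce the exact SPM12 implementation used here.

    import numpy as np
    from scipy.stats import gamma

    def canonical_hrf(tr, duration=32.0):
        # A common double-gamma approximation of the canonical HRF:
        # a response peaking around 5-6 s minus a later undershoot.
        t = np.arange(0, duration, tr)
        return gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0

    def condition_regressor(onsets, durations, n_scans, tr):
        """Boxcar (1 during the condition's events, 0 elsewhere) convolved with the HRF."""
        boxcar = np.zeros(n_scans)
        for onset, dur in zip(onsets, durations):
            boxcar[int(round(onset / tr)):int(round((onset + dur) / tr))] = 1.0
        return np.convolve(boxcar, canonical_hrf(tr))[:n_scans]

    # Hypothetical example: two 9-s trials at 10 s and 40 s in a 456-s run (228 scans, TR = 2 s).
    regressor = condition_regressor(onsets=[10, 40], durations=[9, 9], n_scans=228, tr=2.0)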

Definition of the language functional regions of interest (for experiments 1, 2, and 4)

For each critical experiment, we defined a set of language functional regions of interest (fROIs) using group-constrained, subject-specific localization (Fedorenko et al. 2010). In particular, each individual map for the sentences > nonwords contrast from the language localizer was intersected with a set of 5 binary masks. These masks (Fig. 3; available at OSF: https://osf.io/68y7c/) were derived from a probabilistic activation overlap map for the same contrast in a large independent set of participants (n = 220) using watershed parcellation, as described in Fedorenko et al. (2010) for a smaller set of participants. These masks covered the fronto-temporal language network in the left hemisphere. Within each mask, a participant-specific language fROI was defined as the top 10% of voxels with the highest t-values for the localizer contrast.
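
Within each parcel, the participant-specific fROI is therefore simply the 10% of voxels with the highest localizer t-values. A minimal sketch of that selection step, assuming the t-map and the binary parcel mask have already been loaded as arrays, is shown below; it is meant only to make the procedure explicit.

    import numpy as np

    def define_froi(t_map, parcel_mask, top_fraction=0.10):
        """t_map: voxel-wise t-values for the sentences > nonwords contrast.
        parcel_mask: boolean array of the same shape (one group-level parcel).
        Returns a boolean mask of the participant-specific fROI."""
        t_in_parcel = np.where(parcel_mask, t_map, -np.inf)
        n_select = max(1, int(round(top_fraction * parcel_mask.sum())))
        # threshold at the t-value of the n-th highest voxel within the parcel
        cutoff = np.sort(t_in_parcel[parcel_mask])[-n_select]
        return (t_in_parcel >= cutoff) & parcel_mask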

Fig. 3.

Responses of the language fROIs (pooling across the network—top, and for each fROI individually—bottom) to the language localizer conditions (in gray), to the 4 auditory conditions containing speech in experiment 1 (red shades), to the 5 music conditions in experiments 1 and 2 (blue shades), and to the 3 nonlinguistic/nonmusic auditory conditions (green shades) in experiment 1. Here and elsewhere, the error bars represent standard errors of the mean by participants, and the dots—individual participants. For the language localizer results, we include here all participants in experiments 1 and 2. The responses to the music conditions cluster around the fixation baseline, are much lower than the responses to sentences, and are not higher than the responses to nonmusic sounds.

Validation of the language fROIs

To ensure that the language fROIs behave as expected (i.e. show a reliably greater response to the sentences condition compared with the nonwords condition), we used an across-runs cross-validation procedure (e.g. Nieto-Castañón and Fedorenko 2012). In this analysis, the first run of the localizer was used to define the fROIs, and the second run to estimate the responses (in percent BOLD signal change, PSC) to the localizer conditions, ensuring independence (e.g. Kriegeskorte et al. 2009); then the second run was used to define the fROIs, and the first run to estimate the responses; finally, the extracted magnitudes were averaged across the 2 runs to derive a single response magnitude for each of the localizer conditions. Statistical analyses were performed on these extracted PSC values. Consistent with much previous work (e.g. Fedorenko et al. 2010; Mahowald and Fedorenko 2016; Diachek, Blank, Siegelman et al. 2020), each of the language fROIs showed a robust sentences > nonwords effect (all Ps < 0.001).
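
Schematically, the across-runs cross-validation amounts to the following two-fold procedure; the toy data structures below are hypothetical placeholders for the extracted maps, not the actual analysis pipeline.

    import numpy as np

    def crossvalidated_response(froi_masks, psc_maps, conditions):
        """froi_masks: per-run fROI masks, each defined from that run's localizer data only.
        psc_maps: per-run dicts mapping condition name -> voxel-wise PSC map.
        Fold 1 defines the fROI from run 1 and estimates responses from run 2; fold 2
        does the reverse; the two estimates are then averaged for each condition."""
        estimates = {c: [] for c in conditions}
        for define_run, estimate_run in [(0, 1), (1, 0)]:
            froi = froi_masks[define_run]
            for c in conditions:
                estimates[c].append(float(psc_maps[estimate_run][c][froi].mean()))
        return {c: float(np.mean(vals)) for c, vals in estimates.items()}

    # Toy example with 100 "voxels" and 2 runs:
    rng = np.random.default_rng(1)
    masks = [rng.random(100) > 0.9, rng.random(100) > 0.9]
    psc = [{"sentences": rng.random(100), "nonwords": rng.random(100)} for _ in range(2)]
    print(crossvalidated_response(masks, psc, ["sentences", "nonwords"]))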

Statistical analyses for the fMRI experiments

All analyses were performed with linear mixed-effects models using the “lme4” package in R with P-value approximation performed by the “lmerTest” package (Bates et al. 2015; Kuznetsova et al. 2017). Effect size (Cohen’s d) was calculated using the method from Westfall et al. (2014) and Brysbaert and Stevens (2018).

Sanity check analyses and results

To estimate the responses in the language fROIs to the conditions of the critical experiments (here and in the critical analyses below), the data from all the runs of the language localizer were used to define the fROIs, and the responses to each condition were then estimated in these regions. Statistical analyses were then performed on these extracted PSC values. (For experiments 1 and 4, we repeated the analyses using alternative language localizer contrasts to define the language fROIs (auditory sentences > nonwords in experiment 1, and Mandarin sentences > foreign in experiment 4), which yielded quantitatively and qualitatively similar responses (see SI-2).)

We conducted 2 sets of sanity check analyses. First, to ensure that auditory conditions that contain meaningful linguistic content elicit strong responses in the language regions relative to perceptually similar conditions with no discernible linguistic content, we compared the auditory sentences condition with the auditory nonwords condition (experiment 1) or with the foreign language condition (experiment 4). Indeed, as expected, the auditory sentence condition elicited a stronger response than the auditory nonwords condition (experiment 1) or the foreign language condition (experiment 4). These effects were robust at the network level (Ps < 0.001; SI-Table 1a). Furthermore, the sentences > nonwords effect was significant in all but one language fROI in experiment 1, and the sentences > foreign effect was significant in all language fROIs in experiment 4 (Ps < 0.05; SI-Table 1a).

And second, to ensure that the music conditions elicit strong responses in auditory cortex, we extracted the responses from a bilateral anatomically defined auditory cortical region (area Te1.2 from the Morosan et al. 2001 cytoarchitectonic probabilistic atlas) to the 6 critical music conditions: orchestral music, single instrument music, synthetic drum music, and synthetic melodies in experiment 1; well-formed melodies in experiment 2; and the music condition in experiment 4. Statistical analyses, comparing each condition to the fixation baseline, were performed on these extracted PSC values. As expected, all music conditions elicited strong responses in this primary auditory area bilaterally (all Ps < 0.001; SI-Table 1b and SI-Fig. 1).

Critical analyses

To characterize the responses in the language network to music perception, we asked 3 questions. First, we asked whether music conditions elicit strong responses in the language regions. Second, we investigated whether the language network is sensitive to structure in music, as would be evidenced by stronger responses to intact than scrambled music, and stronger responses to melodies with structural violations compared with the no-violation control condition. And third, we asked whether music conditions elicit strong responses in the language regions of individuals with high sensitivity to linguistic pitch—native speakers of a tonal language (Mandarin).

For each contrast (the contrasts relevant to the 3 research questions are detailed below), we used two types of linear mixed-effect regression models:

  • (i) the language network model, which examined the language network as a whole; and

  • (ii) the individual language fROI models, which examined each language fROI separately.

As alluded to in the Introduction, treating the language network as an integrated system is reasonable given that the regions of this network (i) show similar functional profiles, both with respect to selectivity for language over nonlinguistic processes (e.g. Fedorenko et al. 2011; Pritchett et al. 2018; Jouravlev et al. 2019; Ivanova et al. 2020, 2021) and with respect to their role in lexico-semantic and syntactic processing (e.g. Fedorenko et al. 2012b, 2020; Blank et al. 2016); and (ii) exhibit strong inter-region correlations in both their activity during naturalistic cognition paradigms (e.g. Blank et al. 2014; Braga et al. 2020; Paunov et al. 2019; Malik-Moraleda, Ayyash et al. 2022) and key functional markers, like the strength or extent of activation in response to language stimuli (e.g. Mahowald and Fedorenko 2016; Mineroff, Blank et al. 2018). However, to allow for the possibility that language regions differ in their response to music and to examine the region on which most claims about language-music overlap have focused (the region that falls within “Broca’s area”), we supplement the network-wise analyses with the analyses of the 5 language fROIs separately.

For each network-wise analysis, we fit a linear mixed-effect regression model predicting the level of BOLD response in the language fROIs in the contrasted conditions. The model included a fixed effect for condition (the relevant contrasts are detailed below for each analysis) and random intercepts for fROIs and participants. Here and elsewhere, the P-value was estimated by applying Satterthwaite's method-of-moments approximation to obtain the degrees of freedom (Giesbrecht and Burns 1985; Fai and Cornelius 1996; as described in Kuznetsova et al. 2017). For the comparison against the fixation baseline, the random intercept for participants was removed because it is no longer applicable.

Effect size ~ condition + (1 | fROI) + (1 | Participant)

For each fROI-wise analysis, we fit a linear mixed-effect regression model predicting the level of BOLD response in each of the 5 language fROIs in the contrasted conditions. The model included a fixed effect for condition and random intercepts for participants. For each analysis, the results were FDR-corrected for the 5 fROIs. For the comparison against the fixation baseline, the random intercept for participants was removed because it is no longer applicable.

Effect size ~ condition + (1 | Participant)
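
As a concrete illustration of the two model types, the R sketch below fits the network-wise model (random intercepts for fROI and participant) and the fROI-wise models (random intercept for participant only), reads off the Satterthwaite-approximated P-values provided by lmerTest, and FDR-corrects the fROI-wise P-values across the 5 fROIs. The toy data and variable names are illustrative assumptions, not the authors’ code.

  # Sketch of the network-wise and fROI-wise models described above (toy data).
  library(lmerTest)  # lmer() from lmerTest adds Satterthwaite df and P-values to summary()

  set.seed(2)
  dat <- expand.grid(Participant = factor(1:20),
                     fROI = c("LIFGorb", "LIFG", "LMFG", "LAntTemp", "LPostTemp"),
                     condition = c("intact", "scrambled"))
  dat$EffectSize <- rnorm(nrow(dat),
                          mean = ifelse(dat$condition == "intact", 0.1, 0.0), sd = 1)

  # (i) Network-wise model: one model over all fROIs.
  m_network <- lmer(EffectSize ~ condition + (1 | fROI) + (1 | Participant), data = dat)
  summary(m_network)$coefficients  # includes Satterthwaite df and Pr(>|t|)

  # (ii) fROI-wise models: one model per fROI, then FDR correction across the 5 fROIs.
  froi_p <- sapply(split(dat, dat$fROI), function(d) {
    m <- lmer(EffectSize ~ condition + (1 | Participant), data = d)
    summary(m)$coefficients["conditionscrambled", "Pr(>|t|)"]
  })
  p.adjust(froi_p, method = "fdr")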

Results

Does music elicit a response in the language network?

As discussed in the Introduction, a brain region that supports (some aspect of) music processing, including structure processing, should show a strong response to music stimuli. To test whether language regions respond to music, we performed 4 comparisons using data from experiments 1 and 2. First, we compared the responses to each of the music conditions (orchestral music, single instrument music, synthetic drum music, and synthetic melodies in experiment 1; well-formed melodies in experiment 2) against the fixation baseline—the most liberal baseline. Second, we compared the responses to the music conditions against the response to the nonword strings condition—an unstructured and meaningless linguistic stimulus (in experiment 1, we used the auditory nonwords condition, and in experiment 2, we used the visual nonwords condition from the language localizer). Third, in experiment 1, we additionally compared the responses to the music conditions against the responses to nonlinguistic, nonmusic stimuli (animal and environmental sounds). A brain region that supports music processing should show a strong positive response relative to the fixation baseline and the nonwords condition (our baseline for the language regions); furthermore, if the response is selective, it should be stronger than the response elicited by nonmusic auditory stimuli. Finally, in experiment 1, we also directly compared the responses to songs vs. lyrics. A brain region that responds to music should respond more strongly to songs given that they contain a melodic contour in addition to the linguistic content.

None of the music conditions elicited a strong response in the language network (Fig. 3 and Table 3). The responses to music (i) fell at or below the fixation baseline (except for the well-formed melodies condition in experiment 2, which elicited a small positive response in some regions), (ii) were lower than the response elicited by auditory nonwords (except for the LMFG language fROI, where the responses to music and nonwords were similarly low), and (iii) did not significantly differ from the responses elicited by nonlinguistic, nonmusic conditions. Finally, the response to songs, which contain both linguistic content and a melodic contour, was not significantly higher than the response elicited by the linguistic content alone (lyrics); in fact, at the network level, the response to songs was reliably lower than to lyrics.

Table 3.

Statistical results (2-sided) for the contrasts between music conditions and 3 kinds of baselines (fixation, nonwords, and nonlinguistic nonmusic auditory conditions—animal sounds and environmental sounds) in experiments 1 and 2, and for the contrast between songs and lyrics in experiment 1. Abbreviations: β, the beta estimate for the effect; SE, standard error of the mean by participants; df, degrees of freedom; d, Cohen’s d (Westfall et al. 2014; Brysbaert and Stevens 2018); t, the t statistic; P, the significance value (for the individual fROIs, these values have been FDR-corrected for the number of fROIs (n = 5)). In light gray, we highlight the results that are not consistent with the role of the language regions in music perception: of the 84 tests performed, none showed an effect predicted by language-music overlap accounts.


Is the language network sensitive to structure in music?

Experiments 1 and 2 (fMRI)

Because most prior claims about the overlap between language and music concern the processing of structure—given the parallels that can be drawn between the syntactic structure of language and the tonal and rhythmic structure in music (e.g. Lerdahl and Jackendoff 1977, 1983; cf. Jackendoff 2009)—we used 3 contrasts to test whether language regions are sensitive to music structure. First and second, in experiment 1, we compared the responses to synthetic melodies vs. their scrambled counterparts, and to synthetic drum music vs. the scrambled drum music condition. The former targets both tonal and rhythmic structure, and the latter selectively targets rhythmic structure. The reason to examine rhythmic structure is that some patient studies have argued that pitch contour processing relies on the right hemisphere, and rhythm processing draws on the left hemisphere (e.g. Zatorre 1984; Peretz 1990; Alcock et al. 2000; cf. Boebinger et al. 2021 for fMRI evidence of bilateral responses in high-level auditory areas to both tonal and rhythmic structure processing and for lack of spatial segregation between the two), so although most prior work examining the language-music relationship has focused on tonal structure, rhythmic structure may a priori be more likely to overlap with linguistic syntactic structure given their alleged co-lateralization based on the patient literature. And third, in experiment 2, we compared the responses to well-formed melodies vs. melodies with a sour note. A brain region that responds to structure in music should respond more strongly to intact than scrambled music (similar to how language regions respond more strongly to sentences than lists of words; e.g. Fedorenko et al. 2010; Diachek, Blank, Siegelman et al. 2020), and also exhibit sensitivity to structure violations (similar to how language regions respond more strongly to sentences that contain grammatical errors: e.g. Embick et al. 2000; Newman et al. 2001; Kuperberg et al. 2003; Cooke et al. 2006; Friederici et al. 2010; Herrmann et al. 2012; Fedorenko et al. 2020). Note that given the lack of a strong and consistent response to music in the language regions (Fig. 3 and Table 3), the answer to this narrower question is somewhat of a foregone conclusion: even if one or more of the language regions showed a reliable effect in these music-structure-probing contrasts, such effects would be difficult to interpret as reflecting music structure processing given that structured music stimuli elicit a response approximately at the level of the fixation baseline in the language areas. Nevertheless, we report the results for these 3 contrasts for completeness, and because most prior studies have focused on such contrasts.

The language regions did not show consistent sensitivity to structural manipulations in music (Fig. 4 and Table 4). In experiment 1, the responses to synthetic melodies did not significantly differ from the responses to the scrambled counterparts, and the responses to synthetic drum music did not significantly differ from (or were weaker than) the responses to scrambled drum music. In experiment 2, at the network level, we observed a small and weakly significant (P < 0.05) effect of sour-note > well-formed melodies. This effect was not significant in any of the 5 individual fROIs (prior to the FDR correction, the LMFG fROI showed a small significant effect).

Fig. 4.

Responses of the language fROIs (pooling across the network—top, and for each fROI individually—bottom) to the language localizer conditions (in gray), and to the 3 sets of conditions that target structure in music (in blue). The error bars represent standard error of the mean by participants. For the language localizer results, we include here participants in experiments 1 and 2. The responses to the music conditions cluster around the fixation baseline, and are much lower than the response to sentences. One of the 3 critical contrasts (sour-note > well-formed melodies) elicits a small and weakly reliable effect at the network level, but it is not individually significant in any of the 5 fROIs.

Table 4.

Statistical results (2-sided) for the synthetic drum music vs. scrambled drum music, synthetic melodies vs. scrambled synthetic melodies, and sour-note vs. well-formed melodies contrasts in experiments 1 and 2. Abbreviations: β, the beta estimate for the effect; SE, standard error of the mean by participants; df, degrees of freedom; d, Cohen’s d (Westfall et al. 2014; Brysbaert and Stevens 2018); t, the t statistic; P, the significance value (for the individual fROIs, these values have been FDR-corrected for the number of fROIs (n = 5)). In light gray, we highlight the results that are not consistent with the role of the language regions in the processing of music structure: of the 18 tests performed, 1 showed an effect predicted by language-music overlap accounts: a small and statistically weak response to one of the 3 structure-targeting contrasts (in the presence of an overall very weak response to music relative to fixation; see Fig. 3 and Table 3).

Synthetic drum music > scrambled drum music:
  Language network: β = 0.099, SE = 0.073, df = 157.823, d = 0.140, t = 1.358, P = 0.176
  LIFGorb: β = 0.252, SE = 0.191, df = 18.000, d = 0.288, t = 1.322, P = 1.000
  LIFG: β = 0.027, SE = 0.176, df = 18.000, d = 0.034, t = 0.156, P = 1.000
  LMFG: β = 0.014, SE = 0.186, df = 18.000, d = 0.018, t = 0.073, P = 1.000
  LAntTemp: β = 0.124, SE = 0.103, df = 18.000, d = 0.247, t = 1.210, P = 1.000
  LPostTemp: β = 0.079, SE = 0.110, df = 18.000, d = 0.165, t = 0.718, P = 1.000

Synthetic melodies > scrambled synthetic melodies:
  Language network: β = −0.124, SE = 0.061, df = 157.720, d = −0.238, t = −2.015, P = 0.046*
  LIFGorb: β = −0.147, SE = 0.130, df = 18.000, d = −0.245, t = −1.133, P = 1.000
  LIFG: β = −0.009, SE = 0.153, df = 18.000, d = −0.017, t = −0.057, P = 1.000
  LMFG: β = −0.143, SE = 0.202, df = 18.000, d = −0.216, t = −0.708, P = 1.000
  LAntTemp: β = −0.199, SE = 0.101, df = 18.000, d = −0.572, t = −1.971, P = 0.320
  LPostTemp: β = −0.121, SE = 0.106, df = 18.000, d = −0.365, t = −1.142, P = 1.000

Sour-note melodies > well-formed melodies:
  Language network: β = 0.145, SE = 0.069, df = 175.884, d = 0.196, t = 2.102, P = 0.037*
  LIFGorb: β = 0.195, SE = 0.098, df = 20.000, d = 0.245, t = 1.985, P = 0.305
  LIFG: β = 0.150, SE = 0.105, df = 20.000, d = 0.180, t = 1.431, P = 0.840
  LMFG: β = 0.212, SE = 0.090, df = 20.000, d = 0.252, t = 2.363, P = 0.140
  LAntTemp: β = 0.065, SE = 0.051, df = 20.000, d = 0.114, t = 1.280, P = 1.000
  LPostTemp: β = 0.104, SE = 0.056, df = 20.000, d = 0.248, t = 1.856, P = 0.390

Experiment 3 (behavioral)

In experiment 3, we further asked whether individuals with severe deficits in processing linguistic syntax also exhibit difficulties in processing music structure. To do so, we assessed participants’ ability to discriminate well-formed (“good”) melodies from melodies with a sour note (“bad”), while controlling for their response bias (how likely they are overall to say that something is well-formed) by computing d’ for each participant (Green and Swets 1966), in addition to proportion correct. We then compared the d’ values of each individual with aphasia to the distribution of d’ values of healthy control participants using a Bayesian test for single case assessment (Crawford and Garthwaite 2007) as implemented in the psycho package in R (Makowski 2018). (Note that for the linguistic syntax tasks, it was not necessary to conduct statistical tests comparing the performance of each individual with aphasia to the control distribution because the performance of each individual with aphasia was lower than 100% of the control participants’ performances.) We similarly compared the proportion correct on the MBEA Scale task of each individual with aphasia to the distribution of accuracies of healthy controls. If linguistic and music syntax draw on the same resources, then individuals with linguistic syntactic impairments should also exhibit deficits on tasks that require the processing of music syntax.
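
To make the signal-detection and single-case logic concrete, the R sketch below computes d’ and the criterion c from hypothetical hit and false-alarm counts (with a standard log-linear correction to avoid infinite z-scores), and then compares a single patient’s d’ to a control sample using the frequentist Crawford-Howell single-case t-test, whose point estimate closely approximates that of the Bayesian test of Crawford and Garthwaite (2007) used here (run via the psycho package, which additionally provides credible intervals). All counts and simulated control values below are illustrative, not the study’s data.

  # Sketch: d' and criterion c from trial counts, plus a single-case comparison.
  dprime_c <- function(hits, misses, fas, crs) {
    # Log-linear correction: add 0.5 to counts and 1 to totals.
    hr <- (hits + 0.5) / (hits + misses + 1)
    fr <- (fas + 0.5) / (fas + crs + 1)
    c(dprime = qnorm(hr) - qnorm(fr),
      criterion = -0.5 * (qnorm(hr) + qnorm(fr)))
  }

  # Hypothetical patient: 90 sour-note ("bad") trials and 90 well-formed ("good") trials.
  patient <- dprime_c(hits = 85, misses = 5, fas = 6, crs = 84)

  # Crawford-Howell single-case t-test: where does the patient fall relative to controls?
  crawford_howell <- function(case, controls) {
    n <- length(controls)
    t_stat <- (case - mean(controls)) / (sd(controls) * sqrt((n + 1) / n))
    p_below <- pt(t_stat, df = n - 1)  # estimated proportion of controls scoring below the case
    c(t = t_stat, prop_controls_below = p_below)
  }

  set.seed(3)
  controls_dprime <- rnorm(45, mean = 2.75, sd = 0.75)  # matches the reported control M and SD
  crawford_howell(patient[["dprime"]], controls_dprime)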

In the critical music task, where participants were asked to judge the well-formedness of musical structure, neurotypical control participants responded correctly, on average, on 87.1% of trials, suggesting that the task was sufficiently difficult to preclude ceiling effects. Patients with severe aphasia showed intact sensitivity to music structure. The 3 patients had accuracies of 89.4% (PR), 94.4% (SA), and 97.8% (PP), falling on the higher end of the controls’ performance range (Fig. 5 and Table 5). Crucially, none of the 3 aphasic participants’ d’ scores fell below the control participants’ mean d’ score (M = 2.75, SD = 0.75). In fact, the patients’ d’ scores were high: SA’s d’ was 3.51, higher than 83.91% (95% credible interval (CI) [75.20, 92.03]) of the control population, PR’s d’ was 3.09, higher than 67.26% (95% CI [56.60, 78.03]) of the control population, and PP’s d’ was 3.99, higher than 94.55% (95% CI [89.40, 98.57]) of the control population. None of the 3 aphasic participants’ bias/criterion c scores (Green and Swets 1966) differed reliably from the control participants’ c scores (M = −0.40, SD = 0.40). SA’s c was −0.53, lower than 62.34% (95% CI [50.40, 71.67]) of the control population, PR’s c was −0.74, lower than 79.48% (95% CI [69.58, 88.44]) of the control population, and PP’s c was −0.29, higher than 60.88% (95% CI [50.08, 70.04]) of the control population. In the Scale task from the Montreal Battery of Evaluation of Amusia (MBEA), the control participants’ performance showed a similar distribution to that reported in Peretz et al. (2003). All participants with aphasia performed within the normal range, with 2 participants making no errors. PR’s and PP’s scores were each higher than 85.24% (95% CI [76.94, 93.06]) of the control population, providing a conceptual replication of the results from the well-formed/sour-note melody discrimination task. SA’s score was higher than 30.57% (95% CI [20.00, 41.50]) of the control population.

Fig. 5.

Performance of the control and aphasic participants on two measures of music syntax processing: the critical music task (left) and the Scale task of the MBEA (right). The densities show the distribution of proportion correct scores in the control participants, and the boxplots show the quartiles of the control population (the whiskers show 1.5× the interquartile range, and points represent outliers). The dots show individual participants (for the aphasic individuals, the initials indicate the specific participant). Dashed gray lines indicate chance performance.

Table 5.

Results for participants with aphasia and control participants on the critical music task and the Scale task of the MBEA (Peretz et al. 2003). For participants with aphasia, we report the results from all 6 MBEA tasks, for completeness.

Critical Music Task: SA 170/180; PR 161/180; PP 176/180; Controls M = 156.5/180, SD = 15.8, Min = 109/180, Max = 177/180, n = 45
Montreal Battery of Evaluation of Amusia:
  Task 1 (Scale; critical for this study): SA 27/30; PR 30/30; PP 30/30; Controls M = 28/30, SD = 1.89, Min = 23/30, Max = 30/30, n = 45
  Task 2 (Interval; “Same Contour” on MBEA CD): SA 26/30; PR 22/30; PP 18/30
  Task 3 (Contour; “Different Contour” on MBEA CD): SA 22/30; PR 23/30; PP 18/30
  Task 4 (Rhythm; “Rhythmic Contour” on MBEA CD): SA 25/30; PR 25/30; PP 22/30
  Task 5 (Meter; “Metric” on MBEA CD): SA 28/30; PR 22/30; PP 24/30
  Task 6 (Incidental Memory): SA 28/30; PR 28/30; PP 22/30

Does music elicit a response in the language network of native speakers of a tonal language?

The above analyses focus on the language network’s responses to music stimuli and its sensitivity to music structure in native English speakers. However, some have argued that responses to music may differ in speakers of languages that use pitch to make lexical or grammatical distinctions (e.g. Deutsch et al. 2006, 2009; Bidelman et al. 2011; Creel et al. 2018; Ngo et al. 2016; Liu et al. 2021). In experiment 4, we therefore tested whether the language regions of Mandarin native speakers respond to music. Similar to experiment 1, we compared the response to the music condition against (i) the fixation baseline, (ii) the foreign language condition, and (iii) a nonlinguistic, nonmusic condition (environmental sounds). A brain region that supports music processing should respond more strongly to music than to the fixation baseline and the foreign language condition; furthermore, if the response is selective, it should be stronger than the response elicited by environmental sounds.

Results from Mandarin native speakers replicated the results from experiment 1: the music condition did not elicit a strong response in the language network (Fig. 6 and Table 6). Although the response to music was above the fixation baseline at the network level and in some fROIs, the response did not differ from (or was lower than) the responses elicited by an unfamiliar foreign language (Russian) and environmental sounds.

Fig. 6.

Responses of the language fROIs (pooling across the network—top, and for each fROI individually—bottom) to the language localizer conditions (in gray), to the two auditory conditions containing speech (red shades), to the music condition (blue), and to the nonlinguistic/nonmusic auditory condition (green) in experiment 4. The error bars represent standard error of the mean by participants. The response to the music condition is much lower than the response to sentences, and is not higher than the response to foreign language and environmental sounds.

Table 6.

Statistical results (2-sided) for the contrasts between the music condition and fixation, foreign language, and environmental sounds in experiment 4. Abbreviations: β, the beta estimate for the effect; SE, standard error of the mean by participants; df, degrees of freedom; d, Cohen’s d (Westfall et al. 2014; Brysbaert and Stevens 2018); t, the t statistic; P, the significance value (for the individual fROIs, these values have been FDR-corrected for the number of fROIs (n = 5)). In light gray, we highlight the results that are not consistent with the role of the language regions in music perception: of the 18 tests performed, 3 showed an effect predicted by language-music overlap accounts: a small positive response to the music condition relative to the weakest baseline (fixation) at the network level and in 2 fROIs individually; this response was still approximately 2 times lower than the response to the unfamiliar foreign language condition and was numerically lower than the response to the environmental sounds condition.

Music > fixation:
  Language network: β = 0.454, SE = 0.177, df = 17.646, d = 0.517, t = 2.565, P = 0.020*
  LIFGorb: β = 0.299, SE = 0.228, df = nan, d = nan, t = 1.308, P = 1.000
  LIFG: β = 0.761, SE = 0.207, df = nan, d = nan, t = 3.683, P = 0.010*
  LMFG: β = 0.480, SE = 0.260, df = nan, d = nan, t = 1.848, P = 0.410
  LAntTemp: β = 0.268, SE = 0.171, df = nan, d = nan, t = 1.568, P = 0.675
  LPostTemp: β = 0.462, SE = 0.156, df = nan, d = nan, t = 2.962, P = 0.045*

Music > foreign:
  Language network: β = −0.359, SE = 0.141, df = 162.000, d = −0.308, t = −2.547, P = 0.012*
  LIFGorb: β = −0.360, SE = 0.416, df = 18.000, d = −0.258, t = −0.865, P = 1.000
  LIFG: β = 0.123, SE = 0.309, df = 18.000, d = 0.124, t = 0.398, P = 1.000
  LMFG: β = −0.219, SE = 0.473, df = 18.000, d = −0.149, t = −0.463, P = 1.000
  LAntTemp: β = −0.703, SE = 0.240, df = 18.000, d = −0.870, t = −2.926, P = 0.045*
  LPostTemp: β = −0.638, SE = 0.254, df = 18.000, d = −0.686, t = −2.511, P = 0.110

Music > environmental sounds:
  Language network: β = −0.141, SE = 0.108, df = 157.749, d = −0.154, t = −1.299, P = 0.196
  LIFGorb: β = −0.249, SE = 0.187, df = 18.000, d = −0.280, t = −1.328, P = 1.000
  LIFG: β = −0.240, SE = 0.193, df = 18.000, d = −0.302, t = −1.248, P = 1.000
  LMFG: β = 0.038, SE = 0.304, df = 18.000, d = 0.030, t = 0.125, P = 1.000
  LAntTemp: β = −0.042, SE = 0.147, df = 18.000, d = −0.065, t = −0.285, P = 1.000
  LPostTemp: β = −0.210, SE = 0.179, df = 18.000, d = −0.310, t = −1.171, P = 1.000

Discussion

We here tackled a much investigated but still debated question: do the brain regions of the language network support the processing of music, especially music structure? Across 3 fMRI experiments, we obtained a clear answer: the brain regions of the language network, which support the processing of linguistic syntax (e.g. Fedorenko et al. 2010, 2020; Pallier et al. 2011; Bautista and Wilson 2016; Blank et al. 2016), do not support music processing (see Table 7 for a summary of the results). We found overall low responses to music (including orchestral pieces, solo pieces played on different instruments, synthetic music, and vocal music) in the language brain regions (Fig. 3; see Sueoka et al. 2022 for complementary evidence from the intersubject correlation approach applied to a rich naturalistic music stimulus), including in speakers of a tonal language (Fig. 6), and no consistent sensitivity to manipulations of music structure (Fig. 4). We further found that the ability to make well-formedness judgments about the tonal structure of music was preserved in patients with severe aphasia who cannot make grammaticality judgments for sentences (Fig. 5), although we acknowledge the possibility that general ability to detect unexpected events may have contributed to performance on the critical music-structure tasks (e.g. Bigand et al. 2014; Collins et al. 2014) and that additional controls would be needed to conclusively determine whether these patients have preserved music-structure processing abilities. Nevertheless, given the brain imaging results (summarized in Table 7), a critical role of the language system in music structure processing is unlikely.

Table 7.

A summary of the results for the tests of the language network’s sensitivity to music in general and to music structure specifically. This pattern of results constitutes strong evidence against the role of the language system—or any of its components—in music perception, including the processing of music structure. With respect to sensitivity to music stimuli: 4 of the 6 conditions failed to elicit a response above the low-level (fixation) baseline anywhere in the language network; 1 condition (in experiment 2) elicited a small above-fixation response, which was not significant at the network level or in any individual fROIs; and 1 condition (in experiment 4) elicited a small above-fixation response (including in 2 individual fROIs) but this response was not higher than that elicited by other auditory conditions like environmental sounds. With respect to sensitivity to music structure: 2 of the 3 manipulations failed to elicit a response anywhere in the language network, and the remaining manipulation elicited a small and weakly significant effect at the network level, which was not reliable in any individual ROI.

Contrast (Experiment 1 / Experiment 2 / Experiment 4)
Basic sensitivity to music stimuli:
  Music > fixation (6 different music conditions tested: 4 in Expt 1, 1 in Expt 2, and 1 in Expt 4): No / No / Yes
  Music > nonwords/unfamiliar foreign language: No / No
  Music > nonlinguistic, nonmusic auditory conditions: No / No
  Songs (melodic contour + linguistic content) > lyrics (linguistic content): No
Sensitivity to manipulations of music structure:
  Intact music > scrambled music (synthetic melodies): No
  Intact music > scrambled music (synthetic drums): No
  Sour-note melodies > well-formed melodies: No (except for the network level)

Our findings align with (i) prior neuropsychological patient evidence of language/music dissociations (e.g. Luria et al. 1965; Brust 1980; Marin 1982; Basso and Capitani 1985; Polk and Kertesz 1993; Peretz et al. 1994, 1997; Piccirilli et al. 2000; Peretz and Coltheart 2003; Slevc et al. 2016; Faroqi-Shah et al. 2020; Chiappetta et al. 2022) and with (ii) prior evidence that music is processed by music-selective areas in the auditory cortex (Norman-Haignere et al. 2015; see also Boebinger et al. 2021; see Peretz et al. 2015, for review and discussion). These music-selective areas are strongly sensitive to the scrambling of music structure in stimuli like those used here in experiment 1 (see also Fedorenko et al. 2012c; Boebinger et al. 2021; see Mehr et al. 2019 for a priori reasons to expect the effects of tonal structure manipulations in music-selective brain regions). (We provide the responses of music-responsive areas to the conditions of experiments 1 and 2 at: https://osf.io/68y7c/.) Our findings stand in sharp contrast, however, to numerous reports arguing for shared structure processing mechanisms in the two domains, including specifically in the inferior frontal cortex, within “Broca’s area” (e.g. Patel et al. 1998; Koelsch et al. 2000, 2002; Maess et al. 2001; Levitin and Menon 2003; see Kunert and Slevc 2015; LaCroix et al. 2015; Vuust et al. 2022 for reviews).

Below, we discuss several issues that are relevant for interpreting the current results and/or that these results inform, and outline some limitations of scope of our study.

Theoretical considerations about the language-music relationship

Why might we a priori think that the language network, or some of its components, may be important for processing music in general, or for processing music structure specifically? Similarities between language and music have long been noted and discussed. For example, as summarized in Jackendoff (2009; see also Patel 2008), both capacities are human-specific, involve the production of sound (though this is not always the case for language: cf. sign languages, or written language in literate societies), and have multiple culture-specific variants. Furthermore, language and music are intertwined in songs, which appear to be a cultural universal (e.g. Brown 1991; Nettl 2015; see Mehr et al. 2019 for empirical support; see Norman-Haignere et al. 2022 for evidence of neural selectivity for songs in the auditory cortex). However, Jackendoff (2009) notes that (i) most of the cognitive capacities and mechanisms that have been argued to be common to language and music are not uniquely shared by language and music, and (ii) language and music differ in several critical ways, and these differences are important to consider alongside potential similarities when theorizing about possible shared representations and computations.

To elaborate on the first point: the cognitive capacity that has perhaps received the most attention in discussions of cognitive and neural mechanisms that may be shared by language and music is the combinatorial capacity of the two domains (e.g. Riemann 1877, as cited in Swain 1995; Lindblom and Sundberg 1969; Fay 1971; Sundberg and Lindblom 1976; Lerdahl and Jackendoff 1977, 1983; Roads and Wieneke 1979; Krumhansl and Keil 1982). In particular, in language, words can be combined into complex hierarchical structures to form novel phrases and sentences, and in music, notes and chords can similarly be combined to form novel melodies. Furthermore, in both domains, the combinatorial process is constrained by a set of conventions. However, this capacity can be observed, in some form, in many other domains, including visual processing, math, social cognition, motor planning, and general reasoning. Similarly, other cognitive capacities that are necessary to process language and music (including a large long-term memory store for previously encountered elements and patterns, a working memory capacity needed to integrate information as it comes in, an ability to form expectations about upcoming elements, and an ability to engage in joint action) are important for information processing in other domains. An observation that some mental capacity is necessary for multiple domains is compatible with at least 2 architectures: one where the relevant capacity is implemented (perhaps in a similar way) in each relevant set of domain-specific circuits, and another where the relevant capacity is implemented in a centralized mechanism that all domains draw on (e.g. Fedorenko and Shain 2021). Those arguing for overlap between language and music processing advocate a version of the latter. Critically, any shared mechanism that language and music would draw on should also support information processing in other domains that require the relevant computation (see Section ‘Overlap in structure processing in language and music outside of the core language network?’ below for arguments against this kind of architecture). (A possible exception, according to Jackendoff (2009), may be the fine-scale vocal motor control that is needed for speech and vocal music production (cf. sign language or instrumental music) but not for other behaviors; however, this kind of ability is implemented outside of the core high-level language system, in the network of brain areas that support articulation (e.g. Basilakos et al. 2015; Guenther 2016).)

More importantly, aside from the similarities that have been noted between language and music, numerous differences characterize the two domains. Most notable are their different functions. Language enables humans to express propositional meanings, and thus to share thoughts with one another. The function of music has long been debated (e.g. Darwin 1871; Pinker 1994; see e.g. McDermott 2008 and Mehr et al. 2020 for summaries of key ideas), but most of the proposed functions have to do with emotional or affective processing, often with a social component (Jackendoff 2009; Savage et al. 2021). (Although some have discussed the notions of “meaning” in music (e.g. Meyer 1961; Raffman 1993; Koelsch et al. 2001; Cross and Tolbert 2009), it is uncontroversial that music cannot be used to express propositional thought; for discussion, see Patel 2008; Jackendoff 2009; Slevc et al. 2009.) If function drives the organization of the brain (and biological systems more generally; e.g. Rueffler et al. 2012) by imposing particular computational demands on each domain (e.g. Mehr et al. 2020), these fundamentally different functions of language and music provide a theoretical reason to expect cognitive and neural separation between them. Moreover, even the components of language and music that appear similar on the surface (e.g. combinatorial processing) differ in deep and important ways (e.g. Patel 2008; Jackendoff 2009; Slevc et al. 2009; Temperley 2022).

Functional selectivity of the language network

The current results add to the growing body of evidence that the left-lateralized fronto-temporal brain network that supports language processing is highly selective for linguistic input (e.g. Fedorenko et al. 2011; Monti et al. 2009, 2012; Deen et al. 2015; Pritchett et al. 2018; Jouravlev et al. 2019; Ivanova et al. 2020, 2021; Benn, Ivanova et al. 2021; Liu et al. 2020; Deen and Freiwald 2021; Paunov et al. 2022; Sueoka et al. 2022; see Fedorenko and Blank 2020 for a review) and not critically needed for many forms of complex cognition (e.g. Lecours and Joanette 1980; Varley and Siegal 2000; Varley et al. 2005; Apperly et al. 2006; Woolgar et al. 2018; Ivanova et al. 2021; see Fedorenko and Varley 2016 for a review). Importantly, this selectivity holds across all regions of the language network, including those that fall within “Broca’s area” in the left inferior frontal gyrus. As discussed in the Introduction, many claims about shared structure processing in language and music have focused specifically on Broca’s area (e.g. Patel 2003; Fadiga et al. 2009; Fitch and Martins 2014). The evidence presented here shows that the language-responsive parts of Broca’s area, which are robustly sensitive to linguistic syntactic manipulations (e.g. Just et al. 1996; Stromswold et al. 1996; Ben-Shachar et al. 2003; Caplan et al. 2008; Peelle et al. 2010; Blank et al. 2016; see e.g. Friederici 2011 and Hagoort and Indefrey 2014 for meta-analyses), do not respond when we listen to music and are not sensitive to structure in music. These results rule out the hypothesis that language and music processing rely on the same mechanism housed in Broca’s area.

It is also worth noting that the very premise of the latter hypothesis—of a special relationship between Broca’s area and the processing of linguistic syntax (e.g. Caramazza and Zurif 1976; Friederici 2018)—has been questioned and overturned. First, syntactic processing does not appear to be carried out focally, but is instead distributed across the entire language network, with all of its regions showing sensitivity to syntactic manipulations (e.g. Fedorenko et al. 2010, 2020; Pallier et al. 2011; Blank et al. 2016; Shain, Blank et al. 2020; Shain et al. 2022), and with damage to different components leading to similar syntactic comprehension deficits (e.g. Caplan et al. 1996; Dick et al. 2001; Wilson and Saygin 2004; Mesulam et al. 2014, 2015). And second, the language-responsive part of Broca’s area, like other parts of the language network, is sensitive to both syntactic processing and word meanings, and even sub-lexical structure (Fedorenko et al. 2010, 2012b, 2020; Regev et al. 2021; Shain et al. 2021). The lack of segregation between syntactic and lexico-semantic processing is in line with the idea of “lexicalized syntax” where the conventions for how words can combine with one another are highly dependent on the particular lexical items (e.g. Goldberg 2002; Jackendoff 2002, 2007; Sag et al. 2003; Levin and Rappaport-Hovav 2005; Bybee 2010; Jackendoff and Audring 2020), and is contra the idea of combinatorial rules that are blind to the content/meaning of the to-be-combined elements (e.g. Chomsky 1965, 1995; Fodor 1983; Pinker and Prince 1988; Pinker 1991, 1999; Pallier et al. 2011).

Overlap in structure processing in language and music outside of the core language network?

We have here focused on the core fronto-temporal language network. Could structure processing in language and music draw on shared resources elsewhere in the brain? The prime candidate is the domain-general executive control, or Multiple Demand (MD), network (e.g. Duncan and Owen 2000; Duncan 2001, 2010; Assem et al. 2020), which supports functions like working memory and inhibitory control. Indeed, according to Patel’s Shared Structural Integration Resource Hypothesis (2003, 2008, 2012), language and music draw on separate representations, stored in distinct cortical areas, but rely on the same working memory store to integrate incoming elements into evolving structures. Relatedly, Slevc et al. (2013; see Asano et al. 2021 for a related proposal) have argued that another executive resource—inhibitory control—may be required for structure processing in both language and music. Although it is certainly possible that some aspects of linguistic and/or musical processing would require domain-general executive resources, based on the available evidence from the domain of language, we would argue that any such engagement does not reflect the engagement of computations related to syntactic structure building. In particular, Blank and Fedorenko (2017) found that activity in the brain regions of the domain-general MD network does not closely “track” linguistic stimuli, as evidenced by low intersubject correlations during the processing of linguistic input (see Paunov et al. 2022 and Sueoka et al. 2022 for replications). Furthermore, Diachek, Blank, Siegelman et al. (2020) showed in a large-scale fMRI investigation that the MD network is not engaged during language processing in the absence of secondary task demands (cf. the core language network, which is relatively insensitive to task demands and responds robustly even during passive listening/reading). And Shain, Blank et al. (2020; see also Shain et al. 2022) have shown that the language network, but not the MD network, is sensitive to linguistic surprisal and working-memory integration costs (see also Wehbe et al. 2021 for evidence that activity in the language, but not the MD, network reflects general incremental processing difficulty).

In tandem, this evidence argues against the role of executive resources in core linguistic computations like those related to lexical access and combinatorial processing, including syntactic parsing and semantic composition (see also Hasson et al. 2015 and Dasgupta and Gershman 2021 for general arguments against the separation between memory and computation in the brain). Thus, although the contribution of executive resources to music processing deserves further investigation (cf. https://osf.io/68y7c/ for evidence of low responses of the MD network to the music conditions in the current study), any overlap within the executive system between linguistic and music processing cannot reflect core linguistic computations, as those seem to be carried out by the language network (see Fedorenko and Shain 2021 for a review). Functionally identifying the MD network in individual participants (e.g. Fedorenko et al. 2013; Shashidhara et al. 2019) is a powerful way to help determine whether observed effects of music manipulations reflect general executive demands (see Saxe et al. 2006, Blank et al. 2017, and Fedorenko 2021 for general discussions of greater interpretability of fMRI results obtained from the functional localization approach). Importantly, given the ubiquitous sensitivity of the MD network to cognitive demands, it is, and will remain, important to rule out task demands, rather than stimulus processing, as the source of overlap between music and language processing in interpreting past studies and designing future ones.

Overlap between music processing and other aspects of speech/language

The current study investigated the role of the language network—which supports “high-level” comprehension and production—in music processing. As a result, the claims we make are restricted to those aspects of language that are supported by this network. These include the processing of word meanings and combinatorial (syntactic and semantic) processing, but exclude speech perception, prosodic processing, higher-level discourse structure building, and at least some aspects of pragmatic reasoning. Some of these components of language (e.g. pragmatic reasoning) seem a priori unlikely to share resources with music. Others (e.g. speech perception) have been shown to robustly dissociate from music (Norman-Haignere et al. 2015; Overath et al. 2015; Kell et al. 2018; Boebinger et al. 2021). However, some components of speech and language may, and some do, draw on the same resources as aspects of music. For example, aspects of pitch perception have been argued to overlap between speech and music based on behavioral and neuropsychological evidence (e.g. Wong and Perrachione 2007; Perrachione et al. 2013; Patel et al. 2008b). Indeed, brain regions that selectively respond to pitched sounds have been previously reported (Patterson et al. 2002; Penagos et al. 2004; Norman-Haignere et al. 2013, 2015). Some studies have also suggested that music training may improve general rapid auditory processing and pitch encoding that are important for speech perception and language comprehension (e.g. Overy 2003; Tallal and Gaab 2006; Wong et al. 2007), although at least some of these effects likely originate in the brainstem and subcortical auditory regions (e.g. Wong et al. 2007). Other aspects of high-level auditory perception, including aspects of rhythm, may turn out to overlap as well, and deserve further investigation (see Patel 2008 for a review).

We also have focused on Western tonal instrumental music here. In the future, it would be useful to extend these findings to more diverse kinds of music. That said, given that individuals are most sensitive to structure in music with which they have experience (e.g. Cuddy et al. 1981; Cohen 1982; Curtis and Bharucha 2009), it seems unlikely that music from less familiar traditions would elicit a strong response in the language areas (see Boebinger et al. 2021 for evidence that music-selective areas of the auditory cortex respond to culturally diverse music styles). Furthermore, given that evolutionarily early forms of music were likely vocal (e.g. Trehub 2003; Mehr and Krasnow 2017), it would be useful to examine the responses of the language regions to vocal music without linguistic content, like humming or whistling. Based on preliminary unpublished data from our lab (available upon request), responses to such stimuli in the language areas appear low.

In conclusion, we have here provided extensive evidence against the role of the language network in music perception, including the processing of music structure. Although the relationship between music and aspects of speech and language will likely continue to generate interest in the research community, and aspects of speech and language other than those implemented in the core fronto-temporal language-selective network (Fedorenko et al. 2011; Fedorenko and Thompson-Schill 2014) may indeed share some processing resources with (aspects of) music, we hope that the current study helps bring clarity to the debate about structure processing in language and music.

Supplementary Material

Chen_et_al_SI_bhad087

Acknowledgments

We would like to acknowledge the Athinoula A. Martinos Imaging Center at the McGovern Institute for Brain Research at MIT, and its support team (Steve Shannon and Atsushi Takahashi). We thank former and current EvLab members for their help with fMRI data collection (especially Meilin Zhan for help with experiment 4). We thank Josh McDermott for input on many aspects of this work, Jason Rosenberg for composing the melodies used in experiments 2 and 3, and Zuzanna Balewski for help with creating the final materials used in experiments 2 and 3. For experiment 3, we thank Vitor Zimmerer for help with creating the grammaticality judgment task, Ted Gibson for help with collecting the control data, and Anya Ivanova for help with Fig. 2. For experiment 4, we thank Anne Cutler, Peter Graff, Morris Alper, Xiaoming Wang, Taibo Li, Terri Scott, Jeanne Gallée, and Lauren Clemens for help with constructing and/or recording and/or editing the language materials, and Fatemeh Khalilifar, Caitlyn Hoeflin, and Walid Bendris for help with selecting the music materials and with the experimental script. Finally, we thank the audience at the Society for Neuroscience conference (2014), the Neurobiology of Language conference (virtual edition, 2020), Ray Jackendoff, Dana Boebinger, and members of the Fedorenko and Gibson labs for helpful comments and discussions.

Contributor Information

Xuanyi Chen, Department of Cognitive Sciences, Rice University, TX 77005, United States; Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States; McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States.

Josef Affourtit, Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States; McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States.

Rachel Ryskin, Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States; McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States; Department of Cognitive & Information Sciences, University of California, Merced, Merced, CA 95343, United States.

Tamar I Regev, Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States; McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States.

Samuel Norman-Haignere, Department of Biostatistics & Computational Biology, University of Rochester Medical Center, Rochester, NY, United States; Department of Neuroscience, University of Rochester Medical Center, Rochester, NY, United States; Department of Biomedical Engineering, University of Rochester, Rochester, NY, United States; Department of Brain and Cognitive Sciences, University of Rochester, Rochester, NY, United States.

Olessia Jouravlev, Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States; McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States; Department of Cognitive Science, Carleton University, Ottawa, ON, Canada.

Saima Malik-Moraleda, Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States; McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States; The Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA 02138, United States.

Hope Kean, Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States; McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States.

Rosemary Varley, Psychology & Language Sciences, UCL, London, WCN1 1PF, United Kingdom.

Evelina Fedorenko, Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States; McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States; The Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA 02138, United States.


CRediT authors statement

Xuanyi Chen (Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft), Josef Affourtit (Investigation, Software, Writing – review & editing), Rachel Ryskin (Formal analysis, Investigation, Validation, Visualization, Writing – original draft), Tamar Regev (Methodology, Visualization, Writing – review & editing), Samuel Norman-Haignere (Methodology, Writing – review & editing), Olessia Jouravlev (Investigation, Writing – review & editing), Saima Malik-Moraleda (Investigation, Writing – review & editing), Hope Kean (Investigation, Writing – review & editing), Rosemary Varley (Conceptualization, Investigation, Methodology, Resources, Writing – original draft), Evelina Fedorenko (Conceptualization, Formal analysis, Investigation, Methodology, Project administration, Resources, Supervision, Writing – original draft)

Funding

The National Institutes of Health (F32-DC-015163 to RR, K99DC018051 to SNH, grant numbers R00HD057522, R01DC016607, R01DC016950, R01NS121471 to EF); National Science Foundation (graduate award to SNH); Howard Hughes Medical Institute/Life Sciences Research Foundation (postdoctoral award to SNH); La Caixa Fellowship (LCF/BQ/AA17/11610043 to SMM); Alzheimer’s Society and The Stroke Association to RV; the Paul and Lilah Newton Brain Science Award, and funds from the Brain and Cognitive Sciences Department, the McGovern Institute for Brain Research, and the Simons Center for the Social Brain to EF.

Conflict of interest statement

None declared.

Data availability

The data sets generated during and/or analyzed during the current study are available in the OSF repository: https://osf.io/68y7c/.

Code availability

Scripts for statistical analysis are available at: https://osf.io/68y7c/.

References

  1. Alcock  KJ, Wade  D, Anslow  P, Passingham  RE. Pitch and timing abilities in adult left-hemisphere-dysphasic and right-hemisphere-damaged subjects. Brain Lang. 2000:75(1):47–65. [DOI] [PubMed] [Google Scholar]
  2. Apperly  IA, Samson  D, Carroll  N, Hussain  S, Humphreys  G. Intact first-and second-order false belief reasoning in a patient with severely impaired grammar. Soc Neurosci. 2006:1(3-4):334–348. [DOI] [PubMed] [Google Scholar]
  3. Asano  R, Boeckx  C, Seifert  U. Hierarchical control as a shared neurocognitive mechanism for language and music. Cognition. 2021:216:104847. [DOI] [PubMed] [Google Scholar]
  4. Ashburner J, Friston KJ. Unified segmentation. Neuroimage. 2005:26(3):839–51. [DOI] [PubMed] [Google Scholar]
  5. Assem  M, Glasser  MF, Van Essen  DC, Duncan  J. A domain-general cognitive core defined in multimodally parcellated human cortex. Cereb Cortex. 2020:30(8):4361–4380. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Baillet  S. Forward and inverse problems of MEG/EEG. In: Jaeger  D, Jung  R, editors. Encyclopedia of computational neuroscience. New York (NY): Springer; 2014. pp. 1–8. [Google Scholar]
  7. Baroni  M, Maguire  S, Drabkin  W. The concept of musical grammar. Music Anal. 1983:2(2):175–208. [Google Scholar]
  8. Basilakos  A, Rorden  C, Bonilha  L, Moser  D, Fridriksson  J. Patterns of poststroke brain damage that predict speech production errors in apraxia of speech and aphasia dissociate. Stroke. 2015:46(6):1561–1566. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Basso  A, Capitani  E. Spared musical abilities in a conductor with global aphasia and ideomotor apraxia. J Neurol Neurosurg Psychiatry. 1985:48(5):407–412. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Bates  D, Mächler  M, Bolker  B, Walker  S. Fitting linear mixed-effects models using lme4. J Stat Softw. 2015:67(1):1–48. [Google Scholar]
  11. Bautista  A, Wilson  SM. Neural responses to grammatically and lexically degraded speech. Lang Cogn Neurosci. 2016:31(4):567–574. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Ben-Shachar  M, Hendler  T, Kahn  I, Ben-Bashat  D, Grodzinsky  Y. The neural reality of syntactic transformations: evidence from functional magnetic resonance imaging. Psychol Sci. 2003:14(5):433–440. [DOI] [PubMed] [Google Scholar]
  13. Benn  Y*, Ivanova  A*, Clark  O, Mineroff  Z, Seikus  C, Santos Silva  J, Varley  R, Fedorenko  E. No evidence for a special role of language in feature-based categorization. bioRxiv. 2021. [Google Scholar]
  14. Bernstein  L. The unanswered question: six talks at Harvard. Cambridge (MA): Harvard University Press; 1976. [Google Scholar]
  15. Bidelman  GM, Gandour  JT, Krishnan  A. Musicians and tone-language speakers share enhanced brainstem encoding but not perceptual benefits for musical pitch. Brain Cogn. 2011:77(1):1–10. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Bigand  E, Tillmann  B, Poulin  B, D'Adamo  DA, Madurell  F. The effect of harmonic context on phoneme monitoring in vocal music. Cognition. 2001:81(1):B11–B20. [DOI] [PubMed] [Google Scholar]
  17. Bigand  E, Delbé  C, Poulin-Charronnat  B, Leman  M, Tillmann  B. Empirical evidence for musical syntax processing? Computer simulations reveal the contribution of auditory short-term memory. Front Syst Neurosci. 2014:8:94. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Bishop  DVM, Norbury  CF. Exploring the borderlands of autistic disorder and specific language impairment: a study using standardised diagnostic instruments. J Child Psychol Psychiatry. 2002:43(7):917–929. [DOI] [PubMed] [Google Scholar]
  19. Blank  I, Kanwisher  N, Fedorenko  E. A functional dissociation between language and multiple-demand systems revealed in patterns of BOLD signal fluctuations. J Neurophysiol. 2014:112(5):1105–1118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Blank  I, Balewski  Z, Mahowald  K, Fedorenko  E. Syntactic processing is distributed across the language system. NeuroImage. 2016:127:307–323. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Blank  IA, Fedorenko  E. Domain-general brain regions do not track linguistic input as closely as language-selective regions. J Neurosci. 2017:37(41):9999–10011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Blank  IA, Kiran  S, Fedorenko  E. Can neuroimaging help aphasia researchers? Addressing generalizability, variability, and interpretability. Cogn Neuropsychol. 2017:34(6):377–393. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Boebinger  D, Norman-Haignere  SV, McDermott  JH, Kanwisher  N. Music-selective neural populations arise without musical training. J Neurophysiol. 2021:125(6):2237–2263. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Boilès  CL. Reconstruction of proto-melody. Anu Interam Investig Music. 1973:9:45–63. [Google Scholar]
  25. Bortolini  U, Leonard  LB, Caselli  MC. Specific language impairment in Italian and English: evaluating alternative accounts of grammatical deficits. Lang Cogn Process. 1998:13(1):1–20. [Google Scholar]
  26. Braga  RM, DiNicola  LM, Becker  HC, Buckner  RL. Situating the left-lateralized language network in the broader organization of multiple specialized large-scale distributed networks. J Neurophysiol. 2020:124(5):1415–1448. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Brown  DR. Human universals. Philadelphia (PA): Temple University Press; 1991. [Google Scholar]
  28. Brust  JC. Music and language: musical alexia and agraphia. Brain. 1980:103(2):367–392. [DOI] [PubMed] [Google Scholar]
  29. Brysbaert  M, Stevens  M. Power analysis and effect size in mixed effects models: a tutorial. J Cogn. 2018:1(1):9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Bybee  J. Language, usage and cognition. Cambridge (UK): Cambridge University Press; 2010. [Google Scholar]
  31. Caplan  D, Hildebrandt  N, Makris  N. Location of lesions in stroke patients with deficits in syntactic processing in sentence comprehension. Brain. 1996:119(3):933–949. [DOI] [PubMed] [Google Scholar]
  32. Caplan  D, Stanczak  L, Waters  G. Syntactic and thematic constraint effects on blood oxygenation level dependent signal correlates of comprehension of relative clauses. J Cogn Neurosci. 2008:20(4):643–656. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Caramazza  A, Zurif  EB. Dissociation of algorithmic and heuristic processes in language comprehension: evidence from aphasia. Brain Lang. 1976:3(4):572–582. [DOI] [PubMed] [Google Scholar]
  34. Chen  G, Taylor  PA, Cox  RW. Is the statistic value all we should care about in neuroimaging?  NeuroImage. 2017:147:952–959. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Chiappetta  B, Patel  AD, Thompson  CK. Musical and linguistic syntactic processing in agrammatic aphasia: an ERP study. J Neurolinguistics. 2022:62:101043. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Chomsky  N. Aspects of the theory of syntax. Cambridge (MA): MIT Press; 1965. [Google Scholar]
  37. Chomsky  N. The minimalist program. Cambridge (MA): MIT Press; 1995. [Google Scholar]
  38. Collins  T, Tillmann  B, Barrett  FS, Delbé  C, Janata  P. A combined model of sensory and cognitive representations underlying tonal expectations in music: from audio signals to behavior. Psychol Rev. 2014:121(1):33. [DOI] [PubMed] [Google Scholar]
  39. Cooke  A, Grossman  M, DeVita  C, Gonzalez-Atavales  J, Moore  P, Chen  W, Gee  J, Detre  J. Large-scale neural network for sentence processing. Brain Lang. 2006:96(1):14–36. [DOI] [PubMed] [Google Scholar]
  40. Cooper  R. Propositions pour un modele transformationnel de description musicale. Musique en Jeu. 1973:10:70–88. [Google Scholar]
  41. Corbetta  M, Shulman  GL. Control of goal-directed and stimulus-driven attention in the brain. Nat Rev Neurosci. 2002:3(3):201–215. [DOI] [PubMed] [Google Scholar]
  42. Corlett  PR, Mollick  JA, Kober  H. Substrates of human prediction error for incentives, perception, cognition, and action. psyarxiv. 2021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Crawford  JR, Garthwaite  PH. Comparison of a single case to a control or normative sample in neuropsychology: development of a Bayesian approach. Cogn Neuropsychol. 2007:24(4):343–372. [DOI] [PubMed] [Google Scholar]
  44. Creel  SC, Weng  M, Fu  G, Heyman  GD, Lee  K. Speaking a tone language enhances musical pitch perception in 3–5-year-olds. Dev Sci. 2018:21(1):e12503. [DOI] [PubMed] [Google Scholar]
  45. Cross I, Tolbert E. Music and meaning. In: The Oxford handbook of music psychology. Oxford: Oxford University Press; 2009. pp. 24–34. [Google Scholar]
  46. Crump  MJ, McDonnell  JV, Gureckis  TM. Evaluating Amazon's Mechanical Turk as a tool for experimental behavioral research. PLoS One. 2013:8(3):e57410. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Cohen  AJ. Exploring the sensitivity to structure in music. Can Univ Music Rev. 1982:3:15–30. [Google Scholar]
  48. Cuddy  LL, Cohen  AJ, Mewhort  DJK. Perception of structure in short melodic sequences. J Exp Psychol Hum Percept Perform. 1981:7:869–883. [DOI] [PubMed] [Google Scholar]
  49. Cumming  G. Understanding the new statistics: effect sizes, confidence intervals, and meta-analysis. New York (NY): Taylor & Francis; 2012. [Google Scholar]
  50. Curtis  ME, Bharucha  JJ. Memory and musical expectation for tones in cultural context. Music Percept. 2009:26:365–375. [Google Scholar]
  51. Dale  AM. Optimal experimental design for event-related fMRI. Hum Brain Mapp. 1999:8(2-3):109–114. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Darwin  C. The descent of man, and selection in relation to sex. London (UK): John Murray; 1871. [Google Scholar]
  53. Dasgupta  I, Gershman  SJ. Memory as a computational resource. Trends Cogn Sci. 2021:25(3):240–251. [DOI] [PubMed] [Google Scholar]
  54. Deen  B, Koldewyn  K, Kanwisher  N, Saxe  R. Functional organization of social perception and cognition in the superior temporal sulcus. Cereb Cortex. 2015:25(11):4596–4609. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Deen  B, Freiwald  WA. Parallel systems for social and spatial reasoning within the cortical apex. bioRxiv. 2021. [Google Scholar]
  56. Deutsch  D, Henthorn  T, Marvin  E, Xu  H. Absolute pitch among American and Chinese conservatory students: prevalence differences, and evidence for a speech-related critical period. J Acoust Soc Am. 2006:119(2):719–722. [DOI] [PubMed] [Google Scholar]
  57. Deutsch  D, Dooley  K, Henthorn  T, Head  B. Absolute pitch among students in an American music conservatory: association with tone language fluency. J Acoust Soc Am. 2009:125(4):2398–2403. [DOI] [PubMed] [Google Scholar]
  58. Diachek  E*, Blank  I*, Siegelman  M*, Affourtit  J, Fedorenko  E. The domain-general multiple demand (MD) network does not support core aspects of language comprehension: a large-scale fMRI investigation. J Neurosci. 2020:40(23):4536–4550. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Dick  F, Bates  E, Wulfeck  B, Utman  JA, Dronkers  N, Gernsbacher  MA. Language deficits, localization, and grammar: evidence for a distributive model of language breakdown in aphasic patients and neurologically intact individuals. Psychol Rev. 2001:108(4):759–788. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Ding  J, Martin  RC, Hamilton  AC, Schnur  TT. Dissociation between frontal and temporal-parietal contributions to connected speech in acute stroke. Brain. 2020:143(3):862–876. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Duncan  J, Owen  AM. Common regions of the human frontal lobe recruited by diverse cognitive demands. Trends Neurosci. 2000:23(10):475–483. [DOI] [PubMed] [Google Scholar]
  62. Duncan  J. An adaptive coding model of neural function in prefrontal cortex. Nat Rev Neurosci. 2001:2(11):820–829. [DOI] [PubMed] [Google Scholar]
  63. Duncan  J. The multiple-demand (MD) system of the primate brain: mental programs for intelligent behaviour. Trends Cogn Sci. 2010:14(4):172–179. [DOI] [PubMed] [Google Scholar]
  64. Duncan  J. The structure of cognition: attentional episodes in mind and brain. Neuron. 2013:80(1):35–50. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Embick  D, Marantz  A, Miyashita  Y, O'Neil  W, Sakai  KL. A syntactic specialization for Broca's area. Proc Natl Acad Sci U S A. 2000:97(11):6150–6154. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Fadiga  L, Craighero  L, D'Ausilio  A. Broca's area in language, action, and music. Ann N Y Acad Sci. 2009:1169(1):448–458. [DOI] [PubMed] [Google Scholar]
  67. Fai  AH-T, Cornelius  PL. Approximate F-tests of multiple degree of freedom hypotheses in generalized least squares analyses of unbalanced split-plot experiments. J Stat Comput Simul. 1996:54(4):363–378. [Google Scholar]
  68. Fancourt  A. Exploring musical cognition in children with specific language impairment.  doctoral thesis. London (UK): Goldsmiths, University of London; 2013. [Google Scholar]
  69. Faroqi-Shah  Y, Slevc  LR, Saxena  S, Fisher  SJ, Pifer  M. Relationship between musical and language abilities in post-stroke aphasia. Aphasiology. 2020:34(7):793–819. [Google Scholar]
  70. Fay  T. Perceived hierarchic structure in language and music. J Music Theory. 1971:15(1/2):112–137. [Google Scholar]
  71. Fedorenko  E, Patel  A, Casasanto  D, Winawer  J, Gibson  E. Structural integration in language and music: evidence for a shared system. Mem Cogn. 2009:37(1):1–9. [DOI] [PubMed] [Google Scholar]
  72. Fedorenko  E, Hsieh  P-J, Nieto-Castañon  A, Whitfield-Gabrieli  S, Kanwisher  N. A new method for fMRI investigations of language: defining ROIs functionally in individual subjects. J Neurophysiol. 2010:104(2):1177–1194. [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Fedorenko  E, Behr  M, Kanwisher  N. Functional specificity for high-level linguistic processing in the human brain. Proc Natl Acad Sci U S A. 2011:108(39):16428–16433. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Fedorenko  E, Duncan  J, Kanwisher  N. Language-selective and domain-general regions lie side by side within Broca’s area. Curr Biol. 2012a:22(21):2059–2062. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Fedorenko  E, Nieto-Castañon  A, Kanwisher  N. Lexical and syntactic representations in the brain: an fMRI investigation with multi-voxel pattern analyses. Neuropsychologia. 2012b:50(4):499–513. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Fedorenko  E, McDermott  J, Norman-Haignere  S, Kanwisher  N. Sensitivity to musical structure in the human brain. J Neurophysiol. 2012c:108(12):3289–3300. [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Fedorenko  E, Duncan  J, Kanwisher  N. Broad domain-generality in focal regions of frontal and parietal cortex. Proc Natl Acad Sci U S A. 2013:110(41):16616–16621. [DOI] [PMC free article] [PubMed] [Google Scholar]
  78. Fedorenko  E. The role of domain-general cognitive control in language comprehension. Front Psychol. 2014:5:335. [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Fedorenko  E, Thompson-Schill  SL. Reworking the language network. Trends Cogn Sci. 2014:18(3):120–126. [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Fedorenko  E, Varley  R. Language and thought are not the same thing: evidence from neuroimaging and neurological patients. Ann N Y Acad Sci. 2016:1369(1):132–153. [DOI] [PMC free article] [PubMed] [Google Scholar]
  81. Fedorenko  E, Blank  I. Broca’s area is not a natural kind. Trends Cogn Sci. 2020:24(4):270–284. [DOI] [PMC free article] [PubMed] [Google Scholar]
  82. Fedorenko  E, Blank  I, Siegelman  M, Mineroff  Z. Lack of selectivity for syntax relative to word meanings throughout the language network. Cognition. 2020:203:104348. [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Fedorenko  E. The early origins and the growing popularity of the individual-subject analytic approach in human neuroscience. Curr Opin Behav Sci. 2021:40:105–112. [Google Scholar]
  84. Fedorenko  E, Shain  C. Similarity of computations across domains does not imply shared implementation: the case of language comprehension. Curr Dir Psychol Sci. 2021:30(6):526–534. [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Fischl  B, Rajendran  N, Busa  E, Augustinack  J, Hinds  O, Yeo  BT, Mohlberg  H, Amunts  K, Zilles  K. Cortical folding patterns and predicting cytoarchitecture. Cereb Cortex. 2008:18(8):1973–1980. [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Fitch  WT, Martins  MD. Hierarchical processing in music, language, and action: Lashley revisited. Ann N Y Acad Sci. 2014:1316(1):87–104. [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Fodor  JD. Phrase structure parsing and the island constraints. Linguist Philos. 1983:6(2):163–223. [Google Scholar]
  88. Fouragnan  E, Retzler  C, Philiastides  MG. Separate neural representations of prediction error valence and surprise: evidence from an fMRI meta-analysis. Hum Brain Mapp. 2018:39(7):2887–2906. [DOI] [PMC free article] [PubMed] [Google Scholar]
  89. Franklin  S, Turner  JE, Ellis  AW. ADA comprehension battery. York: University of York; 1992. [Google Scholar]
  90. Friederici  AD, Fiebach  CJ, Schlesewsky  M, Bornkessel  ID, Von Cramon  DY. Processing linguistic complexity and grammaticality in the left frontal cortex. Cereb Cortex. 2006:16(12):1709–1717. [DOI] [PubMed] [Google Scholar]
  91. Friederici  AD, Kotz  SA, Scott  SK, Obleser  J. Disentangling syntax and intelligibility in auditory language comprehension. Hum Brain Mapp. 2010:31(3):448–457. [DOI] [PMC free article] [PubMed] [Google Scholar]
  92. Friederici  AD. The brain basis of language processing: from structure to function. Physiol Rev. 2011:91(4):1357–1392. [DOI] [PubMed] [Google Scholar]
  93. Friederici  AD. The neural basis for human syntax: Broca's area and beyond. Curr Opin Behav Sci. 2018:21:88–92. [Google Scholar]
  94. Friston KJ, Ashburner J, Frith CD, Poline JB, Heather JD, Frackowiak RS. Spatial registration and normalization of images. Hum Brain Mapp. 1995:3(3):165–189. [Google Scholar]
  95. Frost  MA, Goebel  R. Measuring structural–functional correspondence: spatial variability of specialised brain regions after macro-anatomical alignment. NeuroImage. 2012:59(2):1369–1381. [DOI] [PubMed] [Google Scholar]
  96. Giesbrecht  F, Burns  J. Two-stage analysis based on a mixed model: large-sample asymptotic theory and small-sample simulation results. Biometrics. 1985:41(2):477–486. [Google Scholar]
  97. Goldberg  AE. Construction grammar. In: Nadel  L, editor. Encyclopedia of cognitive science. Stuttgart: Macmillan; 2002:1–4. [Google Scholar]
  98. Green  DM, Swets  JA. Signal detection theory and psychophysics. New York (NY): Wiley; 1966. [Google Scholar]
  99. Guenther  FH. Neural control of speech. Cambridge (MA): MIT Press; 2016. [Google Scholar]
  100. Hagoort  P, Indefrey  P. The neurobiology of language beyond single words. Annu Rev Neurosci. 2014:37:347–362. [DOI] [PubMed] [Google Scholar]
  101. Hasson  U, Chen  J, Honey  CJ. Hierarchical process memory: memory as an integral component of information processing. Trends Cogn Sci. 2015:19(6):304–313. [DOI] [PMC free article] [PubMed] [Google Scholar]
  102. Herholz  SC, Zatorre  RJ. Musical training as a framework for brain plasticity: behavior, function, and structure. Neuron. 2012:76(3):486–502. [DOI] [PubMed] [Google Scholar]
  103. Herrmann  B, Obleser  J, Kalberlah  C, Haynes  JD, Friederici  AD. Dissociable neural imprints of perception and grammar in auditory functional imaging. Hum Brain Mapp. 2012:33(3):584–595. [DOI] [PMC free article] [PubMed] [Google Scholar]
  104. Hoch  L, Poulin-Charronnat  B, Tillmann  B. The influence of task-irrelevant music on language processing: syntactic and semantic structures. Front Psychol. 2011:2:112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  105. Ivanova  A, Srikant  S, Sueoka  Y, Kean  H, Dhamala  R, O’Reilly  U-M, Bers  MU, Fedorenko  E. Comprehension of computer code relies primarily on domain-general executive resources. eLife. 2020:9:e58906. [DOI] [PMC free article] [PubMed] [Google Scholar]
  106. Ivanova  A, Mineroff  Z, Zimmerer  V, Kanwisher  N, Varley  R, Fedorenko  E. The language network is recruited but not required for non-verbal semantic processing. bioRxiv. 2021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  107. Jackendoff  R. English particle constructions, the lexicon, and the autonomy of syntax. In: Dehé  N, Jackendoff  R, McIntyre  A, Urban  S, editors. Verb-particle explorations. Berlin: De Gruyter; 2002. pp. 67–94. [Google Scholar]
  108. Jackendoff  R. A parallel architecture perspective on language processing. Brain Res. 2007:1146:2–22. [DOI] [PubMed] [Google Scholar]
  109. Jackendoff  R. Parallels and nonparallels between language and music. Music Percept. 2009:26(3):195–204. [Google Scholar]
  110. Jackendoff  R, Audring  J. The texture of the lexicon: relational morphology and the parallel architecture. Oxford: Oxford University Press; 2020. [Google Scholar]
  111. Janata  P. ERP measures assay the degree of expectancy violation of harmonic contexts in music. J Cogn Neurosci. 1995:7(2):153–164. [DOI] [PubMed] [Google Scholar]
  112. Jentschke  S, Koelsch  S, Sallat  S, Friederici  AD. Children with specific language impairment also show impairment of music-syntactic processing. J Cogn Neurosci. 2008:20(11):1940–1951. [DOI] [PubMed] [Google Scholar]
  113. Jouravlev  O, Zheng  D, Balewski  Z, Pongos  A, Levan  Z, Goldin-Meadow  S, Fedorenko  E. Speech-accompanying gestures are not processed by the language-processing mechanisms. Neuropsychologia. 2019:132:107132. [DOI] [PMC free article] [PubMed] [Google Scholar]
  114. Jouravlev  O, Kell  A, Mineroff  Z, Haskins  AJ, Ayyash  D, Kanwisher  N, Fedorenko  E. Reduced language lateralization in autism and the broader autism phenotype as assessed with robust individual-subjects analyses. Autism Res. 2020:13(10):1746–1761. [DOI] [PubMed] [Google Scholar]
  115. Just  MA, Carpenter  PA, Keller  TA, Eddy  WF, Thulborn  KR. Brain activation modulated by sentence comprehension. Science. 1996:274(5284):114–116. [DOI] [PubMed] [Google Scholar]
  116. Kaplan  E, Goodglass  H, Weintraub  S. Boston naming test. 2nd ed. Philadelphia (PA): Lippincott Williams & Wilkins; 2001. [Google Scholar]
  117. Kay  J, Lesser  R, Coltheart  M. Psycholinguistic assessments of language processing in aphasia (PALPA). Hove (UK): Lawrence Erlbaum; 1992. [Google Scholar]
  118. Kell  AJ, Yamins  DL, Shook  EN, Norman-Haignere  SV, McDermott  JH. A task-optimized neural network replicates human auditory behavior, predicts brain responses, and reveals a cortical processing hierarchy. Neuron. 2018:98(3):630–644. [DOI] [PubMed] [Google Scholar]
  119. Keller  TA, Carpenter  PA, Just  MA. The neural bases of sentence comprehension: a fMRI examination of syntactic and lexical processing. Cereb Cortex. 2001:11(3):223–237. [DOI] [PubMed] [Google Scholar]
  120. Koelsch  S, Gunter  T, Friederici  AD, Schröger  E. Brain indices of music processing: “nonmusicians” are musical. J Cogn Neurosci. 2000:12(3):520–541. [DOI] [PubMed] [Google Scholar]
  121. Koelsch  S, Gunter  TC, Schröger  E, Tervaniemi  M, Sammler  D, Friederici  AD. Differentiating ERAN and MMN: an ERP study. Neuroreport. 2001:12(7):1385–1389. [DOI] [PubMed] [Google Scholar]
  122. Koelsch  S, Gunter  TC, von  Cramon  DY, Zysset  S, Lohmann  G, Friederici  AD. Bach speaks: a cortical “language-network” serves the processing of music. NeuroImage. 2002:17(2):956–966. [PubMed] [Google Scholar]
  123. Koelsch  S. Significance of Broca's area and ventral premotor cortex for music-syntactic processing. Cortex. 2006:42(4):518–520. [DOI] [PubMed] [Google Scholar]
  124. Koelsch  S, Jentschke  S, Sammler  D, Mietchen  D. Untangling syntactic and sensory processing: an ERP study of music perception. Psychophysiology. 2007:44(3):476–490. [DOI] [PubMed] [Google Scholar]
  125. Koelsch  S, Rohrmeier  M, Torrecuso  R, Jentschke  S. Processing of hierarchical syntactic structure in music. Proc Natl Acad Sci U S A. 2013:110(38):15443–15448. [DOI] [PMC free article] [PubMed] [Google Scholar]
  126. Kriegeskorte  N, Simmons  WK, Bellgowan  PS, Baker  CI. Circular analysis in systems neuroscience: the dangers of double dipping. Nat Neurosci. 2009:12(5):535. [DOI] [PMC free article] [PubMed] [Google Scholar]
  127. Krumhansl  CL, Keil  FC. Acquisition of the hierarchy of tonal functions in music. Mem Cogn. 1982:10(3):243–251. [DOI] [PubMed] [Google Scholar]
  128. Kunert  R, Slevc  LR. A commentary on: “neural overlap in processing music and speech”. Front Hum Neurosci. 2015:9:330. [DOI] [PMC free article] [PubMed] [Google Scholar]
  129. Kunert  R, Willems  RM, Casasanto  D, Patel  AD, Hagoort  P. Music and language syntax interact in Broca’s area: an fMRI study. PLoS One. 2015:10(11):e0141069. [DOI] [PMC free article] [PubMed] [Google Scholar]
  130. Kunert  R, Willems  RM, Hagoort  P. Language influences music harmony perception: effects of shared syntactic integration resources beyond attention. R Soc Open Sci. 2016:3(2):150685. [DOI] [PMC free article] [PubMed] [Google Scholar]
  131. Kuperberg  GR, Holcomb  PJ, Sitnikova  T, Greve  D, Dale  AM, Caplan  D. Distinct patterns of neural modulation during the processing of conceptual and syntactic anomalies. J Cogn Neurosci. 2003:15(2):272–293. [DOI] [PubMed] [Google Scholar]
  132. Kuznetsova  A, Brockhoff  PB, Christensen  RH. lmerTest package: tests in linear mixed effects models. J Stat Softw. 2017:82(13):1–26. [Google Scholar]
  133. LaCroix  A, Diaz  AF, Rogalsky  C. The relationship between the neural computations for speech and music perception is context-dependent: an activation likelihood estimate study. Front Psychol. 2015:6:1138. [DOI] [PMC free article] [PubMed] [Google Scholar]
  134. Lartillot  O, Grandjean  D. Tempo and metrical analysis by tracking multiple metrical levels using autocorrelation. Appl Sci. 2019:9(23):5121. [Google Scholar]
  135. Lartillot  O, Toiviainen  P. A MATLAB toolbox for musical feature extraction from audio. In: Proceedings of the 10th International Conference on Digital Audio Effects; 2007; Bordeaux, France. p. 244. [Google Scholar]
  136. Lecours  A, Joanette  Y. Linguistic and other psychological aspects of paroxysmal aphasia. Brain Lang. 1980:10(1):1–23. [DOI] [PubMed] [Google Scholar]
  137. Lerdahl  F, Jackendoff  R. Toward a formal theory of tonal music. J Music Theory. 1977:21(1):111–171. [Google Scholar]
  138. Lerdahl  F, Jackendoff  R. An overview of hierarchical structure in music. Music Percept. 1983:1(2):229–252. [Google Scholar]
  139. Levin  B, Rappaport-Hovav  M. Argument realization. Cambridge: Cambridge University Press; 2005. [Google Scholar]
  140. Levitin  DJ, Menon  V. Musical structure is processed in “language” areas of the brain: a possible role for Brodmann Area 47 in temporal coherence. NeuroImage. 2003:20(4):2142–2152. [DOI] [PubMed] [Google Scholar]
  141. Linebarger  MC, Schwartz  MF, Saffran  EM. Sensitivity to grammatical structure in so-called agrammatic aphasics. Cognition. 1983:13(3):361–392. [DOI] [PubMed] [Google Scholar]
  142. Lindblom  B, Sundberg  J. Towards a generative theory of melody. Speech transmission laboratory. Q Prog Status Rep. 1969:10:53–86. [Google Scholar]
  143. Lipkin  B, Tuckute  G, Affourtit  J, Small  H, Mineroff  Z, Jouravlev  O, Rakocevic  L, Pritchett  B, et al.  Probabilistic atlas for the language network based on precision fMRI data from >800 individuals. Sci Data. 2022:9(1):1–10. [DOI] [PMC free article] [PubMed] [Google Scholar]
  144. Liu  YF, Kim  J, Wilson  C, Bedny  M. Computer code comprehension shares neural resources with formal logical inference in the fronto-parietal network. eLife. 2020:9:e59340. [DOI] [PMC free article] [PubMed] [Google Scholar]
  145. Liu  J, Hilton  CB, Bergelson  E, Mehr  SA. Language experience shapes music processing across 40 tonal, pitch-accented, and non-tonal languages. bioRxiv. 2021. [Google Scholar]
  146. Luria  AR, Tsvetkova  LS, Futer  DS. Aphasia in a composer. J Neurol Sci. 1965:2(3):288–292. [DOI] [PubMed] [Google Scholar]
  147. Maess  B, Koelsch  S, Gunter  TC, Friederici  AD. Musical syntax is processed in Broca's area: an MEG study. Nat Neurosci. 2001:4(5):540–545. [DOI] [PubMed] [Google Scholar]
  148. Mahowald  K, Fedorenko  E. Reliable individual-level neural markers of high-level language processing: a necessary precursor for relating neural variability to behavioral and genetic variability. NeuroImage. 2016:139:74–93. [DOI] [PubMed] [Google Scholar]
  149. Makowski  D. The psycho package: an efficient and publishing-oriented workflow for psychological science. J Open Source Softw. 2018:3(22):470. [Google Scholar]
  150. Malik-Moraleda  S*, Ayyash  D*, Gallée  J, Affourtit  J, Hoffmann  M, Mineroff  Z, Jouravlev  O, Fedorenko  E. An investigation across 45 languages and 12 language families reveals a universal language network. Nat Neurosci. 2022:25(8):1014–1019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  151. Marin  OSM. Neurological aspects of music perception and performance. New York (NY): Academic Press; 1982. [Google Scholar]
  152. Matchin  W, Hickok  G. The cortical organization of syntax. Cereb Cortex. 2020:30(3):1481–1498. [DOI] [PMC free article] [PubMed] [Google Scholar]
  153. Mehr  SA, Krasnow  MM. Parent-offspring conflict and the evolution of infant-directed song. Evol Hum Behav. 2017:38(5):674–684. [Google Scholar]
  154. Mehr  SA, Singh  M, Knox  D, Ketter  DM, Pickens-Jones  D, Atwood  S, Lucas  C, Jacoby  N, Egner  AA, Hopkins  EJ, et al.  Universality and diversity in human song. Science. 2019:366(6468):eaax0868. [DOI] [PMC free article] [PubMed] [Google Scholar]
  155. Mehr  SA, Krasnow  M, Bryant  G, Hagen  E. Origins of music in credible signaling. Behav Brain Sci. 2020:44:e60. [DOI] [PMC free article] [PubMed] [Google Scholar]
  156. Mesulam  MM, Rogalski  EJ, Wieneke  C, Hurley  RS, Geula  C, Bigio  EH, Thompson  CK, Weintraub  S. Primary progressive aphasia and the evolving neurology of the language network. Nat Rev Neurol. 2014:10(10):554. [DOI] [PMC free article] [PubMed] [Google Scholar]
  157. Mesulam  MM, Thompson  CK, Weintraub  S, Rogalski  EJ. The Wernicke conundrum and the anatomy of language comprehension in primary progressive aphasia. Brain. 2015:138(8):2423–2437. [DOI] [PMC free article] [PubMed] [Google Scholar]
  158. Meyer LB. On rehearing music. J Am Musicol Soc. 1961:14(2):257–267. [Google Scholar]
  159. McDermott  J. The evolution of music. Nature. 2008:453(7193):287–288. [DOI] [PubMed] [Google Scholar]
  160. Mineroff  Z*, Blank  I*, Mahowald  K, Fedorenko  E. A robust dissociation among the language, multiple demand, and default mode networks: evidence from inter-region correlations in effect size. Neuropsychologia. 2018:119:501–511. [DOI] [PMC free article] [PubMed] [Google Scholar]
  161. Monti  MM, Parsons  LM, Osherson  DN. The boundaries of language and thought in deductive inference. Proc Natl Acad Sci U S A. 2009:106(30):12554–12559. [DOI] [PMC free article] [PubMed] [Google Scholar]
  162. Monti  MM, Parsons  LM, Osherson  DN. Thought beyond language: neural dissociation of algebra and natural language. Psychol Sci. 2012:23(8):914–922. [DOI] [PubMed] [Google Scholar]
  163. Morosan  P, Rademacher  J, Schleicher  A, Amunts  K, Schormann  T, Zilles  K. Human primary auditory cortex: cytoarchitectonic subdivisions and mapping into a spatial reference system. NeuroImage. 2001:13(4):684–701. [DOI] [PubMed] [Google Scholar]
  164. Musso  M, Weiller  C, Horn  A, Glauche  V, Umarova  R, Hennig  J, Schneider  A, Rijntjes  M. A single dual-stream framework for syntactic computations in music and language. NeuroImage. 2015:117:267–283. [DOI] [PubMed] [Google Scholar]
  165. Nettl  B. The study of ethnomusicology: thirty-three discussions. Champaign (IL): University of Illinois Press; 2015. [Google Scholar]
  166. Newman  AJ, Pancheva  R, Ozawa  K, Neville  HJ, Ullman  MT. An event-related fMRI study of syntactic and semantic violations. J Psycholinguist Res. 2001:30(3):339–364. [DOI] [PubMed] [Google Scholar]
  167. Ngo  MK, Vu  KPL, Strybel  TZ. Effects of music and tonal language experience on relative pitch performance. Am J Psychol. 2016:129(2):125–134. [DOI] [PubMed] [Google Scholar]
  168. Nieto-Castañón  A, Fedorenko  E. Subject-specific functional localizers increase sensitivity and functional resolution of multi-subject analyses. NeuroImage. 2012:63(3):1646–1669. [DOI] [PMC free article] [PubMed] [Google Scholar]
  169. Nieto-Castañón  A. Handbook of functional connectivity Magnetic Resonance Imaging methods in CONN. Hilbert Press; 2020. [Google Scholar]
  170. Norman-Haignere  S, Kanwisher  N, McDermott  JH. Cortical pitch regions in humans respond primarily to resolved harmonics and are located in specific tonotopic regions of anterior auditory cortex. J Neurosci. 2013:33(50):19451–19469. [DOI] [PMC free article] [PubMed] [Google Scholar]
  171. Norman-Haignere  S, Kanwisher  NG, McDermott  JH. Distinct cortical pathways for music and speech revealed by hypothesis-free voxel decomposition. Neuron. 2015:88(6):1281–1296. [DOI] [PMC free article] [PubMed] [Google Scholar]
  172. Norman-Haignere  SV, Feather  J, Boebinger  D, Brunner  P, Ritaccio  A, McDermott  JH, Schalk  G, Kanwisher  N. A neural population selective for song in human auditory cortex. Curr Biol. 2022:32(7):1470–1484. [DOI] [PMC free article] [PubMed] [Google Scholar]
  173. Oldfield  RC. The assessment and analysis of handedness: the Edinburgh inventory. Neuropsychologia. 1971:9(1):97–113. [DOI] [PubMed] [Google Scholar]
  174. Omigie  D, Samson  S. A protective effect of musical expertise on cognitive outcome following brain damage?  Neuropsychol Rev. 2014:24(4):445–460. [DOI] [PubMed] [Google Scholar]
  175. Overath  T, McDermott  JH, Zarate  JM, Poeppel  D. The cortical analysis of speech-specific temporal structure revealed by responses to sound quilts. Nat Neurosci. 2015:18(6):903–911. [DOI] [PMC free article] [PubMed] [Google Scholar]
  176. Overy K. Dyslexia and music: from timing deficits to musical intervention. Ann N Y Acad Sci. 2003:999(1):497–505. [DOI] [PubMed] [Google Scholar]
  177. Pallier  C, Devauchelle  AD, Dehaene  S. Cortical representation of the constituent structure of sentences. Proc Natl Acad Sci U S A. 2011:108(6):2522–2527. [DOI] [PMC free article] [PubMed] [Google Scholar]
  178. Patel  AD, Gibson  E, Ratner  J, Besson  M, Holcomb  PJ. Processing syntactic relations in language and music: an event-related potential study. J Cogn Neurosci. 1998:10(6):717–733. [DOI] [PubMed] [Google Scholar]
  179. Patel  AD. Language, music, syntax and the brain. Nat Neurosci. 2003:6(7):674–681. [DOI] [PubMed] [Google Scholar]
  180. Patel  AD. Music, language, and the brain. Oxford (UK): Oxford University Press; 2008. [Google Scholar]
  181. Patel  AD, Iversen  JR, Wassenaar  M, Hagoort  P. Musical syntactic processing in agrammatic Broca's aphasia. Aphasiology. 2008a:22(7-8):776–789. [Google Scholar]
  182. Patel  AD, Wong  M, Foxton  J, Lochy  A, Peretz  I. Speech intonation perception deficits in musical tone deafness (congenital amusia). Music Percept. 2008b:25(4):357–368. [Google Scholar]
  183. Patel  AD. Language, music, and the brain: a resource-sharing framework. In: Rebuschat  P, Rohrmeier  M, Hawkins  J, Cross  I, editors. Language and music as cognitive systems. Oxford: Oxford University Press; 2012. pp. 204–223. [Google Scholar]
  184. Patel  AD, Morgan  E. Exploring cognitive relations between prediction in language and music. Cogn Sci. 2017:41:303–320. [DOI] [PubMed] [Google Scholar]
  185. Patterson  RD, Uppenkamp  S, Johnsrude  IS, Griffiths  TD. The processing of temporal pitch and melody information in auditory cortex. Neuron. 2002:36(4):767–776. [DOI] [PubMed] [Google Scholar]
  186. Paunov  A, Blank  IA, Fedorenko  E. Functionally distinct language and theory of mind networks are synchronized at rest and during language comprehension. J Neurophysiol. 2019:121:1244–1265. [DOI] [PMC free article] [PubMed] [Google Scholar]
  187. Paunov  AM, Blank  IA, Jouravlev  O, Mineroff  Z, Gallée  J, Fedorenko  E. Differential tracking of linguistic vs. mental state content in naturalistic stimuli by language and theory of mind (ToM) brain networks. Neurobiol Lang. 2022:3(3):413–440. [DOI] [PMC free article] [PubMed] [Google Scholar]
  188. Peelle  JE, Troiani  V, Wingfield  A, Grossman  M. Neural processing during older adults’ comprehension of spoken sentences: age differences in resource allocation and connectivity. Cereb Cortex. 2010:20(4):773–782. [DOI] [PMC free article] [PubMed] [Google Scholar]
  189. Penagos  H, Melcher  JR, Oxenham  AJ. A neural representation of pitch salience in nonprimary human auditory cortex revealed with functional magnetic resonance imaging. J Neurosci. 2004:24(30):6810–6815. [DOI] [PMC free article] [PubMed] [Google Scholar]
  190. Peretz  I. Processing of local and global musical information by unilateral brain-damaged patients. Brain. 1990:113(4):1185–1205. [DOI] [PubMed] [Google Scholar]
  191. Peretz  I, Kolinsky  R, Tramo  M, Labrecque  R, Hublet  C, Demeurisse  G, Belleville  S. Functional dissociations following bilateral lesions of auditory cortex. Brain. 1994:117(6):1283–1301. [DOI] [PubMed] [Google Scholar]
  192. Peretz  I, Belleville  S, Fontaine  S. Dissociations between music and language functions after cerebral resection: a new case of amusia without aphasia. Can J Exp Psychol. 1997:51(4):354–368. [PubMed] [Google Scholar]
  193. Peretz  I, Champod  AS, Hyde  K. Varieties of musical disorders: the Montreal Battery of Evaluation of Amusia. Ann N Y Acad Sci. 2003:999(1):58–75. [DOI] [PubMed] [Google Scholar]
  194. Peretz  I, Coltheart  M. Modularity of music processing. Nat Neurosci. 2003:6(7):688–691. [DOI] [PubMed] [Google Scholar]
  195. Peretz  I, Vuvan  D, Lagrois  MÉ, Armony  JL. Neural overlap in processing music and speech. Philos Trans R Soc Lond Ser B Biol Sci. 2015:370(1664):20140090. [DOI] [PMC free article] [PubMed] [Google Scholar]
  196. Perrachione  TK, Fedorenko  EG, Vinke  L, Gibson  E, Dilley  LC. Evidence for shared cognitive processing of pitch in music and language. PLoS One. 2013:8(8):e73372. [DOI] [PMC free article] [PubMed] [Google Scholar]
  197. Perruchet  P, Poulin-Charronnat  B. Challenging prior evidence for a shared syntactic processor for language and music. Psychon Bull Rev. 2013:20(2):310–317. [DOI] [PubMed] [Google Scholar]
  198. Piccirilli  M, Sciarma  T, Luzzi  S. Modularity of music: evidence from a case of pure amusia. J Neurol Neurosurg Psychiatry. 2000:69(4):541–545. [DOI] [PMC free article] [PubMed] [Google Scholar]
  199. Pinker  S, Prince  A. On language and connectionism: analysis of a parallel distributed processing model of language acquisition. Cognition. 1988:28(1-2):73–193. [DOI] [PubMed] [Google Scholar]
  200. Pinker  S. Rules of language. Science. 1991:253(5019):530–535. [DOI] [PubMed] [Google Scholar]
  201. Pinker  S. The language instinct: how the mind creates language. New York (NY): Harper Collins Publishers, Inc.; 1994. [Google Scholar]
  202. Pinker  S. Out of the minds of babes. Science. 1999:283(5398):40–41. [DOI] [PubMed] [Google Scholar]
  203. Poldrack  RA. Can cognitive processes be inferred from neuroimaging data?  Trends Cogn Sci. 2006:10(2):59–63. [DOI] [PubMed] [Google Scholar]
  204. Poldrack  RA. Inferring mental states from neuroimaging data: from reverse inference to large-scale decoding. Neuron. 2011:72(5):692–697. [DOI] [PMC free article] [PubMed] [Google Scholar]
  205. Polk  M, Kertesz  A. Music and language in degenerative disease of the brain. Brain Cogn. 1993:22(1):98–117. [DOI] [PubMed] [Google Scholar]
  206. Poulin-Charronnat  B, Bigand  E, Madurell  F, Peereman  R. Musical structure modulates semantic priming in vocal music. Cognition. 2005:94:B67–B78. [DOI] [PubMed] [Google Scholar]
  207. Pritchett  B, Hoeflin  C, Koldewyn  K, Dechter  E, Fedorenko  E. High-level language processing regions are not engaged in action observation or imitation. J Neurophysiol. 2018:120(5):2555–2570. [DOI] [PMC free article] [PubMed] [Google Scholar]
  208. Raffman  D. Language, music, and mind. Cambridge (MA): MIT Press; 1993. [Google Scholar]
  209. Regev TI, Affourtit J, Chen X, Schipper AE, Bergen L, Mahowald K, Fedorenko E. High-level language brain regions are sensitive to sub-lexical regularities. bioRxiv. 2021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  210. Riemann  H. Musikalische Syntaxis: Grundriss einer harmonischen Satzbildungslehre. Leipzig: Breitkopf und Härtel; 1877. [Google Scholar]
  211. Roads  C, Wieneke  P. Grammars as representations for music. Comput Music J. 1979:3(1):48–55. [Google Scholar]
  212. Roberts  I. Comments and a conjecture inspired by Fabb and Halle. In: Rebuschat  P, Rohrmeier  M, Hawkins  JA, Cross  I, editors. Language and music as cognitive systems. Oxford: Oxford University Press; 2012. pp. 51–66. [Google Scholar]
  213. Röder  B, Stock  O, Neville  H, Bien  S, Rösler  F. Brain activation modulated by the comprehension of normal and pseudo-word sentences of different processing demands: a functional magnetic resonance imaging study. NeuroImage. 2002:15(4):1003–1014. [DOI] [PubMed] [Google Scholar]
  214. Rogalsky C, Rong F, Saberi K, Hickok G. Functional anatomy of language and music perception: temporal and structural factors investigated using functional magnetic resonance imaging. J Neurosci. 2011:31(10):3843–3852. [DOI] [PMC free article] [PubMed] [Google Scholar]
  215. Rueffler  C, Hermisson  J, Wagner  GP. Evolution of functional specialization and division of labor. Proc Natl Acad Sci U S A. 2012:109(6):E326–E335. [DOI] [PMC free article] [PubMed] [Google Scholar]
  216. Sag  I, Wasow  T, Bender  E. Syntactic theory: a formal introduction. Stanford (CA): CSLI Publications; 2003. [Google Scholar]
  217. Salvo JJ, Holubecki AM, Braga RM. Correspondence between functional connectivity and task-related activity patterns within the individual. Curr Opin Behav Sci. 2021:40:178–188. [Google Scholar]
  218. Sammler  D, Koelsch  S, Ball  T, Brandt  A, Elger  CE, Friederici  AD, Grigutsch  M, Huppertz  H-J, Knosche  TR, Wellmer  J, et al.  Overlap of musical and linguistic syntax processing: intracranial ERP evidence. Ann N Y Acad Sci. 2009:1169(1):494–498. [DOI] [PubMed] [Google Scholar]
  219. Sammler  D, Koelsch  S, Friederici  AD. Are left fronto-temporal brain areas a prerequisite for normal music-syntactic processing?  Cortex. 2011:47(6):659–673. [DOI] [PubMed] [Google Scholar]
  220. Sammler  D, Koelsch  S, Ball  T, Brandt  A, Grigutsch  M, Huppertz  HJ, Wellmer  J, Widman  G, Elger  CE, Friederici  AD, et al.  Co-localizing linguistic and musical syntax with intracranial EEG. NeuroImage. 2013:64:134–146. [DOI] [PubMed] [Google Scholar]
  221. Savage  PE, Loui  P, Tarr  B, Schachner  A, Glowacki  L, Mithen  S, Fitch  WT. Music as a coevolved system for social bonding. Behav Brain Sci. 2021:44(e59):1–22. [DOI] [PubMed] [Google Scholar]
  222. Saxe R, Brett M, Kanwisher N. Divide and conquer: a defense of functional localizers. NeuroImage. 2006:30(4):1088–1096. [DOI] [PubMed] [Google Scholar]
  223. Schmidt  S. Shall we really do it again? The powerful concept of replication is neglected in the social sciences. Rev Gen Psychol. 2009:13(2):90–100. [Google Scholar]
  224. Scott  TL, Gallée  J, Fedorenko  E. A new fun and robust version of an fMRI localizer for the frontotemporal language system. Cogn Neurosci. 2017:8(3):167–176. [DOI] [PubMed] [Google Scholar]
  225. Shain  C*, Blank  I*, van Schijndel  M, Schuler  W, Fedorenko  E. fMRI reveals language-specific predictive coding during naturalistic sentence comprehension. Neuropsychologia. 2020:138:107307. [DOI] [PMC free article] [PubMed] [Google Scholar]
  226. Shain  C, Kean  H, Lipkin  B, Affourtit  J, Siegelman  M, Mollica  F, Fedorenko  E. ‘Constituent length’ effects in fMRI do not provide evidence for abstract syntactic processing. bioRxiv. 2021. [Google Scholar]
  227. Shain  C, Blank  IA, Fedorenko  E, Gibson  E, Schuler  W. Robust effects of working memory demand during naturalistic language comprehension in language-selective cortex. J Neurosci. 2022:42(39):7412–7430. [DOI] [PMC free article] [PubMed] [Google Scholar]
  228. Shashidhara  S, Mitchell  DJ, Erez  Y, Duncan  J. Progressive recruitment of the frontoparietal multiple-demand system with increased task complexity, time pressure, and reward. J Cogn Neurosci. 2019:31(11):1617–1630. [DOI] [PMC free article] [PubMed] [Google Scholar]
  229. Sihvonen  AJ, Särkämö  T, Leo  V, Tervaniemi  M, Altenmüller  E, Soinila  S. Music-based interventions in neurological rehabilitation. Lancet Neurol. 2017:16(8):648–660. [DOI] [PubMed] [Google Scholar]
  230. Slevc  LR, Rosenberg  JC, Patel  AD. Making psycholinguistics musical: self-paced reading time evidence for shared processing of linguistic and musical syntax. Psychon Bull Rev. 2009:16(2):374–381. [DOI] [PMC free article] [PubMed] [Google Scholar]
  231. Slevc  LR, Reitman  J, Okada  B. Syntax in music and language: the role of cognitive control. In: Proceedings of the Annual Meeting of the Cognitive Science Society; 2013 Jul 31–Aug 3; Berlin, Germany. p. 3414–3419. [Google Scholar]
  232. Slevc  LR, Okada  BM. Processing structure in language and music: a case for shared reliance on cognitive control. Psychon Bull Rev. 2015:22(3):637–652. [DOI] [PubMed] [Google Scholar]
  233. Slevc  LR, Faroqi-Shah  Y, Saxena  S, Okada  BM. Preserved processing of musical structure in a person with agrammatic aphasia. Neurocase. 2016:22(6):505–511. [DOI] [PubMed] [Google Scholar]
  234. Stromswold  K, Caplan  D, Alpert  N, Rauch  S. Localization of syntactic comprehension by positron emission tomography. Brain Lang. 1996:52(3):452–473. [DOI] [PubMed] [Google Scholar]
  235. Sueoka  Y, Paunov  A, Ivanova  A, Blank  IA, Fedorenko  E. The language network reliably ‘tracks’ naturalistic meaningful non-verbal stimuli. bioRxiv. 2022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  236. Sullivan  GM, Feinn  R. Using effect size—or why the P value is not enough. J Grad Med Educ. 2012:4(3):279–282. [DOI] [PMC free article] [PubMed] [Google Scholar]
  237. Sundberg  J, Lindblom  B. Generative theories in language and music descriptions. Cognition. 1976:4(1):99–122. [Google Scholar]
  238. Swain  JP. The concept of musical syntax. Music Q. 1995:79(2):281–308. [Google Scholar]
  239. Tahmasebi  AM, Davis  MH, Wild  CJ, Rodd  JM, Hakyemez  H, Abolmaesumi  P, Johnsrude  IS. Is the link between anatomical structure and function equally strong at all cognitive levels of processing?  Cereb Cortex. 2012:22(7):1593–1603. [DOI] [PubMed] [Google Scholar]
  240. Tallal P, Gaab N. Dynamic auditory processing, musical experience and language development. Trends Neurosci. 2006:29(7):382–390. [DOI] [PubMed] [Google Scholar]
  241. Tarantola  A. Inverse problem theory and methods for model parameter estimation. Philadelphia (PA): Society for Industrial and Applied Mathematics; 2005. [Google Scholar]
  242. Temperley  D. Music and language. Annu Rev Linguist. 2022:8:153–170. [Google Scholar]
  243. te  Rietmolen  NA, Mercier  M, Trebuchon  A, Morillon  B, Schon  D. Speech and music recruit frequency-specific distributed and overlapping cortical networks. bioRxiv. 2022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  244. Tillmann  B, Janata  P, Bharucha  JJ. Activation of the inferior frontal cortex in musical priming. Cogn Brain Res. 2003:16(2):145–161. [DOI] [PubMed] [Google Scholar]
  245. Tillmann  B, Koelsch  S, Escoffier  N, Bigand  E, Lalitte  P, Friederici  AD, von  Cramon  DY. Cognitive priming in sung and instrumental music: activation of inferior frontal cortex. NeuroImage. 2006:31(4):1771–1782. [DOI] [PubMed] [Google Scholar]
  246. Tillmann  B. Music and language perception: expectations, structural integration, and cognitive sequencing. Top Cogn Sci. 2012:4(4):568–584. [DOI] [PubMed] [Google Scholar]
  247. Trehub  SE. The developmental origins of musicality. Nat Neurosci. 2003:6(7):669–673. [DOI] [PubMed] [Google Scholar]
  248. Tyler  LK, Marslen-Wilson  WD, Randall  B, Wright  P, Devereux  BJ, Zhuang  J, Papoutsi  M, Stamatakis  EA. Left inferior frontal cortex and syntax: function, structure and behaviour in patients with left hemisphere damage. Brain. 2011:134(2):415–431. [DOI] [PMC free article] [PubMed] [Google Scholar]
  249. Van de Cavey  J, Hartsuiker  RJ. Is there a domain-general cognitive structuring system? Evidence from structural priming across music, math, action descriptions, and language. Cognition. 2016:146:172–184. [DOI] [PubMed] [Google Scholar]
  250. Varley  R, Siegal  M. Evidence for cognition without grammar from causal reasoning and “theory of mind” in an agrammatic aphasic patient. Curr Biol. 2000:10(12):723–726. [DOI] [PubMed] [Google Scholar]
  251. Varley  RA, Klessinger  NJ, Romanowski  CA, Siegal  M. Agrammatic but numerate. Proc Natl Acad Sci U S A. 2005:102(9):3519–3524. [DOI] [PMC free article] [PubMed] [Google Scholar]
  252. Vázquez-Rodríguez  B, Suárez  LE, Markello  RD, Shafiei  G, Paquola  C, Hagmann  P, van den  Heuvel  MP, Bernhardt  BC, Spreng  RN, Misic  B. Gradients of structure–function tethering across neocortex. Proc Natl Acad Sci U S A. 2019:116(42):21219–21227. [DOI] [PMC free article] [PubMed] [Google Scholar]
  253. Vuust  P, Heggli  OA, Friston  KJ, Kringelbach  ML. Music in the brain. Nat Rev Neurosci. 2022:23(5):287–305. [DOI] [PubMed] [Google Scholar]
  254. Wehbe  L, Blank  I, Shain  C, Futrell  R, Levy  R, Malsburg  T, Smith  N, Gibson  E, Fedorenko  E. Incremental language comprehension difficulty predicts activity in the language network but not the multiple demand network. Cereb Cortex. 2021:31(9):4006–4023. [DOI] [PMC free article] [PubMed] [Google Scholar]
  255. Westfall  J, Kenny  DA, Judd  CM. Statistical power and optimal design in experiments in which samples of participants respond to samples of stimuli. J Exp Psychol Gen. 2014:143(5):2020–2045. [DOI] [PubMed] [Google Scholar]
  256. Willems  RM, Van der Haegen  L, Fisher  SE, Francks  C. On the other hand: including left-handers in cognitive neuroscience and neurogenetics. Nat Rev Neurosci. 2014:15(3):193–201. [DOI] [PubMed] [Google Scholar]
  257. Wilson  SM, Saygın  AP. Grammaticality judgment in aphasia: deficits are not specific to syntactic structures, aphasic syndromes, or lesion sites. J Cogn Neurosci. 2004:16(2):238–252. [DOI] [PubMed] [Google Scholar]
  258. Wilson  SM, Galantucci  S, Tartaglia  MC, Gorno-Tempini  ML. The neural basis of syntactic deficits in primary progressive aphasia. Brain Lang. 2012:122(3):190–198. [DOI] [PMC free article] [PubMed] [Google Scholar]
  259. Wong  PC, Perrachione  TK. Learning pitch patterns in lexical identification by native English-speaking adults. Appl Psycholinguist. 2007:28(4):565–585. [Google Scholar]
  260. Wong PC, Skoe E, Russo NM, Dees T, Kraus N. Musical experience shapes human brainstem encoding of linguistic pitch patterns. Nat Neurosci. 2007:10(4):420–422. [DOI] [PMC free article] [PubMed] [Google Scholar]
  261. Woolgar  A, Duncan  J, Manes  F, Fedorenko  E. Fluid intelligence is supported by the multiple-demand system not the language system. Nat Hum Behav. 2018:2(3):200–204. [DOI] [PMC free article] [PubMed] [Google Scholar]
  262. Zatorre  RJ. Musical perception and cerebral function: a critical review. Music Percept. 1984:2(2):196–221. [Google Scholar]

Associated Data


Supplementary Materials

Chen_et_al_SI_bhad087

Data Availability Statement

The data sets generated during and/or analyzed during the current study are available in the OSF repository: https://osf.io/68y7c/.

