Author manuscript; available in PMC: 2019 Aug 1.
Published in final edited form as: Hum Brain Mapp. 2018 Apr 17;39(8):3285–3307. doi: 10.1002/hbm.24077

An adaptive semantic matching paradigm for reliable and valid language mapping in individuals with aphasia

Stephen M Wilson 1,*, Melodie Yen 1, Dana K Eriksson 2

Abstract

Research on neuroplasticity in recovery from aphasia depends on the ability to identify language areas of the brain in individuals with aphasia. However, tasks commonly used to engage language processing in people with aphasia, such as narrative comprehension and picture naming, are limited in terms of reliability (test-retest reproducibility) and validity (identification of language regions, and not other regions). On the other hand, paradigms such as semantic decision that are effective in identifying language regions in people without aphasia can be prohibitively challenging for people with aphasia. This paper describes a new semantic matching paradigm that uses an adaptive staircase procedure to present individuals with stimuli that are challenging yet within their competence, so that language processing can be fully engaged in people with and without language impairments. The feasibility, reliability and validity of the adaptive semantic matching paradigm were investigated in sixteen individuals with chronic post-stroke aphasia and fourteen neurologically normal participants, in comparison to narrative comprehension and picture naming paradigms. All participants succeeded in learning and performing the semantic paradigm. Test-retest reproducibility of the semantic paradigm in people with aphasia was good (Dice coefficient = 0.66), and was superior to the other two paradigms. The semantic paradigm revealed known features of typical language organization (lateralization; frontal and temporal regions) more consistently in neurologically normal individuals than the other two paradigms, constituting evidence for validity. In sum, the adaptive semantic matching paradigm is a feasible, reliable and valid method for mapping language regions in people with aphasia.

Introduction

Damage to brain regions involved in language processing typically results in aphasia. However, language function can improve over time, either spontaneously (Kertesz and McCabe, 1977; Swinburn, Porter, & Howard, 2004), or potentially in response to behavioral (Brady, Kelly, Godwin, Enderby, & Campbell, 2016), neuromodulatory (Shah, Szaflarski, Allendorfer, & Hamilton, 2013) or pharmacological (Berthier & Davila, 2014) interventions. Recovery from aphasia after damage to language regions of the brain is thought to depend on neural plasticity, that is, functional reorganization of surviving brain regions such that they take on new or expanded roles in language processing (Geranmayeh, Brownsett, & Wise, 2014; Heiss & Thiel, 2006; Nadeau, 2014; Price & Crinion, 2005; Saur et al., 2006; Saur & Hartwigsen, 2012; Turkeltaub, Messing, Norise, & Hamilton, 2011). There is currently a great deal of interest in characterizing the nature of this putative process of functional reorganization. A better understanding of when and how reorganization takes place, how different patterns of reorganization depend on patient-specific factors, and how different patterns are associated with better or worse outcomes, could inform the design of new therapies, and could facilitate optimal targeting of specific interventions to individual patients (Fridriksson, Richardson, Fillmore, & Cai, 2012; Thompson & den Ouden, 2008).

This line of research critically depends on being able to identify brain regions involved in language processing in individual patients, and being able to determine with statistical rigor whether they change over time (Kiran et al., 2013; Meinzer et al., 2013; Wilson, Bautista, Yen, Lauderdale, & Eriksson, 2017). Language areas of the brain can be identified with functional magnetic resonance imaging (fMRI) using language mapping paradigms, which generally contrast conditions that involve language processing to conditions that do not (Binder, Swanson, Hammeke, & Sabsevitz, 2008). To support research on functional reorganization of language regions in recovery from aphasia, a language mapping paradigm needs to meet at least three criteria.

First, it must be feasible and appropriate for individuals with aphasia. The most common clinical application of language mapping is in presurgical contexts, in which the aim is to avoid resecting eloquent cortex. These patients usually do not have significant language impairments, so they can readily perform a range of language tasks (Binder et al., 2008). In contrast, individuals with aphasia are impaired in language processing, which implies that they will likely experience difficulty with language tasks, and depending on the task, may not be able to perform it at all. It is difficult to interpret activation maps associated with failure to perform a task (Price, Crinion, & Friston, 2006), presenting a challenge: how can language processing be engaged in a controlled manner in people whose language function is by definition compromised?

Second, the language mapping paradigm must be reliable. In other words, a map of language regions obtained in a given participant on a given occasion should be reproducible in the same participant on a different occasion (Bennett & Miller, 2010). This is referred to as test-retest reproducibility. Research on neuroplasticity requires being able to distinguish genuine changes from scan-to-scan variability (Kiran et al., 2013; Meinzer et al., 2013; Wilson et al., 2017).

Third, the language mapping paradigm must be valid, that is, it must identify all and only the regions that are actually critical for language, as opposed to perceptual, motor, cognitive, and executive regions that may be recruited by different tasks. This implied dichotomy between language and non-language regions is an oversimplification, given that domain-general regions are also needed to support language processing (Fedorenko & Thompson-Schill, 2014), but in most neurologically normal individuals, there are core frontal and temporal language regions, which are lateralized to the left hemisphere (Knecht et al., 2003; Seghier, Kherif, Josse, & Price, 2011; Springer et al., 1999; Tzourio-Mazoyer et al., 2010; Bradshaw, Thompson, Wilson, Bishop, & Woodhead, 2017). In neurological populations, other patterns of organization may be observed (Berl et al., 2014).

We will show with reference to three commonly used paradigms that these three criteria (feasibility, reliability and validity) have rarely if ever been simultaneously met in research to date. Narrative comprehension paradigms have often been used in aphasia recovery research (Crinion, Warburton, Lambon Ralph, Howard, & Wise, 2006; Crinion & Price, 2005; Warren, Crinion, Lambon Ralph, & Wise, 2009). Narrative comprehension is feasible for most individuals with aphasia, since it requires no responses, and comprehension and/or recall can be quantified in post-scan testing. Acoustically matched control conditions, such as backwards speech, have typically been used. Probably due to the relatively unconstrained nature of both the language and the control conditions, the reliability of narrative comprehension paradigms has been empirically demonstrated to be moderate at best (Harrington, Buonocore, & Farias, 2006; Maldjian, Laurienti, Driskill, & Burdette, 2002; Wilson et al., 2017). The validity of narrative comprehension for mapping the language network is also marginal. While activations are somewhat left-lateralized (Harrington et al., 2006; Maldjian et al., 2002; Wilson et al., 2017), there is a substantial right hemisphere component in neurologically normal individuals (Crinion et al., 2003), so that right hemisphere activation in patients cannot be interpreted as reflecting reorganization (Crinion & Price, 2005). Moreover, only temporal lobe regions are activated with good sensitivity (Crinion, Lambon Ralph, Warburton, Howard, & Wise, 2003; Wilson et al., 2017), so frontal regions important for language processing are not amenable to study.

Picture naming is another simple and feasible task that has been widely used (Abel, Weiller, Huber, Willmes, & Specht, 2015; Fridriksson, Baker, & Moser, 2009; Postman-Caucheteux et al., 2010). Anomia is ubiquitous in aphasia, and correct and incorrect items can be separated and compared (Fridriksson et al., 2009; Postman-Caucheteux et al., 2010). Scrambled pictures have often been used as control stimuli. Picture naming paradigms are moderately reliable, but the reproducible activations tend to be in bilateral sensorimotor areas which are uninformative with respect to language localization (Harrington et al., 2006; Jansen et al., 2006; Meltzer, Postman-Caucheteux, McArdle, & Braun, 2009; Rau et al., 2007; Rutten, Ramsey, van Rijen, & van Veelen, 2002; Wilson et al., 2017). Validity is significantly limited: picture naming paradigms show only modest lateralization, and often do not activate frontal and/or temporal sites (Harrington et al., 2006; Jansen et al., 2006; Rau et al., 2007; Rutten et al., 2002; Wilson et al., 2017).

Semantic decision paradigms are widely used clinically in presurgical language mapping with non-aphasic patients. For example, Binder et al. (1997) described a paradigm that has been used in many subsequent studies (Binder et al., 2008). In the language condition, participants hear a series of animal names and have to decide if each animal is found in the United States and used by humans. In the control condition, participants make perceptual decisions about sequences of high and low tones (deciding whether exactly two high tones occurred). Semantic decision paradigms show good test-retest reproducibility (Fernández et al., 2003; Fesl et al., 2010), probably due to the highly constrained processing involved in both the language and control tasks. Moreover, the deep, active, challenging language processing engendered by semantic tasks reliably activates strongly lateralized frontal and temporal language regions in people without language deficits, providing evidence for validity (Binder et al., 1997; 2008; Fesl et al., 2010; Janecek et al., 2013; Springer et al., 1999; Szaflarski et al., 2008).

However, despite the good reliability and validity of semantic decision paradigms, the feasibility of these tasks in individuals with aphasia is questionable. Variants of the Binder task have been used in aphasia recovery research (Eaton et al., 2008; Griffis et al., 2017; Kim, Karunanayaka, Privitera, Holland, & Szaflarski, 2011; Szaflarski, Allendorfer, Banks, Vannest, & Holland, 2013). Not surprisingly, given the complexity of the tasks, patient performance has been very poor. For instance, in one study, individuals with unresolved aphasia post middle cerebral artery stroke performed at chance not only on the semantic condition (47.6%) but also on the tone decision control condition (52.2%) (Szaflarski et al., 2013). It appears to be prohibitively challenging for many individuals with aphasia to maintain the complex verbal instructions pertaining to each condition, switch between them, apply the criteria to evaluate incoming stimuli, and select between two different response buttons.

In sum, these three language mapping paradigms that have often been used in studies investigating neuroplasticity in recovery from aphasia all have significant limitations in terms of feasibility, validity and/or reliability. This has been a roadblock to progress in understanding functional reorganization of language regions in recovery from aphasia. Various other paradigms have also been used in studies of people with aphasia, including word repetition (Heiss, Kessler, Thiel, Ghaemi, & Karbe, 1999; Weiller et al., 1995), sentence comprehension (Saur et al., 2006), verb generation (Allendorfer, Kissela, Holland, & Szaflarski, 2012; Weiller et al., 1995), and simpler semantic decision paradigms that certain patients are able to perform (Kiran et al., 2015; Robson et al., 2014; Sharp, Scott, & Wise, 2004; van Oers et al., 2010; Zahn et al., 2004). These paradigms vary widely in terms of their feasibility (Price et al., 2006), reliability (Wilson et al., 2017) and validity (Bradshaw et al., 2017), but to our knowledge, no studies to date have evaluated all three of these psychometric properties of the paradigms they have used.

To address this limitation, we developed a paradigm that builds on the strong reliability and validity of semantic decision paradigms, yet is modified so as to be suitable for individuals with aphasia. Specifically, we developed an adaptive semantic matching paradigm in which a conceptually simple task is melded with an adaptive staircase procedure (Leek, 2001) that tailors item difficulty to individual performance, so that the same paradigm can be used to map language regions in people with different degrees of language impairment, as well as in individuals with normal language function. The adaptive nature of the paradigm ensures that, regardless of their level of language function, all participants perform a focused, challenging task that is highly constrained in terms of the linguistic and other processing required.

We investigated the feasibility, reliability and validity of this adaptive semantic paradigm, in comparison to narrative comprehension and picture naming paradigms. Sixteen individuals with chronic and stable post-stroke aphasia were each scanned on two separate occasions, and fourteen neurologically normal individuals were each scanned once. During each scanning session, all participants performed the three language mapping paradigms. Structural imaging data were also acquired, and language deficits were quantified. Feasibility was evaluated in terms of the ability of individuals with aphasia to learn the tasks and perform them in the scanner. Reliability was assessed in individuals with aphasia by quantifying the similarity of the activation maps obtained in the two sessions using the Dice coefficient of similarity. Validity was evaluated primarily in the neurologically normal individuals, in whom there are strong a priori expectations that lateralized frontal and temporal language regions should be activated; validity in individuals with aphasia was also examined in an exploratory manner. Finally, we examined whether our findings were impacted by several analysis parameters: region of interest, absolute or relative voxelwise thresholds, and cluster extent threshold.

Methods

Adaptive semantic matching paradigm

The adaptive semantic matching paradigm comprises two conditions: a semantic matching task, and a perceptual matching task. The tasks are presented in alternating 20-s blocks in a simple AB block design. There are 10 blocks per condition, for a total scan time of 400 s (6:40). Each block contains between 4 and 10 items (inter-trial interval 5–2 s), depending on the level of difficulty.

In the semantic condition, each item consists of a pair of words, which are presented one above the other in the center of the screen (Figure 1A). Half of the pairs are semantically related (e.g. boy-girl, lizard-snake, grass-lawnmower), while the other half are not related (e.g. walnut-bicycle). The participant presses a single button with a finger of their left hand if they decide that the words are semantically related. If the words are not related, they do nothing.

Figure 1.

Methodological details. (A) Example semantic item. This item is a match, and is shown surrounded by a box that appears when the ‘match’ button is pressed (the box confirms the button press, but no information on accuracy is provided). (B) Example perceptual item. This item is a mismatch, so the button should not be pressed. (C) An illustration of how the Dice coefficient of similarity captures the extent of overlap between two thresholded images. (D) Regions of interest in a representative participant and projected onto the lateral surfaces of a template brain. ROI 1 (Brain) encompassed the whole brain. ROI 2 (Supra) encompassed regions shown in red, green or blue. ROI 3 (Lang+) corresponded to regions shown in red or green. ROI 4 (Lang) is shown in red.

In the perceptual condition, each item comprises a pair of false font strings, presented one above the other (Figure 1B). Half of the pairs are identical (e.g. ΔΘδЂϞ-ΔΘδЂϞ), while the other half are not identical (e.g. ΔΘδЂϞ-ϞΔƕƘΔ). The participant presses the button if the strings are identical, and does nothing if they differ.

The semantic and perceptual tasks are equivalent in terms of sensorimotor, executive and decision-making components, yet make differential demands on language processing. Task-switching demands are minimized because both conditions involve an essentially similar task: pressing a button in response to matching pairs. The use of just a single button obviates the need to learn an arbitrary association between ‘match’ and one button, and ‘mismatch’ and another button. The left hand is used for the button press because many individuals with post-stroke aphasia have right-sided hemiparesis.

Critically, both the semantic task and the perceptual task are independently adaptive to participant performance. Each task has seven levels of difficulty. Whenever the participant makes two successive correct responses on a given condition, they move to the next highest level of difficulty on the subsequent trial of that condition. Whenever they make an incorrect response, they move two levels down on the next trial. This is a 2-down-1-up adaptive staircase with weighted step sizes (up twice as large as down), which theoretically should converge at just over 80% accuracy (García-Pérez, 1998). Note that the difficulty level is manipulated independently for the semantic and perceptual conditions, even though sets of items from the two conditions are interleaved due to the AB block design.
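As a concrete illustration, the staircase update can be written in a few lines of MATLAB (the language of the released AdaptiveLanguageMapping program); this is a minimal sketch with illustrative names, not the distributed code, and it assumes levels range from 1 (easiest) to 7 (hardest).

    function [level, numCorrect] = update_staircase(level, numCorrect, correct)
    % One step of the adaptive staircase described above: up one level of
    % difficulty after two successive correct responses, down two levels after
    % any incorrect response. Applied independently to each condition.
    if correct
        numCorrect = numCorrect + 1;
        if numCorrect == 2
            level = min(level + 1, 7);   % two in a row correct: harder
            numCorrect = 0;
        end
    else
        level = max(level - 2, 1);       % any error: two levels easier
        numCorrect = 0;
    end
    end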

Similar contrasts between semantic matching and perceptual matching tasks, without any adaptive component, have been used in several previous studies. Most similar to the present study, Seghier et al. (2004) contrasted a semantic category matching task with a false font string matching perceptual task. Several other studies have contrasted synonym judgments with match-mismatch judgments on letter strings (e.g. Fernández et al., 2001; Gitelman, Nobre, Sonty, Parrish, & Mesulam, 2005).

Manipulation of item difficulty

Two versions of the experiment were constructed, differing in terms of how item difficulty was manipulated. Most of the data reported in this paper were acquired using the first version. However, some ceiling effects were observed in neurologically normal individuals, so a revised version was constructed, whose use we advocate in future studies. Here both versions are described. Example items from both conditions at each level are shown for the original version in Table 1 and for the revised version in Table 2.

Table 1.

Original version of the adaptive semantic matching paradigm

Level | Frequency | Concreteness | Forward strength | Backward strength | Match example | Mismatch example | Perceptual match | Perceptual mismatch
1 | 7.38 ± 1.01 | 601 ± 52 | 0.64 ± 0.11 | 0.29 ± 0.27 | girl-boy | king-mom | ΘƟδ / ΘƟδ | ƩʖƩ / ƘΞΘƩΘΓ
2 | 6.51 ± 1.49 | 547 ± 94 | 0.45 ± 0.12 | 0.16 ± 0.17 | nest-bird | fuel-tree | ΔδΦƱ / ΔδΦƱ | ЖƋƩʖδƟ / ƧΘƟƧ
3 | 6.13 ± 1.36 | 548 ± 100 | 0.32 ± 0.09 | 0.10 ± 0.13 | calendar-date | orange-wet | ʖƱʖδΔ / ʖƱʖδΔ | ƱΨЖδƜΦ / δŒΦƘƧ
4 | 5.67 ± 1.28 | 530 ± 108 | 0.26 ± 0.07 | 0.07 ± 0.10 | onion-cry | shiny-ballet | ΘƟδΓΓ / ΘƟδΓΓ | ƋƩʖδƟ / ΘƧΘƟƧ
5 | 5.32 ± 1.29 | 503 ± 112 | 0.21 ± 0.05 | 0.05 ± 0.07 | organize-neat | limb-owe | ƋΔδΦƱ / ƋΔδΦƱ | ϞΓʖϞΨŒ / ƧΓʖϞŒʖ
6 | 4.52 ± 1.33 | 467 ± 116 | 0.17 ± 0.05 | 0.04 ± 0.06 | magnify-enlarge | bristle-sour | ƜϞΞΞΦʖ / ƜϞΞΞΦʖ | ΨƜΘЖƘΞ / ΨƘΘŒƘΞ
7 | 3.42 ± 1.33 | 391 ± 108 | 0.10 ± 0.05 | 0.02 ± 0.04 | menthol-eucalyptus | caress-astonish | ЖϞʖΞʖƕ / ЖϞʖΞʖƕ | ΔƟЖƏŒ / ΔƟΔƏŒ
Table 2.

Revised version of the adaptive semantic matching paradigm

Level | Frequency | Concreteness | Length | Age of acquisition | Match example | Mismatch example | Perceptual match | Perceptual mismatch
1 | 7.85 ± 1.20 | 544 ± 91 | 4.16 ± 0.89 | 4.11 ± 0.93 | rat-mouse | read-house | ʖƱʖδΔ / ʖƱʖδΔ | ƱΨЖδƜΦ / δŒΦƘƧ
2 | 6.98 ± 1.39 | 525 ± 99 | 4.68 ± 1.28 | 5.22 ± 1.37 | kiss-love | sea-fork | ΘƟδΓΓ / ΘƟδΓΓ | ƋƩʖδƟ / ΘƧΘƟƧ
3 | 6.17 ± 1.42 | 518 ± 96 | 5.20 ± 1.43 | 6.22 ± 1.60 | salt-vinegar | shark-empty | ƋΔδΦƱ / ƋΔδΦƱ | ϞΓʖϞΨŒ / ƧΓʖϞŒʖ
4 | 5.78 ± 1.64 | 483 ± 104 | 5.80 ± 1.69 | 7.26 ± 1.83 | symbol-ornament | squirrel-selection | ŒΘʖʖƕ / ŒΘʖʖƕ | δΘϞΦΨ / ƧδϞΦΨ
5 | 5.35 ± 1.59 | 462 ± 106 | 6.37 ± 1.96 | 8.45 ± 1.94 | limousine-carriage | fever-honeycomb | δδƱƱƘΦ / δδƱƱƘΦ | δƧϞΞϞ / ϞƧϞΞϞ
6 | 4.88 ± 1.67 | 419 ± 112 | 6.95 ± 2.20 | 9.54 ± 2.02 | arrangement-agreement | ceremony-expulsion | ƜϞΞΞΦʖ / ƜϞΞΞΦʖ | ΨƜΘЖƘΞ / ΨƘΘŒƘΞ
7 | 3.93 ± 1.54 | 367 ± 103 | 8.06 ± 2.22 | 11.46 ± 1.97 | catastrophe-upheaval | intermission-socialist | ЖϞʖΞʖƕ / ЖϞʖΞʖƕ | ΔƟЖƏŒ / ΔƟΔƏŒ

In the original version, the difficulty of semantic items was manipulated by varying four factors: as the level of difficulty increased, words had lower lexical frequency, words were less concrete, pairs of matching words were less closely related to one another, and the presentation rate was faster. Word pairs were extracted from the University of South Florida Free Association Norms (Nelson, McEvoy, & Schreiber, 1998). This is a large dataset of word pairs derived from a procedure in which participants were presented with single words and asked to “provide the first word that came to mind that is meaningfully related or strongly associated to the presented word” (Nelson et al., 1998). A MATLAB program was used to select a total of 1327 pairs that varied systematically according to the first three factors listed above. Lexical frequency was obtained from the American National Corpus (Reppen, Ide, & Suderman, 2005), concreteness norms were obtained from the MRC database (Coltheart, 1981), and degree of relatedness between pairs of words was defined in terms of forward strength (cue-to-target strength) and backward strength (target-to-cue strength) of the free association norms (Nelson et al., 1998).

The paradigm was revised with the main goal of making the difficult levels more difficult, in order to avoid ceiling effects in people without language impairments. In the revised version of the paradigm, the difficulty of semantic items was manipulated by lexical frequency, concreteness, degree of relatedness, and presentation rate, similar to the original version, but also by word length and age of acquisition. The free association norms were not used. Instead, all words with concreteness ratings in the MRC database (approximately 4000) were retrieved. A difficulty metric was computed for each of these words, by summing the z scores for frequency (Reppen et al., 2005), concreteness (Coltheart, 1981), age of acquisition (Kuperman, Stadthagen-Gonzalez, & Brysbaert, 2012), and length (number of letters); frequency and age of acquisition were doubly weighted. The pairwise semantic distances between all of the 4000 words were estimated with snaut (Mandera, Keuleers, & Brysbaert, 2017), a prediction-based model of distributional semantics derived from corpora. For each word, the ten most related words output by snaut were considered as possible match items (these items were not necessarily actually closely related to the target word, given the limitations of the computational model). A matching word was manually selected such that the salience of the semantic relationship was subjectively commensurate with the difficulty metrics of the words involved. For instance, the word rat is frequent, acquired early, concrete, and short, so it was paired with a closely related associate, mouse, to create an easy item. In contrast, elopement is infrequent, acquired late, abstract, and long, so it was paired with the tangentially related consummation (rather than closer possible associates such as marriage) to create a difficult item. A total of 1934 items were created in this way, then ordered by the average difficulty metric of the pair of words.
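A minimal sketch of how such a difficulty metric could be computed is given below; the vectors freq, conc, aoa and len are hypothetical placeholders for the norm values, the double weighting follows the description above, and the sign convention (larger scores meaning harder words) is an assumption rather than a detail specified in the text.

    % Summed z-score difficulty metric for a set of candidate words.
    % freq, conc, aoa and len are column vectors of norm values (placeholders);
    % frequency and age of acquisition are double-weighted as described above.
    z = @(x) (x - mean(x)) ./ std(x);
    difficulty = -2*z(freq) - z(conc) + 2*z(aoa) + z(len);   % higher = harder (assumed signs)
    [~, easiestToHardest] = sort(difficulty);                % order words by difficulty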

For both versions of the experiment, mismatching items were created by shuffling adjacent pairs in the difficulty-ordered list, then manually adjusting any incidentally created matches. In both versions, item difficulty in the perceptual condition was manipulated in two ways: as the level of difficulty increased, mismatching pairs were more similar, and presentation rate was faster. However in the revised version of the experiment, mismatching items were more similar at the lower levels, making the perceptual task more difficult at these levels.

In order to match sensorimotor and executive demands across the semantic and perceptual conditions, it is necessary to yoke presentation rate across conditions. Presentation rate is adjusted at the start of each semantic block and remains fixed for the upcoming semantic block and perceptual block. The ‘ideal’ inter-trial interval for each condition is defined as the block length (20 s) divided by the ideal number of items per block (4 through 10, for difficulty levels 1 through 7). The number of items per block is then selected to be as large as possible without the resulting inter-trial interval falling below the average of the two ‘ideal’ inter-trial intervals.
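The following MATLAB sketch illustrates this yoking rule under the assumptions stated above; the variable names and the example difficulty levels are illustrative, not taken from the released program.

    % Yoking of presentation rate across the semantic and perceptual conditions.
    blockLen   = 20;                              % block length in seconds
    idealItems = 4:10;                            % ideal items per block for levels 1..7
    semLevel   = 5; percLevel = 3;                % example current difficulty levels
    itiSem  = blockLen / idealItems(semLevel);    % 'ideal' ITI for the semantic task
    itiPerc = blockLen / idealItems(percLevel);   % 'ideal' ITI for the perceptual task
    itiMean = (itiSem + itiPerc) / 2;             % average of the two 'ideal' ITIs
    nItems  = floor(blockLen / itiMean);          % as many items as fit without the
                                                  % actual ITI falling below itiMean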

Training

There are three phases of training. In the first phase, which typically takes about five minutes, the examiner explains the tasks to the participant in language that is appropriate for the individual, taking into account the nature and severity of their aphasia, if any. It is recommended that the examiner be experienced in communicating with individuals with aphasia. As the examiner explains the tasks, they present items manually using key presses. Match and mismatch semantic and perceptual items can be presented at any level of difficulty and with any timing appropriate to the situation. The examiner can also press the ‘match’ button to demonstrate, and then have the participant practice pressing it. The examiner can present as many practice items as are necessary for the participant to learn the task.

Once the participant is comfortable responding to individually presented items, the second training phase begins. In this phase, stimuli are delivered continuously in a block design, identical to the functional imaging experiment except that the presentation rate is not yoked across conditions (because it does not need to be). The participant thus becomes familiar with the structure and presentation rate of the actual experiment. The researcher can stop this training phase when the participant is familiar with the paradigm. For most individuals with aphasia, two or three blocks of each condition should be presented (about two minutes). The third phase of training is the same as the second phase, except that it takes place after the participant has been placed in the scanner, for instance, during acquisition of localizer or structural images. In this way, the participant becomes accustomed to performing the task in the unfamiliar environment of the scanner bore. One to two minutes are generally sufficient, but this third phase of training does not add to overall testing time, since it takes place concurrent with acquisition of structural images.

Technical implementation

The semantic paradigm is implemented in a MATLAB program called AdaptiveLanguageMapping using the Psychophysics Toolbox version 3 (Brainard, 1997; Pelli, 1997), which requires MATLAB R2012a or later. The AdaptiveLanguageMapping program has been tested on Linux, Windows and Mac OS X, and has no dependencies besides the freely available Psychophysics Toolbox. This program can also be used to present the other two paradigms described in this paper (narrative comprehension and picture naming). A manual is provided describing the operation of the program. Log files are generated allowing for analysis of behavioral data. AdaptiveLanguageMapping is freely available for download at http://aphasialab.org/alm.
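Purely for orientation, a heavily simplified single trial written against standard Psychophysics Toolbox calls might look like the sketch below; the window setup, word pair, screen coordinates and timing are all illustrative, and this is not the AdaptiveLanguageMapping code itself (in practice the window is opened once per run and responses are logged).

    % Illustrative single semantic trial using standard Psychtoolbox calls
    % (not the distributed AdaptiveLanguageMapping code).
    win   = Screen('OpenWindow', max(Screen('Screens')), 0);   % black full-screen window
    iti   = 4;                                  % seconds; in practice set by the staircase
    DrawFormattedText(win, 'lizard', 'center', 300, 255);      % top word of the pair
    DrawFormattedText(win, 'snake',  'center', 400, 255);      % bottom word of the pair
    onset = Screen('Flip', win);
    pressed = false;
    while GetSecs - onset < iti
        if KbCheck                              % single 'match' button, any key here
            pressed = true;
        end
    end
    sca;                                        % close the window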

Narrative comprehension paradigm

The narrative comprehension paradigm was also a simple AB block design with two conditions: narrative comprehension, and backwards speech. As in the semantic paradigm, there were 10 blocks per condition, for a total scan time of 400 s (6:40). In the narrative blocks, the opening pages of an audiobook were presented, in one session Who was Albert Einstein? (Brallier, 2002) and in the other session Who were the Beatles? (Edgers, 2006), each of which was written for children aged eight to twelve years and so contains relatively simple language. In the backwards speech blocks, the same segments were played in reverse. To avoid sentences being split across blocks, the blocks varied in length from 13.65 to 26.59 s (mean = 19.63 ± 3.45 s). There was a gap of 0.37 s between blocks.

To aid comprehension by providing a visual reminder of the topic of the narrative, an iconic picture of Albert Einstein or the Beatles was displayed in the center of the screen prior to the start of the run and throughout the run (during both conditions). After the end of the scanning session, individuals with aphasia were asked six true/false questions about the narrative they had heard.

Picture naming paradigm

The picture naming paradigm was a jittered rapid event-related design with two conditions: real pictures and scrambled pictures. An event-related design was used in order to match previous applications of similar paradigms in aphasia, in which the goal has often been to separately model correct and incorrect items. Continuous acquisition rather than sparse sampling was used because the delay of the hemodynamic response ensured that speech-related and neural effects were temporally distinct (Birn, Bandettini, Cox, & Shaker, 1999), and the former were largely accounted for in preprocessing (see below). There were 40 real pictures and 20 scrambled pictures, presented in a total scanning time of 400 s (6:40). Each picture was displayed for 3 seconds. Participants were instructed to name the real pictures, overtly if possible, and to just look at the scrambled pictures. The mean inter-trial interval (from the onset of one item to the onset of the next item) was 6.5 ± 2.2 s (range 4–14 s), leaving adequate time for spoken responses. In between trials, participants were asked to fixate on a central crosshair.

Spoken responses were recorded with a scanner-compatible microphone (FOMRI-III, Optoacoustics, Mazor, Israel). Responses were coded as correct, incorrect, or no response. Reaction times were measured from the presentation time to the onset of the first response, not including any fillers, false starts or fragments.

The stimuli were colorized versions (Rossion & Pourtois, 2004) of the Snodgrass and Vanderwart (1980) pictures. Only items with mono-morphemic targets and name agreement of at least 60% were used. The mean length of the target names was 3.9 ± 0.9 phonemes (range 3–6), the mean log frequency of the targets based on the HAL corpus (Lund & Burgess, 1996) was 8.8 ± 1.5 (range 4.9–13.2), and the mean name agreement was 89.8 ± 9.8% (range 62–100%). The means and distributions of these variables were matched across two versions of the paradigm that were presented in the two sessions.

Participants

Sixteen individuals with chronic post-stroke aphasia were recruited from an aphasia center in Tucson, Arizona, or were prior participants in aphasia research at the University of Arizona. The inclusion criteria were (1) persistent and stable aphasia of any etiology; (2) at least six months post stroke; (3) aged 18 to 90; (4) fluent and literate in English premorbidly. The exclusion criteria were (1) dementia; (2) major psychiatric disorders; (3) serious substance abuse. Of the 16 participants, 15 had experienced left hemisphere strokes, while one (A5) had experienced bilateral strokes, with the right hemisphere stroke being more extensive.

Fourteen neurologically normal participants were recruited mostly from a neighborhood listserv in Tucson, Arizona. They reported no neurological or psychiatric history. The Mini Mental State Examination (Folstein, Folstein, & McHugh, 1975) was administered to each participant, and scores ranged from 27 to 30.

Demographic information is presented in Table 3. All but four participants were native speakers of English. Two individuals with aphasia and two neurologically normal participants were native speakers of Spanish whose primary language was now English and who were fluent in English.

Table 3.

Demographic and language data

Measure | Individuals with aphasia | Neurologically normal
Number of participants | 16 | 14
Age (years) | 60.4 ± 14.8 (32.0–79.0) | 53.1 ± 15.1 (27.0–80.0)
Sex (M/F) | 12/4 | 8/6
Handedness (R/L) | 14/2 | 10/4
Education (years) | 15.1 ± 2.5 (12.0–20.0) | 17.1 ± 1.9 (14.0–20.0)
Days post stroke | 1955 ± 1220 (208–3960) |
Quick aphasia battery | |
Spoken word comprehension | 9.5 ± 1.2 (5.3–10.0) | 10.0 ± 0.0 (10.0–10.0)
Sentence comprehension | 5.0 ± 2.9 (0.7–9.9) | 9.3 ± 1.3 (5.4–10.0)
Word finding | 4.8 ± 2.8 (0.0–9.4) | 9.8 ± 0.5 (8.5–10.0)
Grammatical construction | 5.2 ± 3.4 (0.0–9.7) | 9.8 ± 0.2 (9.5–10.0)
Speech motor programming | 6.6 ± 4.3 (0.0–10.0) | 10.0 ± 0.0 (10.0–10.0)
Repetition | 5.5 ± 2.7 (0.0–9.9) | 9.7 ± 0.4 (8.8–10.0)
Written word comprehension | 9.7 ± 1.0 (5.8–10.0) |
Reading aloud | 5.5 ± 3.3 (0.0–9.9) | 9.6 ± 0.4 (8.8–10.0)
Quick aphasia battery overall | 6.0 ± 2.3 (1.4–9.7) | 9.8 ± 0.3 (9.0–10.0)
Other language measures | |
WAB Aphasia quotient | 67.8 ± 24.5 (18.9–98.8) |
CAT Written word to picture (/30) | 27.4 ± 2.8 (20–30) |
Pyramids and Palm Trees (/14) | 13.9 ± 0.3 (13–14) |

WAB = Western Aphasia Battery; CAT = Comprehensive Aphasia Test.

For the revised version of the adaptive semantic matching paradigm, a second group of 16 neurologically normal participants were recruited from a neighborhood listserv in Nashville, Tennessee (age 57.0 ± 15.0 years (range 23–77 years); 6 male, 10 female; 12 right-handed, 3 left-handed, 1 ambidextrous; education 16.7 ± 2.2 years (range 12–20 years); MMSE range 27–30). These participants did not complete the other two paradigms, and were scanned on a different scanner, so they were not directly compared to the other participants.

All participants gave written informed consent and were modestly compensated for their time. The study was approved by the institutional review boards at the University of Arizona and Vanderbilt University, and all study procedures were performed in accordance with the Declaration of Helsinki.

Language assessments

Individuals with aphasia completed three study sessions. In the first session, the study was explained to them and they provided written informed consent, demographic and medical history information, and were screened for MRI safety. To characterize language deficits, they completed one of three equivalent forms of the Quick Aphasia Battery (QAB; Wilson, Eriksson, Schneck, & Lucanie, 2018), as well as the Western Aphasia Battery (WAB; Kertesz, 1982). Because the adaptive semantic matching task depends on comprehension and semantic processing of written words, patients then completed the written word to picture matching subtest of the Comprehensive Aphasia Test (Swinburn et al., 2004), written word to picture matching using the word comprehension items from another form of the QAB, and a 14-item short version (Breining et al., 2015) of the Pyramids and Palm Trees Test (Howard & Patterson, 1992). In the second and third sessions, the other two forms of the QAB were administered, along with written word to picture matching using QAB items.

Language data are shown in Table 3. Four patients presented with Broca’s aphasia per clinical impression (A1, A2, A3, A4), one patient was non-verbal with good comprehension (A5), five patients had conduction aphasia (A6, A7, A8, A9, A10), one patient was agrammatic in production but with good comprehension, fitting no traditional subtype (A11), four patients had anomic aphasia (A12, A13, A14, A15), and one patient was almost completely recovered (A16). The patients spanned a range of aphasia severity: per WAB criteria, five had severe aphasia, four had moderate aphasia, six had mild aphasia, and one was within normal limits. The patients’ language function is described in more detail elsewhere (Wilson et al., 2018). Written word comprehension was excellent in all but one of the patients, the only exception being the person with the most severe Broca’s aphasia, who comprehended about two thirds of written words. Non-verbal semantic function was intact in all patients.

Neurologically normal individuals completed just one study session, in which their language function was evaluated with a single form of the QAB.

Neuroimaging

Individuals with aphasia were scanned with structural and functional MRI during their second and third study sessions. The first imaging session took place 15.9 ± 3.8 days after the consent/behavioral session, and there were 11.4 ± 6.8 days between the two imaging sessions. Neurologically normal participants were scanned during their only study session.

Prior to entering the scanner, each participant was trained on the three language mapping tasks. MRI data were acquired on a Siemens Skyra 3 Tesla scanner with a 20-channel head coil at the University of Arizona. Visual stimuli were presented on a 24″ MRI-compatible LCD monitor (BOLDscreen, Cambridge Research Systems, Rochester, UK) positioned at the end of the bore, which participants viewed through a mirror mounted to the head coil. Auditory stimuli were presented using insert earphones (S14, Sensimetrics, Malden, MA) padded with foam to attenuate scanner noise and reduce head movement. The presentation volume was adjusted to a comfortable level for each participant.

For each of the three language mapping paradigms, T2*-weighted BOLD echo planar images were collected with the following parameters: 200 volumes + 3 initial volumes discarded; 30 axial slices in interleaved order; slice thickness = 3.5 mm with a 0.9 mm gap; field of view = 240 × 218 mm; matrix = 86 × 78; repetition time (TR) = 2000 ms; echo time (TE) = 30 ms; flip angle = 90°; voxel size = 2.8 × 2.8 × 4.4 mm. The order of the three language mapping paradigms was counterbalanced across participants. High resolution T2-weighted images were acquired coplanar with the functional images in each session, to aid coregistration.

For anatomical reference and lesion delineation, T1-weighted MPRAGE structural images (voxel size = 0.9 × 0.9 × 0.9 mm) were acquired (in the first imaging session in patients). In patients only, to provide more information for lesion delineation, T2-weighted FLAIR images (voxel size = 0.5 × 0.5 × 2.0 mm) were acquired in the second imaging session.

The second group of neurologically normal participants was scanned on a Philips Achieva 3 Tesla scanner with a 32-channel head coil at Vanderbilt University. Visual stimuli were projected onto a screen at the end of the bore, which participants viewed through a mirror mounted to the head coil. T2*-weighted BOLD echo planar images were collected with the following parameters: 200 volumes + 4 initial volumes discarded; 35 axial slices in interleaved order; slice thickness = 3.0 mm with 0.5 mm gap; field of view = 220 × 220 mm; matrix = 96 × 96; repetition time (TR) = 2000 ms; echo time (TE) = 30 ms; flip angle = 75°; SENSE factor = 2; voxel size = 2.3 × 2.3 × 3.5 mm. Coplanar T2-weighted images and T1-weighted structural images were also acquired.

Analysis of neuroimaging data

The functional data were first preprocessed with tools from AFNI (Cox, 1996). Head motion was corrected, with six translation and rotation parameters saved for use as covariates. Next, the data were detrended with a Legendre polynomial of degree 2, and smoothed with a Gaussian kernel (FWHM = 6 mm). Then, independent component analysis (ICA) was performed using the FSL tool melodic (Beckmann & Smith, 2004). Noise components were manually identified with reference to the criteria of Kelly et al. (2010) and removed using fsl_regfilt.

First level models were fit independently for each of the six functional runs (two sessions, three paradigms per session). The adaptive semantic matching and narrative comprehension paradigms were modeled with simple boxcar functions. For the picture naming paradigm, explanatory variables were created for picture items and scrambled items; an additional analysis was performed in which correct and incorrect items were modeled separately.
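For illustration, a boxcar regressor for the semantic condition of the AB block design can be constructed at the TR resolution as follows; the timing values come from the paradigm description above, while the assumption that a run begins with a semantic block is made here only for the sake of the example.

    % Boxcar regressor for the semantic condition: 20-s alternating blocks,
    % TR = 2 s, 200 retained volumes (400 s total).
    TR       = 2;                                  % seconds
    nVols    = 200;
    blockLen = 20;                                 % seconds
    t        = (0:nVols-1)' * TR;                  % acquisition times
    semantic = double(mod(floor(t / blockLen), 2) == 0);   % 1 during semantic blocks (assumed first)
    % This regressor would then be convolved with the HRF and fit to the data
    % together with the motion, white matter/CSF and temporal-trend covariates.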

Task models were convolved with a hemodynamic response function (HRF) based on the difference of two gamma density functions (time to first peak = 5.4 s, FWHM = 5.2 s; time to second peak = 15 s; FWHM = 10 s; coefficient of second gamma density = 0.09), and fit to the data with the program fmrilm from the FMRISTAT package (Worsley et al., 2002). The six motion parameters were included as covariates, as were time-series from white matter and CSF regions (means of voxels segmented as white matter or CSF in the vicinity of the lateral ventricles) to account for nonspecific global fluctuations, and three cubic spline temporal trends.

Lesions were manually demarcated based on T1-weighted and FLAIR images. The T1-weighted anatomical images were warped to MNI space using unified segmentation in SPM5 (Ashburner & Friston, 2005) with cost-function masking of the lesion (Brett, Leff, Rorden, & Ashburner, 2001). Functional images were coregistered with structural images via coplanar T2-weighted structural images using SPM, and warped to MNI space. Functional images were inclusively masked with a gray matter mask obtained by smoothing the segmented gray matter proportion image with a 4 mm FWHM Gaussian kernel, then applying a cutoff of 0.25. Patient images were exclusively masked with the lesion mask. These two steps were performed in order to increase reliability by excluding activations that are likely to be spurious.

To identify brain regions activated by each paradigm when normal language function is intact, a random effects analysis was carried out on the 14 neurologically normal individuals. Each contrast was thresholded at voxelwise p < 0.005, then corrected for multiple comparisons at p < 0.01 based on cluster extent according to Gaussian random field theory as implemented in SPM5 (Worsley et al., 1996). This stringent corrected threshold was chosen because Gaussian random field theory can inflate false positive rates (Eklund, Nichols, & Knutsson, 2016). A group analysis was performed in the same way on the second group of 16 neurologically normal participants who were scanned on the revised version of the semantic paradigm.

Quantification of reliability

Reliability of fMRI paradigms is generally assessed by scanning the same participants two or more times, and then calculating a similarity metric between the activations obtained on each occasion (Bennett & Miller, 2010). Our similarity metric was the Dice coefficient of similarity (Rombouts et al., 1997), which is a measure of the extent of overlap between thresholded activation maps. The Dice coefficient is calculated as 2 × Voverlap / (V1 + V2), where Voverlap is the number of overlapping voxels, V1 is the number of voxels activated in the first scan, and V2 is the number of voxels activated in the second scan (Figure 1C). The advantages of the Dice coefficient are that it is easy to interpret (Bennett & Miller, 2010), is widely used (e.g. Fernández et al., 2003; Fesl et al., 2010; Gross & Binder, 2014; Harrington et al., 2006; Rutten et al., 2002), can be calculated in any individual without reference to a group (Bennett & Miller, 2010), and yields a single metric of overall activation similarity encompassing all brain regions under consideration (Wilson et al., 2017). In these last two respects, the Dice coefficient is more useful than the intraclass correlation coefficient, another metric sometimes used in research on language mapping paradigms (Fernández et al., 2003; Eaton et al., 2008; Meltzer et al., 2009). In this paper, Dice coefficients will be described as poor (< 0.40), fair (0.40–0.60), good (0.60–0.75) or excellent (≥ 0.75), following Cicchetti (1994), since Dice coefficients are conceptually related to the kappa statistic (Zijdenbos, Dawant, Margolin, & Palmer, 1994).
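A sketch of this computation in MATLAB, taking two already-thresholded logical maps restricted to the same region of interest (a minimal illustration, not the analysis code used in the study):

    function d = dice_coefficient(map1, map2)
    % Dice coefficient of similarity between two thresholded (logical) maps,
    % as defined above: 2 * Voverlap / (V1 + V2).
    vOverlap = nnz(map1 & map2);                    % voxels active in both scans
    d        = 2 * vOverlap / (nnz(map1) + nnz(map2));
    end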

Quantification of validity

Validity is the extent to which a language mapping paradigm identifies all and only the regions that are actually critical for language processing. The validity of language mapping paradigms has been investigated in several different ways. Concurrent validity has been examined by comparing fMRI to the Wada test (intracarotid amobarbital test) for language lateralization (Janecek et al., 2013; Woermann et al., 2003), and to electrocortical stimulation mapping for language localization within a hemisphere (Giussani et al., 2010). These invasive approaches are not feasible in our population; moreover, neither is infallible as a ground truth: fMRI has been shown to out-perform the Wada test (Janecek et al., 2013), while stimulation mapping is limited to the exposed surfaces of gyri and fails to identify language areas at all in a significant minority of patients (Sanai, Mirzadeh, & Berger, 2008). Therefore, we took a different approach to quantifying validity.

Specifically, it is firmly established that language is lateralized to the left hemisphere in the vast majority of neurologically normal individuals (Knecht et al., 2003; Seghier et al., 2011; Springer et al., 1999; Tzourio-Mazoyer et al., 2010; Bradshaw et al., 2017). This is why aphasia results from left hemisphere lesions, and not from right hemisphere lesions. Therefore, a valid language mapping paradigm should yield left-lateralized activation maps in the majority of neurologically normal participants. Lateralization indices (LIs) were calculated according to the standard formula: LI = (VLeft – VRight) / (VLeft + VRight), where VLeft is the number of voxels activated in the left hemisphere, and VRight is the number of voxels activated in the right hemisphere. LI ranges from −1 (all activation in the right hemisphere) to +1 (all activation in the left hemisphere). In individuals with aphasia, who were each scanned twice, the LI was averaged across the two sessions. When no voxels were activated in either hemisphere, LI was set to 0.
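The corresponding computation, assuming logical hemisphere masks are available (a sketch with illustrative names, not the analysis code used in the study):

    function li = lateralization_index(map, leftMask, rightMask)
    % Lateralization index as defined above; map, leftMask and rightMask are
    % logical volumes. LI is set to 0 when no voxels are active in either hemisphere.
    vLeft  = nnz(map & leftMask);
    vRight = nnz(map & rightMask);
    if vLeft + vRight == 0
        li = 0;
    else
        li = (vLeft - vRight) / (vLeft + vRight);
    end
    end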

Another known fact about language organization is that frontal and temporal regions in the dominant hemisphere are involved in language processing in the majority of neurologically normal individuals (Knecht et al., 2003; Seghier et al., 2011; Springer et al., 1999; Tzourio-Mazoyer et al., 2010; Bradshaw et al., 2017). Therefore, a second assay of validity was the sensitivity of paradigms to detect expected language activations in these regions. The frontal ROI was defined as the inferior frontal gyrus (AAL regions 11, 13 and 15; Tzourio-Mazoyer et al., 2002), while the temporal ROI was defined as the middle temporal gyrus (85), angular gyrus (65), and the ventral part of the superior temporal gyrus (81); specifically, voxels within 8 mm of the middle temporal gyrus. The extent of activation within these ROIs was compared.
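A sketch of how these ROIs could be assembled from an AAL label image is shown below; the file name, the isotropic voxel size, and the use of MATLAB's niftiread and bwdist functions are assumptions, while the AAL region numbers follow the text.

    % Frontal and temporal ROIs built from an AAL label image in MNI space.
    aal  = niftiread('aal.nii');                       % AAL labels (file name assumed)
    vox  = 2;                                          % isotropic voxel size in mm (assumed)
    frontalROI = ismember(aal, [11 13 15]);            % inferior frontal gyrus (AAL 11/13/15)
    mtg        = (aal == 85);                          % middle temporal gyrus
    stgVentral = (aal == 81) & (bwdist(mtg) * vox <= 8);   % STG voxels within 8 mm of MTG
    temporalROI = mtg | (aal == 65) | stgVentral;      % MTG, angular gyrus, ventral STG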

Validity was assessed primarily in the neurologically normal group, since only in these participants were there clear expectations about the lateralization and localization of language regions. However, in exploratory analyses, lateralization and sensitivity to frontal and temporal language regions were also calculated for individuals with aphasia, even though lateralization may have changed, and frontal and/or temporal language regions may in some cases have been destroyed.

Analysis parameter sets

Calculations of reliability and validity are strongly influenced by analysis parameters such as region of interest (ROI), voxelwise threshold, and cluster extent threshold (Wilke & Lidzba, 2007; Wilson et al., 2017). The impact of these three parameters was systematically explored using different combinations that will be referred to as analysis parameter sets. However, an a priori parameter set was also selected based on previous findings as explained below, for display of individual activation maps and for statistical comparisons between the three paradigms.

Four regions of interest were defined, each of which was symmetrical across the two hemispheres (Figure 1D). The first ROI, labeled Brain, was simply the whole brain, i.e. no mask was applied except for the gray matter and lesion masks described above. LIs were not calculated for the Brain ROI, since the inclusion of the cerebellum would make LI invalid in this case, given that language activations in the cerebellum are usually, but not always, contralateral to cortical activations (Gelinas, Fitzpatrick, Kim, & Bjornson, 2014).

The second ROI, labeled Supra (Figure 1D, red, green or blue), consisted of all supratentorial gray matter regions (AAL regions 1–90), i.e. cortical gray matter, the thalamus and the basal ganglia.

The third ROI, termed Lang+ (Figure 1D, red or green), consisted of a very liberal set of brain regions which are either known language regions or plausible candidate regions for functional reorganization. Included in this ROI were the inferior frontal gyrus (AAL regions 11/13/15 and their right hemisphere counterparts); middle frontal gyrus (7); superior frontal gyrus (3/23); supplementary motor area (19); precentral gyrus (1); postcentral gyrus (57); supramarginal gyrus (63); angular gyrus (65); the remainder of the inferior parietal lobule (61); the superior parietal lobule (59); the ventral part of the superior temporal gyrus (81), specifically voxels within 8 mm of the middle temporal gyrus; the middle temporal gyrus (85); the temporal pole (83/87); the inferior temporal gyrus (89); the fusiform gyrus (55); the parahippocampal gyrus (39); and the hippocampus (37). This ROI would be a good choice for studies of functional reorganization in aphasia, because it can be expected to improve reliability by excluding potentially spurious activations in unlikely loci for functional reorganization. Accordingly, this ROI was selected as the a priori ROI for this study.

The fourth ROI, termed Lang (Figure 1D, red), was narrowly defined as likely language regions and their right hemisphere homotopic counterparts, specifically: the inferior frontal gyrus (11/13/15); the ventral part of the superior temporal gyrus (81) defined as above; the middle temporal gyrus (85); and the angular gyrus (65). This ROI would be a good choice for studies interested primarily in language lateralization within known language regions.

Images of t statistics were thresholded at 14 different voxelwise thresholds. Seven of these were absolute: p < 0.1, p < 0.05, p < 0.01, p < 0.005, p < 0.001, p < 0.0005, and p < 0.0001. The other seven thresholds were relative: the top 10%, 7.5%, 5%, 4%, 3%, 2% or 1% of most highly activated voxels were considered active, the denominator being the total number of voxels included in the Brain ROI (so that the total extent of activation would be consistent across ROIs). Note that relative thresholds of 10%, 7.5% and 5% were not calculated for the Lang ROI, since the Lang ROI included only 13.7 ± 1.1% of the voxels in the Brain ROI, so these more liberal relative thresholds would have entailed over 30% of voxels in the ROI being considered active. Approaches involving relative thresholds often improve reliability (Gross & Binder, 2014; Knecht et al., 2003; Voyvodic, 2012; Wilson et al., 2017), therefore the a priori threshold was a relative threshold, specifically top 5%, similar to Gross and Binder (2014).
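For example, the a priori relative threshold (top 5% of voxels) could be applied as in the following sketch, where tmap, brainMask and roiMask are hypothetical variables holding the t statistic volume, the Brain ROI mask, and the chosen ROI mask.

    % Relative (top-5%) voxelwise threshold: the cutoff is computed over the
    % whole-brain t values so that the total extent of activation is comparable
    % across ROIs, then applied within the chosen ROI.
    cutoff = prctile(tmap(brainMask), 95);    % 95th percentile = top 5% of voxels
    active = (tmap > cutoff) & roiMask;       % thresholded activation map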

Four cluster extent thresholds were applied: none, 1 cm3, 2 cm3, and 4 cm3. The a priori cluster extent threshold was 2 cm3, selected based on previous findings for a narrative comprehension paradigm (Wilson et al., 2017).
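A sketch of applying the a priori 2 cm^3 cluster extent threshold to an already-thresholded logical map (active, as in the previous sketch); the resampled voxel dimensions are not specified above, so the values used here are an assumption.

    % Remove clusters smaller than 2 cm^3 from a thresholded map.
    voxdim = [2 2 2];                            % mm, analysis grid (assumed)
    minVox = ceil(2000 / prod(voxdim));          % 2 cm^3 = 2000 mm^3
    cc = bwconncomp(active, 26);                 % 26-connected clusters in 3-D
    for k = 1:cc.NumObjects
        if numel(cc.PixelIdxList{k}) < minVox
            active(cc.PixelIdxList{k}) = false;  % discard clusters below the extent threshold
        end
    end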

Results

Behavioral data

For the adaptive semantic matching paradigm, overall accuracy was plotted as a function of condition (semantic, perceptual) in individuals with aphasia and neurologically normal participants (Figure 2). A mixed effects ANOVA revealed a significant interaction of group by condition (F(1, 28) = 7.84; p = 0.0091), such that patients performed with equivalent accuracy on the two conditions (semantic: 84.1 ± 7.9 (SD)%; perceptual: 84.4 ± 3.0%; |t(15)| = 0.13, p = 0.90), while neurologically normal participants were more accurate on the semantic condition (94.7 ± 4.5%) than the perceptual condition (86.7 ± 3.7%; |t(13)| = 4.91; p = 0.0003). The better performance of the neurologically normal group on the semantic condition was presumably due to the items not being difficult enough even at the most difficult level. To address this limitation, the semantic paradigm was subsequently revised; data from the revised paradigm are presented below in the subsection entitled ‘Revised adaptive semantic stimuli’.

Figure 2.

Behavioral data for the adaptive semantic matching paradigm. Accuracy, item difficulty, and reaction time on the semantic and perceptual control conditions in individuals with aphasia and neurologically normal participants. Perc = Perceptual.

The mean difficulty level of items presented was plotted as a function of condition in individuals with aphasia and neurologically normal participants (Figure 2). A mixed effects ANOVA revealed a significant interaction of group by condition (F(1, 28) = 4.46; p = 0.044), such that for patients, mean item difficulty level did not differ between the semantic condition (4.64 ± 1.27) and the perceptual condition (5.04 ± 0.65; |t(15)| = 1.11; p = 0.29), while neurologically normal participants performed at a higher level of difficulty on the semantic condition (6.16 ± 0.61) than on the perceptual condition (5.66 ± 0.47; |t(13)| = 2.61; p = 0.022). This significant interaction was expected due to language deficits in individuals with aphasia, but note that the precise pattern of differences is less important, because while both tasks have seven levels of difficulty, the degree of difficulty of the two tasks is not inherently matched at each level.

Reaction times on correct items were plotted as a function of condition in individuals with aphasia and neurologically normal participants (Figure 2). There was no significant interaction of group by condition (F(1, 28) = 1.12; p = 0.30). There was a main effect of group, such that patients responded slower than neurologically normal participants (F(1, 28) = 24.23; p < 0.0001), and a main effect of condition, such that reaction times were faster in the semantic condition (F(1, 28) = 15.22; p = 0.0005). While it would be preferable for reaction times to be equivalent, it is less problematic for responses to be faster in the condition of interest than the control condition, as opposed to vice versa, which could result in areas modulated by time on task being misidentified as language areas (Binder, Medler, Desai, Conant, & Liebenthal, 2005).

For the narrative comprehension paradigm, comprehension was assessed after each scanning session with six true/false questions. Individuals with aphasia answered these questions with a mean accuracy of 90.6 ± 11.7% (range 66.7–100%), indicating that narratives were generally attended and comprehended, albeit imperfectly in some cases. Neurologically normal participants were not asked the questions, but all reported having heard and attended to the narratives.

For the picture naming paradigm, individuals with aphasia responded correctly to 65.2 ± 33.7% of items (range 0–100%), provided incorrect responses to 21.3 ± 24.8% of items (range 0–80.0%), and did not respond to 13.5 ± 24.1% of items (range 0–100%). The mean reaction time on correct items was 1504 ± 250 ms (range 1045–2074 ms). There was one non-verbal participant who did not provide any overt responses to any item, and another participant who provided only one correct response across the two sessions. To avoid excluding these two individuals, we report functional imaging analyses carried out on all items, not just correct items. Analyses based on correct items only were also performed (excluding these two participants) and yielded essentially similar activation patterns and estimates of reliability and validity (data not shown).

Neurologically normal participants were more accurate (98.8 ± 1.3%; range 97.5–100%) than individuals with aphasia (|t(15.05)| = 3.99; p = 0.0012) on the picture naming task, and responded more quickly (1153 ± 155 ms; range 862–1379 ms; |t(21.71)| = 4.46; p = 0.0002).

Language activation maps

To identify the brain regions activated by each paradigm in the absence of any damage to language networks, random effects group analyses were carried out for each paradigm in the group of 14 neurologically normal participants.

For the adaptive semantic matching paradigm, the semantic condition was contrasted to the perceptual condition (Figure 3A; Table 4). This contrast activated the left inferior frontal gyrus (pars opercularis, triangularis and orbitalis) extending to the precentral sulcus and posterior middle frontal gyrus, and also into the temporal pole; the left posterior superior temporal sulcus and middle temporal gyrus, extending into the angular gyrus, anteriorly along the superior temporal sulcus, and ventrally into the inferior temporal gyrus and fusiform gyrus; the anterior hippocampi bilaterally; and the right cerebellum.

Figure 3.

Language activation maps derived from the adaptive semantic matching paradigm. (A) Group analysis in 14 neurologically normal participants. Whole brain activations were thresholded at voxelwise p < 0.005 then corrected for multiple comparisons at p < 0.01 based on cluster extent. (B) Activation maps in 16 individuals with aphasia at two time points each. The patients are arranged in groups according to clinical impression, then in ascending order of overall QAB score within each group. See Participants section, and for more detailed language data, see Figure 14 of Wilson et al. (2018), which is laid out the same way. Voxels with the highest 5% of t statistics were plotted, subject to a minimum cluster volume of 2 cm3, in an ROI comprising known language regions or plausible candidate regions for functional reorganization (Lang+ ROI); note that the cerebellum was not included (unlike panel A). Inset axial slices show lesion reconstructions. T1 = first imaging session; T2 = second imaging session; Dice = Dice coefficient of similarity; LI = lateralization index.

Table 4.

Brain regions activated by the adaptive semantic matching paradigm in neurologically normal participants

Brain region(s) | MNI x | MNI y | MNI z | Extent (mm3) | Max t | p
Left inferior frontal gyrus (pars opercularis, triangularis and orbitalis); precentral sulcus; posterior middle frontal gyrus; temporal pole | −45.4 | 25.5 | 8.5 | 31784 | 10.28 | < 0.001
Left superior temporal sulcus; middle temporal gyrus; angular gyrus; inferior temporal gyrus; fusiform gyrus; hippocampus | −42.6 | −28.2 | −6.4 | 27000 | 8.53 | < 0.001
Right cerebellum | 25.3 | −76 | −32.8 | 13104 | 11.48 | < 0.001
Left superior frontal gyrus | −6.7 | 47.7 | 42.6 | 1976 | 7.40 | 0.002
Right hippocampus | 24.3 | −9.5 | −16.7 | 1904 | 7.01 | 0.002

MNI coordinates show centers of mass.

For the narrative comprehension paradigm, the narrative comprehension condition was contrasted to the backwards speech condition (Figure 4A). The most prominent activations were in the left and right temporal lobes. In the left hemisphere, there was activation along the length of the left superior temporal sulcus and adjacent superior temporal and middle temporal gyri, extending from the angular gyrus to the temporal pole. In the right hemisphere, a similar pattern of activation was seen except that activation was absent in the posterior superior temporal sulcus. There was also an activation in the left inferior frontal gyrus extending to the precentral sulcus. Other areas activated were the head of the left caudate nucleus; the precuneus spanning the midline; the superior frontal gyrus spanning the midline; and the right cerebellum.

Figure 4.

Language activation maps derived from the narrative comprehension paradigm. (A) Group analysis in 14 neurologically normal participants. (B) Activation maps in 4 of the 16 individuals with aphasia at two time points each. These 4 patients are the 4 patients in the third row of Figure 3. See Figure 3 caption for additional definitions and explanations.

For the picture naming paradigm, picture naming was contrasted to viewing scrambled pictures (Figure 5A). This contrast activated an extensive set of bilateral regions including ventral occipito-temporal cortex, the posterior superior temporal gyrus, the precentral and postcentral gyri, inferior frontal gyri, and numerous other cortical and subcortical regions. With the exception of inferior frontal activity, which was somewhat left-lateralized, the pattern of activation was generally symmetrical, and largely reflected sensorimotor processes (object perception, speech motor, hearing one’s own voice).

Figure 5.

Language activation maps derived from the picture naming paradigm. (A) Group analysis in 14 neurologically normal participants. (B) Activation maps in 4 of the 16 individuals with aphasia at two time points each. These 4 patients are the 4 patients in the third row of Figure 3. See Figure 3 caption for additional definitions and explanations.

Test-retest reproducibility

Each patient’s two activation maps derived from the adaptive semantic matching paradigm in the two separate sessions are shown in Figure 3B. Activation maps for a subset of four patients (P9-P12, the third row of Figure 3B) are shown for the narrative (Figure 4B) and picture naming (Figure 5B) paradigms. This row of patients was selected for visual comparison across the other paradigms because these patients exemplify a variety of structural and functional patterns.

With the a priori analysis parameter set, the mean Dice coefficient of similarity across patients for the semantic paradigm was 0.66 ± 0.15 (range 0.40–0.82) (Figure 6A), which was higher than the Dice coefficients for the narrative (0.47 ± 0.17; |t(15)| = 4.93; p = 0.0002; Cohen’s dz = 1.23) and picture naming (0.43 ± 0.17; |t(15)| = 4.98; p = 0.0002; Cohen’s dz = 1.24) paradigms, indicating that the semantic paradigm yielded the most reproducible activation maps.
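For readers unfamiliar with the Dice coefficient, the sketch below shows one way the overlap between two thresholded activation maps, and a paired effect size such as Cohen's dz, might be computed. The array names and random masks are hypothetical placeholders, not the study's data or code.

    # Minimal sketch: Dice coefficient between two binary (suprathreshold)
    # activation masks from two sessions, plus Cohen's dz for paired comparisons.
    import numpy as np

    def dice(mask1, mask2):
        """Dice = 2|A and B| / (|A| + |B|) for boolean arrays of equal shape."""
        a, b = mask1.astype(bool), mask2.astype(bool)
        denom = a.sum() + b.sum()
        return 2.0 * np.logical_and(a, b).sum() / denom if denom > 0 else np.nan

    def cohens_dz(x, y):
        """Effect size for a paired comparison: mean difference / SD of differences."""
        d = np.asarray(x) - np.asarray(y)
        return d.mean() / d.std(ddof=1)

    # Hypothetical 3-D masks for one participant's two sessions.
    rng = np.random.default_rng(0)
    session1 = rng.random((20, 20, 20)) > 0.95
    session2 = rng.random((20, 20, 20)) > 0.95
    print(f"Dice = {dice(session1, session2):.2f}")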

Figure 6.

Psychometric assessment of the three paradigms. (A) Test-retest reproducibility of the three paradigms in individuals with aphasia. The distribution of the Dice coefficient is plotted (relative voxelwise threshold = 5%; minimum cluster volume = 2 cm3; ROI = Lang+, i.e. language regions and plausible candidates for reorganization). (B) Lateralization of language maps. The distribution of the lateralization index is plotted for neurologically normal individuals, and for individuals with aphasia, for each of the three paradigms. Individuals with apparent right hemisphere language dominance are indicated with black dots (NN14 and A12). (C) Sensitivity for detection of dominant hemisphere frontal language area. The distribution of the activation volume in the dominant inferior frontal gyrus is plotted for neurologically normal individuals, and for individuals with aphasia, for each of the three paradigms. (D) Sensitivity for detection of dominant hemisphere temporal language area. The distribution of the activation volume in the temporal ROI is plotted for neurologically normal individuals, and for individuals with aphasia, for each of the three paradigms.

To explore the generality of this finding, the mean Dice coefficient was then plotted for each paradigm as a function of ROI, voxelwise threshold, and cluster extent threshold (Figure 7A). This analysis showed that the semantic paradigm yielded Dice coefficients in the good range under many different parameter combinations. Many of these parameter sets also demonstrated strong evidence for validity (indicated with white rectangles), as will be described below. In contrast, Dice coefficients for the other two paradigms were mostly in the fair range. For these paradigms, activation maps tended to be more reproducible when the ROI was more circumscribed and when voxelwise thresholds were more lenient. However, these parameter sets did not yield good evidence for validity, as will be described below.
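A minimal sketch of one such parameter set is given below: a relative voxelwise threshold (the top 5% of t statistics within an ROI) combined with a minimum cluster volume of 2 cm3. The voxel size (2 mm isotropic), array names, and synthetic data are assumptions for illustration only.

    # Minimal sketch: relative voxelwise threshold (top 5% of t values in an ROI)
    # followed by a minimum-cluster-volume cutoff. All inputs are hypothetical.
    import numpy as np
    from scipy import ndimage

    def threshold_map(t_map, roi_mask, pct=95.0, min_vol_mm3=2000.0, voxel_vol_mm3=8.0):
        """Return a binary mask of suprathreshold voxels in clusters >= min volume."""
        cutoff = np.percentile(t_map[roi_mask], pct)   # relative threshold within ROI
        supra = (t_map >= cutoff) & roi_mask
        labels, n = ndimage.label(supra)               # connected clusters (default 6-connectivity in 3-D)
        keep = np.zeros_like(supra)
        for lab in range(1, n + 1):
            cluster = labels == lab
            if cluster.sum() * voxel_vol_mm3 >= min_vol_mm3:
                keep |= cluster
        return keep

    # Example with synthetic data (assumed 2 mm isotropic voxels = 8 mm^3 each).
    rng = np.random.default_rng(1)
    t_map = rng.normal(size=(30, 30, 30))
    roi = np.ones_like(t_map, dtype=bool)
    mask = threshold_map(t_map, roi)
    print(mask.sum(), "suprathreshold voxels retained")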

Figure 7.

Impact of analysis parameter sets. (A) Impact of analysis parameters on test-retest reproducibility of the three paradigms in individuals with aphasia. Mean Dice coefficients of similarity across participants are plotted as a function of absolute and relative voxelwise thresholds (y axes), region of interest (x axes) and minimum cluster volume (x axes). Regions of interest: Brain = whole brain; Supra = supratentorial cortical regions; Lang+ = known language regions and plausible candidates for reorganization; Lang = known language regions and their homotopic counterparts. White outlines indicate parameter sets that exemplified desirable psychometric properties across the board (Dice ≥ 0.60; LI ≥ 0.60; frontal and temporal regions detected without fail in the neurologically normal group). Thick black outlines show the a priori analysis parameter set. (B) Impact of analysis parameters on language lateralization in neurologically normal participants. Mean lateralization indices across participants are plotted for each paradigm as a function of absolute and relative voxelwise thresholds (y axes), region of interest (x axes) and minimum cluster volume (x axes). (C) Impact of analysis parameters on language lateralization in individuals with aphasia.

Lateralization of language maps

A valid language mapping paradigm is expected to yield left-lateralized activation maps in the majority of neurologically normal participants. Activation maps for the adaptive semantic matching paradigm in each of the 14 neurologically normal participants are shown in Figure 8. Activations were clearly lateralized to the left hemisphere in 12 of 14 participants, were somewhat left-lateralized in one participant (NN13), and surprisingly, were right-lateralized in another participant (NN14).

Figure 8.

Language activation maps derived from the adaptive semantic matching paradigm in 14 neurologically normal individuals. See Figure 3 caption for more information.

The distribution of LIs on the three paradigms in the neurologically normal group is shown in Figure 6B. The individual with right-lateralized activations on the semantic task (NN14) is indicated with black dots. Importantly, NN14 also showed the most rightward lateralization of activation on the narrative comprehension paradigm. Accordingly, we assumed that she actually did have right hemisphere dominance for language, and so for the purpose of all subsequent analyses, her images were mirror-reversed. This means that in NN14, rightward lateralization was interpreted as reflecting correct lateralization (evidence for validity), and when evaluating sensitivity, activations in right rather than left frontal and temporal regions were quantified.

With the a priori analysis parameter set, the mean LI in controls for the semantic paradigm was 0.81 ± 0.24 (range 0.26–1.00), which was higher than the LIs for the narrative (0.34 ± 0.38; |t(13)| = 6.83; p < 0.0001; Cohen’s dz = 1.82) and picture naming (0.11 ± 0.20; |t(13)| = 10.39; p < 0.0001; Cohen’s dz = 2.78) paradigms. This indicates that the semantic paradigm yielded the most lateralized activation maps.
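The lateralization index is conventionally computed as (L − R)/(L + R) from suprathreshold voxel counts or volumes in the two hemispheres; the sketch below assumes that convention, which may differ in detail from the exact procedure used here, and operates on hypothetical masks.

    # Minimal sketch: lateralization index from suprathreshold voxel counts,
    # assuming the common convention LI = (L - R) / (L + R), ranging -1 to +1.
    import numpy as np

    def lateralization_index(supra_mask, left_roi, right_roi):
        """+1 = fully left-lateralized, -1 = fully right-lateralized."""
        L = np.logical_and(supra_mask, left_roi).sum()
        R = np.logical_and(supra_mask, right_roi).sum()
        return np.nan if (L + R) == 0 else (L - R) / (L + R)

    # Hypothetical masks: an activation map plus left/right halves of a language ROI.
    rng = np.random.default_rng(2)
    supra = rng.random((40, 40, 40)) > 0.95
    left = np.zeros_like(supra); left[:20] = True     # assumed left half of the grid
    right = np.zeros_like(supra); right[20:] = True   # assumed right half of the grid
    print(f"LI = {lateralization_index(supra, left, right):+.2f}")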

In an exploratory analysis, the distribution of LIs across the three paradigms was also investigated in individuals with aphasia (Figure 6B). This analysis was exploratory because there are no clear predictions as to what valid language maps should look like in individuals with aphasia. This analysis showed that activation maps derived from the semantic paradigm were clearly left-lateralized in 15 out of 16 patients (see also Figure 3B). The only exception was participant A12, whose LIs are indicated with black dots in Figure 6B. This patient also had the second-most right-lateralized activation map for the narrative paradigm, and the most right-lateralized map for the picture naming paradigm. Right lateralization for all three paradigms was reproduced across both sessions (Figure 3B, 4B, 5B). Accordingly, we assumed that this patient actually did have right hemisphere dominance for language (whether or not due to reorganization), and so for the purpose of subsequent analyses, his images were flipped, just as for NN14.

The mean LI in patients for the semantic paradigm was 0.81 ± 0.22 (range 0.38–1.00), which was higher than the LIs for the narrative (0.26 ± 0.43; |t(15)| = 7.59; p < 0.0001; Cohen’s dz = 1.89) and picture naming (0.00 ± 0.34; |t(15)| = 10.28; p < 0.0001; Cohen’s dz = 2.57) paradigms, indicating that the semantic paradigm yielded the most lateralized activation maps in individuals with aphasia, just as it did in the neurologically normal group. We also investigated the test-retest reproducibility of LI in patients: intraclass correlation coefficients (type A-1) were excellent for the semantic (r = 0.88) and narrative (r = 0.83) paradigms, but poor for the picture naming paradigm (r = 0.38).
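The type A-1 intraclass correlation corresponds to the two-way, absolute-agreement, single-measurement ICC; a minimal numpy sketch of that formula, applied to a placeholder participants-by-sessions matrix of LIs (not the study's data), is shown below.

    # Minimal sketch: ICC(A,1) -- two-way, absolute agreement, single measurement --
    # computed from a participants x sessions matrix.
    import numpy as np

    def icc_a1(x):
        """x: 2-D array, rows = participants, columns = sessions."""
        x = np.asarray(x, dtype=float)
        n, k = x.shape
        grand = x.mean()
        row_means = x.mean(axis=1)
        col_means = x.mean(axis=0)
        msr = k * np.sum((row_means - grand) ** 2) / (n - 1)   # between participants
        msc = n * np.sum((col_means - grand) ** 2) / (k - 1)   # between sessions
        resid = x - row_means[:, None] - col_means[None, :] + grand
        mse = np.sum(resid ** 2) / ((n - 1) * (k - 1))         # residual
        return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

    # Placeholder LIs for 6 hypothetical participants at two time points.
    li = np.array([[0.90, 0.80], [0.70, 0.75], [1.00, 0.95],
                   [0.40, 0.50], [0.85, 0.90], [0.60, 0.55]])
    print(f"ICC(A,1) = {icc_a1(li):.2f}")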

To explore the generality of these findings, mean LIs were then plotted for each paradigm as a function of ROI, voxelwise threshold, and cluster extent threshold in neurologically normal participants (Figure 7B) and individuals with aphasia (Figure 7C). This analysis showed that the semantic paradigm yielded lateralized activation maps under many different parameter combinations, many of which also showed good reliability and sensitivity (white rectangles). Remarkably similar patterns were seen in the patient group. The narrative paradigm yielded lateralized activation maps under many sets of parameters, albeit always to a lesser extent than the semantic paradigm. The highest LIs arose when voxelwise thresholds were relative and stringent; these parameter sets did not, however, have good reliability. For the narrative paradigm, individuals with aphasia showed less strongly lateralized maps than neurologically normal participants. This follows from the fact that narrative activations are more bilateral to begin with: when the left hemisphere was damaged, the left hemisphere component of the activation map was reduced, while the right hemisphere component remained intact. The picture naming paradigm generally resulted in largely symmetrical language maps, except when the ROI comprised only known language regions and the voxelwise threshold was stringent. Test-retest reproducibility under those circumstances was poor.

Detection of frontal and temporal language areas

Besides revealing lateralized activation maps, a valid language mapping paradigm should activate dominant hemisphere frontal and temporal regions in essentially all neurologically normal participants. Therefore we compared the extent of activations in these regions across the three paradigms.
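Activation extent is a volume measure; assuming an isotropic voxel grid, it can be obtained by counting suprathreshold voxels within the regional mask and multiplying by the voxel volume, as in the placeholder sketch below (the 2 mm voxel size and the mask names are assumptions, not the study's parameters).

    # Minimal sketch: activation extent (in cm^3) within a regional mask,
    # assuming a known voxel volume (here 8 mm^3, i.e. 2 mm isotropic -- an assumption).
    import numpy as np

    def extent_cm3(supra_mask, region_mask, voxel_vol_mm3=8.0):
        n_vox = np.logical_and(supra_mask, region_mask).sum()
        return n_vox * voxel_vol_mm3 / 1000.0   # mm^3 -> cm^3

    # Hypothetical activation map and frontal ROI.
    rng = np.random.default_rng(3)
    supra = rng.random((40, 40, 40)) > 0.95
    frontal_roi = np.zeros_like(supra); frontal_roi[5:15, 5:15, 5:15] = True
    print(f"frontal extent = {extent_cm3(supra, frontal_roi):.1f} cm^3")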

In neurologically normal participants, the extent of dominant hemisphere inferior frontal activation for the semantic paradigm was 16.0 ± 4.0 cm3 (range 10.5–22.4 cm3), which was greater than the activation extent for the narrative (4.6 ± 3.8 cm3; |t(13)| = 8.21; p < 0.0001; Cohen’s dz = 2.19) or picture naming (3.4 ± 2.5 cm3; |t(13)| = 9.72; p < 0.0001; Cohen’s dz = 2.60) paradigms (Figure 6C). The semantic paradigm yielded substantial frontal activations in all participants (Figure 8), indicating that the dominant frontal region was detected without fail in this cohort, while the narrative and picture naming paradigms yielded small or no activations in some participants (Figure 6C), suggesting a lack of sensitivity.

The extent of dominant hemisphere temporal activation for the semantic paradigm was 13.8 ± 3.6 cm3 (range 8.4–19.1 cm3), which did not differ from the extent for narrative (16.1 ± 5.1 cm3; |t(13)| = 1.69; p = 0.11; Cohen’s dz = −0.45), but was greater than the extent for naming (5.5 ± 3.9 cm3; |t(13)| = 5.80; p < 0.0001; Cohen’s dz = 1.55) (Figure 6D). The semantic and narrative paradigms produced substantial temporal activations in all participants (semantic: Figure 8), indicating that the dominant temporal region was detected without fail in this cohort, while the picture naming paradigm did not.

In individuals with aphasia, there was no expectation that either language region should necessarily be present, due to structural damage as well as possible functional changes. We found that the semantic paradigm yielded frontal activation with extent within or slightly above the range of activation extents of the neurologically normal participants in 12 out of 16 patients (including right-lateralized A12) (Figure 6C); in 10 of these patients, the inferior frontal gyrus was intact and in 2 it was partially damaged (A2, A14) (Figure 3B). Four patients (A1, A8, A13, A15) showed reduced frontal activation that fell outside the normal range (Figure 6C). The inferior frontal gyrus was largely destroyed in two of these cases (A1, A13) and substantially damaged in the other two (A8, A15) (Figure 3B). There was perilesional activation in all four cases (Figure 3B). The narrative and naming paradigms yielded similar extents of frontal activation in individuals with aphasia and neurologically normal participants, suggesting a lack of sensitivity to detect reduced frontal activation (Figure 6C).

The semantic paradigm yielded temporal activation with extent within or slightly above the normal range in 8 patients (A3, A4, A9, A11, A12 (right-lateralized), A13, A14, A15) and close to the normal range and typically localized in 2 patients (A5, A16) (Figure 6D). In these 10 patients, the temporal language ROI was intact in 7 patients and partially damaged in 3 (A9, A13, A14) (Figure 3B). There were 5 patients with markedly reduced temporal activation (A2, A6, A7, A8, A10) and 1 with temporal activation close to the normal extent but atypically localized (A1) (Figure 6D). The temporal language region was severely damaged in all of these six patients (Figure 3B). All showed perilesional activation (Figure 3B). The narrative paradigm showed 9 patients within or slightly above the normal range and 7 below the normal range (A1, A2, A6, A7, A8, A9, A10) (Figure 6D), including all 6 who showed abnormal activation with the semantic paradigm. This supports the validity of both paradigms with respect to detection of reduced or abnormal temporal activation. The picture naming paradigm showed somewhat reduced temporal activation in patients, but temporal activations were not consistent in neurologically normal participants so this could not be interpreted (Figure 6D).

The impact of ROI, voxelwise threshold, and cluster volume cutoff on sensitivity to detect frontal and temporal language regions is described in Supporting Information online.

Revised adaptive semantic stimuli

A second group of neurologically normal individuals was scanned on the revised version of the adaptive semantic matching paradigm, which was modified to minimize ceiling effects in people without aphasia. This second group still performed slightly better on the revised semantic task (87.3 ± 4.0%) than on the revised perceptual task (85.2 ± 2.6% accuracy; |t(15)| = 2.20; p = 0.044), but the ceiling effect on the semantic task was greatly ameliorated (Figure 9A). The increased difficulty of the revised paradigm at harder levels should not pose problems for individuals with aphasia, since the difficulty of the easier levels was not increased.

Figure 9.

Revised adaptive semantic matching paradigm. (A) Accuracy, item difficulty, and reaction time on the revised semantic and perceptual control conditions in a second group of 16 neurologically normal participants. Perc = Perceptual. (B) Group analysis in this neurologically normal group. Whole brain activations were thresholded at voxelwise p < 0.005 then corrected for multiple comparisons at p < 0.01 based on cluster extent.

The mean difficulty level of items presented was greater for semantic trials (5.09 ± 0.67) than perceptual trials (4.56 ± 0.52; |t(15)| = 3.26; p = 0.0052) (Figure 9A), but this is not especially important since the two tasks are not inherently matched across specific difficulty levels. Reaction times on correct trials were faster on semantic trials (1465 ± 172 ms) than perceptual trials (1861 ± 159 ms; |t(15)| = 7.82; p < 0.0001) (Figure 9A). While it would be preferable for reaction times to be equivalent, this is not a serious concern as explained earlier.

In this neurologically normal group, the contrast between the semantic and perceptual conditions activated a similar set of brain regions to the original version of the paradigm, including the left inferior frontal gyrus (pars opercularis, triangularis and orbitalis) and posterior superior temporal gyrus, superior temporal sulcus and middle temporal gyrus (Figure 9B). Compared to the original version of the paradigm, there appeared to be more robust activation of left anterior temporal cortex and right inferior frontal cortex, which might be expected given the increased difficulty of the task, but the data from the two versions of the task could not be directly compared, since they were obtained on different scanners.

Discussion

Our findings show that the adaptive semantic matching paradigm is appropriate for individuals with aphasia, has good test-retest reproducibility, and consistently identifies lateralized frontal and temporal language regions. The suitability of the paradigm for people with aphasia was demonstrated by the ability of all 16 participants to learn the task and perform above chance in the scanner. Test-retest reproducibility was quantified in terms of the Dice coefficient of similarity, which was higher for the adaptive semantic matching paradigm than the two comparison paradigms, and was good over a wide range of potential analysis parameter sets. Validity was demonstrated by showing that the adaptive semantic matching paradigm yields more lateralized language maps than the two comparison paradigms, and identifies frontal and temporal language regions with high sensitivity over a wide range of analysis parameter sets.

Design features underlying the feasibility, reliability and validity of the semantic paradigm

The most important feature that makes the semantic paradigm suitable for individuals with aphasia is its adaptive nature, which permits the same paradigm to be used to map language regions in people with different degrees of language impairment, as well as individuals with normal language function. The difficulty of language items is manipulated in terms of lexical frequency, concreteness, word length, age of acquisition of words, degree of relatedness of word pairs, and presentation rate, all of which are soundly established factors that impact the difficulty of linguistic tasks (Coltheart, 1981). The difficulty of the perceptual control task is also adaptively manipulated, ensuring that both tasks remained similarly challenging. Some previous developmental studies have used language mapping paradigms with different predetermined levels of difficulty for children of different age ranges (Berl et al., 2010, 2014; Gaillard et al., 2007; You et al., 2011), and in some studies of aphasia recovery, presentation rates have been adapted for individual patients (e.g. Heiss et al., 1999). Our adaptive semantic paradigm goes beyond these previous approaches in three ways: first, by dynamically selecting stimuli that are uniquely adapted to each individual, second, by manipulating multiple theoretically motivated linguistic variables, and third, by also adjusting the difficulty of the control condition, which was not varied in previous studies.
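To make the adaptive logic concrete, the sketch below implements a generic one-up/one-down staircase over seven difficulty levels. It is a simplified illustration only; the actual step rule, item selection, and yoking of presentation rate used in the paradigm are not reproduced here.

    # Minimal sketch: a generic one-up/one-down staircase over 7 difficulty levels.
    # Illustrative only; the adaptive rule used in the actual paradigm may differ.
    import random

    def run_staircase(p_correct_at_level, n_trials=40, n_levels=7):
        """Step up after a correct response, down after an error; clamp to 1..n_levels."""
        level, history = 1, []
        for _ in range(n_trials):
            correct = random.random() < p_correct_at_level(level)   # simulated response
            history.append((level, correct))
            level = min(level + 1, n_levels) if correct else max(level - 1, 1)
        return history

    # Hypothetical participant whose accuracy falls off with increasing difficulty.
    history = run_staircase(lambda lvl: max(0.95 - 0.08 * (lvl - 1), 0.3))
    print("final level:", history[-1][0])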

Another simple but important feature is the use of just a single button by which participants identify ‘match’ trials, such that mismatch trials require no response. This avoids participants having to learn an arbitrary association between responses and buttons, which can prove difficult for some neurological patients. Furthermore, the multiple phases of training—a flexible examiner-guided phase, an independent phase, and another independent phase in the scanner before acquisition of functional data begins—are important in ensuring that all participants can perform the task.

The test-retest reproducibility of the adaptive semantic matching paradigm was good, with a mean Dice coefficient of similarity of 0.66 with our a priori analysis parameter set, and even higher Dice coefficients with other parameter sets. This compares favorably to all other reliability coefficients that have been reported for language mapping paradigms (Billingsley-Marshall, Simos, & Papanicolaou, 2004; Brannen et al., 2001; Fernández et al., 2003; Fesl et al., 2010; Gross & Binder, 2014; Harrington et al., 2006; Maldjian et al., 2002; Rau et al., 2007; Rutten et al., 2002; see Wilson et al., 2017 for review); to our knowledge, the highest Dice coefficient previously reported for a paradigm with reasonable validity is 0.61 (Fesl et al., 2010), and participants in that study were neurologically normal. Other investigations of language mapping reliability in aphasia have used metrics such as voxelwise intraclass correlation coefficients, which do not provide an overall assessment of reliability (Eaton et al., 2008; Meltzer et al., 2009).

Probably the most important factor underlying the good test-retest reproducibility of the semantic paradigm is the fact that both the semantic task and the perceptual task are highly constrained in terms of the linguistic and other processing required. This ensures that similar cognitive states in both conditions are induced each time participants perform the task. In comparison, a contrast such as narrative comprehension versus backwards speech may lead to very different results on different occasions depending on how attention is modulated by the extent to which the participant is interested in the narrative, the extent to which they are interested in the backwards speech, and so on (Bautista & Wilson, 2016; Wild, Davis, & Johnsrude, 2012). Many other factors are likely to have contributed to the reliability observed, such as a preprocessing pipeline optimized for neurological patients (e.g. manual removal of noise components); judicious use of individual gray matter and lesion masks; and systematic consideration of a range of plausible regions of interest, absolute and relative voxelwise thresholds, and cluster volume cutoffs, which can have a large impact on reliability (Bennett & Miller, 2010; Wilson et al., 2017).

Validity was assessed in terms of each paradigm’s ability to reveal known features of normal language organization: lateralization, and activation of frontal and temporal regions (Knecht et al., 2003; Seghier et al., 2011; Springer et al., 1999; Tzourio-Mazoyer et al., 2010; Bradshaw et al., 2017). The strongly lateralized activation maps derived from the semantic paradigm are likely driven by the fact that it is an active, rather than a passive, task (Vannest et al., 2009). That is, it requires specific linguistic processing and overt responses. In contrast, passive tasks such as narrative comprehension yield less lateralized activation patterns (Maldjian et al., 2002; Crinion et al., 2003; Harrington et al., 2006), in accordance with the fact that language comprehension is more bilaterally represented than language production (Hickok & Poeppel, 2007). This is not to imply that the less lateralized language maps derived from the narrative comprehension paradigm are ‘wrong’ per se. But for the purpose of investigating language reorganization after damage to language regions, a paradigm that highlights lateralized aspects of the language network seems to offer the most potential for documenting any functional reorganization, because aphasia so consistently follows from damage to the left, but not the right, hemisphere. Note that picture naming is also an active task, but presumably the active and lateralized function of lexical access is so effortless in most cases that activation related to it is swamped by bilateral activations to sensorimotor aspects of the task.

The high sensitivity of the adaptive semantic matching paradigm to both frontal as well as temporal language regions is likely due to the fact that it implicates both active semantic decision making and comprehension processes (Mbwana et al., 2009). A similar but non-adaptive contrast in a well-powered previous study activated the temporal language region much less robustly than we observed (Seghier et al., 2004), suggesting that the challenging processing elicited by the adaptive nature of our paradigm may contribute to its sensitivity. While the narrative paradigm also had excellent temporal sensitivity, it did not have good frontal sensitivity, and neither region was identified consistently by the picture naming paradigm; these findings are consistent with prior research (Rau et al., 2007; Wilson et al., 2017).

Limitations

Despite the overall success of the adaptive semantic matching paradigm, several important limitations should be noted. First, the semantic paradigm identifies only one component of the language network: specifically, the lateralized frontal and temporal regions. While these two lateralized regions arguably constitute the core of the language network, there are several other interacting networks that are also critical for language function. As mentioned earlier, language comprehension is supported not only by these lateralized regions, but also by bilateral regions in the anterior temporal lobe and angular gyrus (Binder, Desai, Graves, & Conant, 2009; Crinion & Price, 2005; Hickok & Poeppel, 2007; Patterson, Nestor, & Rogers, 2007; Rice, Hoffman, & Lambon Ralph, 2015). In addition, there are other lateralized fronto-parietal networks supporting other language processes such as phonological retrieval and encoding (Pillay, Stengel, Humphries, Book, & Binder, 2014) and motor speech programming (Graff-Radford et al., 2014). In a future publication, we will describe two adaptive phonological tasks that attempt to identify some of these other regions.

Second, despite our best efforts to design a maximally simple paradigm, not all individuals with aphasia will be able to perform the adaptive semantic matching task. In our cohort of 16 people with chronic post-stroke aphasia, drawn mostly from a community aphasia group, all but one patient had excellent comprehension of written single words, and all patients demonstrated preserved non-verbal semantic processing. If either or both of these functions were impaired, as may be expected in syndromes such as global aphasia, Wernicke’s aphasia, or semantic dementia, then task performance would clearly be impacted. Indeed, the one patient with impaired single word comprehension showed a floor effect on the semantic task (though his performance was still better than chance).

Third, and related to this observation, the adaptive design was not entirely successful in equating performance across the semantic and perceptual tasks. Not only was there a floor effect in the semantic condition in one patient, but in the original version of the paradigm (on which most of the data were acquired), there were also ceiling effects for all but one of the neurologically normal participants, and even for one of the individuals with aphasia (notably, the one who had almost completely recovered). The revised paradigm, intended for future use, largely resolved this problem. However, floor effects will likely continue to pose problems in individuals with more impaired language; if these individuals are not performing at the same accuracy rate as others, then this would be a limitation when comparing their language maps to those of less impaired patients or neurologically normal individuals. Finally, it should be noted that the yoking of presentation rate across the two conditions, which was necessary to counterbalance sensorimotor aspects of the tasks, will inherently interfere with the independence of the interleaved adaptive staircase procedures.

Fourth, while a mean Dice coefficient of similarity of 0.66 reflects very satisfactory test-retest reproducibility, as can be readily appreciated in Figure 3, there were nevertheless sometimes notable discrepancies between the language maps obtained in the two sessions. For example, patient A1 had a much more prominent right frontal activation in the second session compared to the first session (this participant’s Dice coefficient was 0.42, which was one of the lowest). Therefore, it is still necessary to be cautious in attributing observed changes in language activation maps to neuroplasticity. Approaches such as multiple sessions at each time point, and random effects group analyses where feasible, will still be important in documenting any functional reorganization that may take place (Kiran et al., 2013; Meinzer et al., 2013). It is also noteworthy that better reliability was obtained with relative rather than absolute thresholds. This makes sense, given that relative thresholds essentially factor out general activation strength, which is one important source of undesirable variability (Gross & Binder, 2014; Knecht et al., 2003; Voyvodic, 2012; Wilson et al., 2017). However, the downside of relative thresholds is that any genuine changes in the extent of cortical regions devoted to language processing would not be captured, which could be a significant limitation depending on one’s hypotheses.

Language localization in chronic post-stroke aphasia

While it was not the primary goal of this paper to address empirical questions regarding language localization in chronic aphasia, our findings were clear enough in some respects that some preliminary observations can be made.

First, the dominant hemisphere frontal and temporal regions that were activated in every neurologically normal individual were also activated in many individuals with aphasia. Specifically, twelve patients showed essentially normal frontal activations and ten showed normal or near-normal temporal activations (albeit reduced in extent in some cases). Generally these normally activated regions were structurally spared, but there were several cases in which partially damaged regions showed normal or near-normal activation, suggesting that partially damaged regions may often retain their functional role in language processing. This basic preservation of language activation patterns in chronic post-stroke aphasia is consistent with prior research (Griffis et al., 2017; Heiss & Thiel, 2006; Saur et al., 2006).

Second, four patients showed reduced or absent inferior frontal activations, and six patients showed reduced, absent or aberrant temporal activations. In all of these cases, the structures in question were substantially damaged or destroyed, and in all cases perilesional activation was observed. It is not yet clear whether these perilesional activations represent the outcome of functional reorganization (Robson et al., 2014), or whether these perilesional regions would have been involved in language processing premorbidly, since there was considerable variability in neurologically normal participants in terms of the extent and precise localization of frontal and temporal activations.

Third, it is apparent that reorganization of language processing to the right hemisphere (Weiller et al., 1995; Turkeltaub et al., 2011) is an uncommon outcome at best. Only one individual with aphasia (A12) showed right-lateralized language regions. It cannot be determined whether this represents the outcome of reorganization, or whether he had premorbid right-lateralized or bilateral language function. Given that his left hemisphere stroke did result in aphasia, right hemisphere dominance for language seems unlikely. But bilateral language, similar to neurologically normal participant NN13, is a distinct possibility. The absence of any strong evidence for reorganization to the right hemisphere in any other patient is especially striking in view of how well recovered many language functions were relative to lesion location. In particular, four of the five individuals with conduction aphasia (A6, A7, A8, A10) had large left posterior temporal lesions and none had normal temporal functional activation. Yet all of these participants had excellent single word comprehension and semantic processing abilities, as evidenced by their task performance, as well as language and neuropsychological data (Wilson et al., 2018). They also were fluent, could produce sentences (albeit with paragrammatic and paraphasic errors), and were able to retrieve words, repeat words and sentences, and read aloud, to varying extents (Wilson et al., 2018). All of these functions normally depend on the posterior left temporal language area, but show quite rapid recovery after this region is damaged (Kertesz, Lau, & Polk, 1993; Naeser, Helm-Estabrooks, Haas, Auerbach, & Srinivasan, 1987; Selnes, Knopman, Niccum, Rubens, & Larson, 1983; Selnes, Niccum, Knopman, & Rubens, 1984; Yagata et al., 2017). There was little evidence for comparable activation of the homotopic right hemisphere region, even though it would be a plausible substrate for reorganization given the relative bilaterality of temporal lobe language regions to begin with (Hickok & Poeppel, 2007). While these four patients each showed small right temporal lobe activations in one or both sessions, these activations were limited in extent, were mostly not homotopic to the left posterior temporal language region, and their specific locations were never replicated across the two sessions.

Why then is language function as good as it is in these individuals? There are several possibilities, which cannot yet be distinguished. It may be that the small and displaced residual left hemisphere temporal and/or parietal activations evidenced by each of these participants (e.g., anterior temporal in A6, ventral temporal in A7, angular gyrus in A8) are sufficient to support the relevant language functions to the extent that they are spared. Another possibility is that right temporal cortex is in fact now critical for language function, but is not activated by the semantic task for some unknown reason, though this would of course be difficult to explain given the robust activation of the left temporal cortex by the semantic contrast in neurologically normal individuals. Addressing these kinds of questions will require recruiting larger groups of individuals with aphasia so that relationships between functional activation of different regions, and patterns of spared and impaired language function, can be investigated.

Conclusion

The adaptive semantic matching paradigm described in this paper is appropriate for individuals with aphasia, has good test-retest reproducibility, and successfully identifies lateralized frontal and temporal language regions. Accordingly, it should prove to be a useful tool for research on neuroplasticity in recovery from aphasia. Moreover, the paradigm also has a potential application in presurgical language mapping (Binder et al., 2008), especially in the minority of presurgical patients who present with significant language deficits. Given the substantial psychometric differences we observed between paradigms, it is clear that future language mapping paradigms should be carefully assessed in terms of feasibility, reliability and validity.

Supplementary Material


Acknowledgments

This research was supported in part by the National Institutes of Health (National Institute on Deafness and Other Communication Disorders) under grants R01 DC013270 and R21 DC016080. We thank Fabi Hirsch, Pélagie Beeson and Kindle Rising for facilitating recruitment of individuals with aphasia, Stefanie Lauderdale and Scott Squire for assistance with data collection, Andrew DeMarco for helpful discussions and assistance with lesion delineation, four anonymous reviewers for their constructive comments, and all of the participants who took part in the study.

References

  1. Abel S, Weiller C, Huber W, Willmes K, Specht K. Therapy-induced brain reorganization patterns in aphasia. Brain. 2015;138:1097–1112. doi: 10.1093/brain/awv022. [DOI] [PubMed] [Google Scholar]
  2. Allendorfer JB, Kissela BM, Holland SK, Szaflarski JP. Different patterns of language activation in post-stroke aphasia are detected by overt and covert versions of the verb generation fMRI task. Medical Science Monitor. 2012;18:CR135–CR147. doi: 10.12659/MSM.882518. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Ashburner J, Friston KJ. Unified segmentation. NeuroImage. 2005;26:839–851. doi: 10.1016/j.neuroimage.2005.02.018. [DOI] [PubMed] [Google Scholar]
  4. Bautista A, Wilson SM. Neural responses to grammatically and lexically degraded speech. Language, Cognition and Neuroscience. 2016;31:567–574. doi: 10.1080/23273798.2015.1123281. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Beckmann CF, Smith SM. Probabilistic independent component analysis for functional magnetic resonance imaging. IEEE Transactions on Medical Imaging. 2004;23:137–152. doi: 10.1109/TMI.2003.822821. [DOI] [PubMed] [Google Scholar]
  6. Bennett CM, Miller MB. How reliable are the results from functional magnetic resonance imaging? Annals of the New York Academy of Sciences. 2010;1191:133–155. doi: 10.1111/j.1749-6632.2010.05446.x. [DOI] [PubMed] [Google Scholar]
  7. Berl MM, Duke ES, Mayo J, Rosenberger LR, Moore EN, VanMeter J, Gaillard WD. Functional anatomy of listening and reading comprehension during development. Brain and Language. 2010;114:115–125. doi: 10.1016/j.bandl.2010.06.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Berl MM, Zimmaro LA, Khan OI, Dustin I, Ritzl E, Duke ES, Gaillard WD. Characterization of atypical language activation patterns in focal epilepsy. Annals of Neurology. 2014;75:33–42. doi: 10.1002/ana.24015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Berthier ML, Davila G, editors. Pharmacology and aphasia. New York: Routledge; 2014. [Google Scholar]
  10. Billingsley-Marshall R, Simos P, Papanicolaou A. Reliability and validity of functional neuroimaging techniques for identifying language-critical areas in children and adults. Developmental Neuropsychology. 2004;26:541–563. doi: 10.1207/s15326942dn2602_1. [DOI] [PubMed] [Google Scholar]
  11. Binder JR, Desai RH, Graves WW, Conant LL. Where is the semantic system? A critical review and meta-analysis of 120 functional neuroimaging studies. Cerebral Cortex. 2009;19:2767–2796. doi: 10.1093/cercor/bhp055. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Binder JR, Frost JA, Hammeke TA, Cox RW, Rao SM, Prieto T. Human brain language areas identified by functional magnetic resonance imaging. Journal of Neuroscience. 1997;17:353–362. doi: 10.1523/JNEUROSCI.17-01-00353.1997. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Binder JR, Medler DA, Desai R, Conant LL, Liebenthal E. Some neurophysiological constraints on models of word naming. NeuroImage. 2005;27:677–693. doi: 10.1016/j.neuroimage.2005.04.029. [DOI] [PubMed] [Google Scholar]
  14. Binder JR, Swanson SJ, Hammeke TA, Sabsevitz DS. A comparison of five fMRI protocols for mapping speech comprehension systems. Epilepsia. 2008;49:1980–1997. doi: 10.1111/j.1528-1167.2008.01683.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Birn RM, Bandettini PA, Cox RW, Shaker R. Event-related fMRI of tasks involving brief motion. Human Brain Mapping. 1999;7:106–114. doi: 10.1002/(SICI)1097-0193(1999)7:2&#x0003c;106::AID-HBM4&#x0003e;3.0.CO;2-O. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Bradshaw AR, Thompson PA, Wilson AC, Bishop DVM, Woodhead ZVJ. Measuring language lateralisation with different language tasks: a systematic review. PeerJ. 2017;5:e3929. doi: 10.7717/peerj.3929. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Brady MC, Kelly H, Godwin J, Enderby P, Campbell P. Speech and language therapy for aphasia following stroke. Cochrane Database of Systematic Reviews. 2016;6:CD000425. doi: 10.1002/14651858.CD000425.pub4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Brainard DH. The psychophysics toolbox. Spatial Vision. 1997;10:433–436. [PubMed] [Google Scholar]
  19. Brallier J. Who was Albert Einstein? New York: Grosset & Dunlap; 2002. [Google Scholar]
  20. Brannen JH, Badie B, Moritz CH, Quigley M, Meyerand ME, Haughton VM. Reliability of functional MR imaging with word-generation tasks for mapping Broca’s area. American Journal of Neuroradiology. 2001;22:1711–1718. [PMC free article] [PubMed] [Google Scholar]
  21. Breining BL, Lala T, Cuitiño MM, Manes F, Peristeri E, Tsapkini K, Hillis AE. A brief assessment of object semantics in primary progressive aphasia. Aphasiology. 2015;29:488–505. [Google Scholar]
  22. Brett M, Leff AP, Rorden C, Ashburner J. Spatial normalization of brain images with focal lesions using cost function masking. NeuroImage. 2001;14:486–500. doi: 10.1006/nimg.2001.0845. [DOI] [PubMed] [Google Scholar]
  23. Cicchetti DV. Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychological Assessment. 1994;6:284–290. [Google Scholar]
  24. Coltheart M. The MRC psycholinguistic database. Quarterly Journal of Experimental Psychology A: Human Experimental Psychology. 1981;33A:497–505. [Google Scholar]
  25. Cox RW. AFNI: Software for analysis and visualization of functional magnetic resonance neuroimages. Computers and Biomedical Research. 1996;29:162–173. doi: 10.1006/cbmr.1996.0014. [DOI] [PubMed] [Google Scholar]
  26. Crinion J, Price CJ. Right anterior superior temporal activation predicts auditory sentence comprehension following aphasic stroke. Brain. 2005;128:2858–2871. doi: 10.1093/brain/awh659. [DOI] [PubMed] [Google Scholar]
  27. Crinion JT, Lambon Ralph MA, Warburton EA, Howard D, Wise RJS. Temporal lobe regions engaged during normal speech comprehension. Brain. 2003;126:1193–1201. doi: 10.1093/brain/awg104. [DOI] [PubMed] [Google Scholar]
  28. Crinion JT, Warburton EA, Lambon Ralph MA, Howard D, Wise RJS. Listening to narrative speech after aphasic stroke: the role of the left anterior temporal lobe. Cerebral Cortex. 2006;16:1116–1125. doi: 10.1093/cercor/bhj053. [DOI] [PubMed] [Google Scholar]
  29. Eaton KP, Szaflarski JP, Altaye M, Ball AL, Kissela BM, Banks C, Holland SK. Reliability of fMRI for studies of language in post-stroke aphasia subjects. NeuroImage. 2008;41:311–322. doi: 10.1016/j.neuroimage.2008.02.033. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Edgers G. Who were the Beatles? New York: Grosset & Dunlap; 2006. [Google Scholar]
  31. Eklund A, Nichols TE, Knutsson H. Cluster failure: Why fMRI inferences for spatial extent have inflated false-positive rates. Proceedings of the National Academy of Sciences USA. 2016;113:7900–7905. doi: 10.1073/pnas.1602413113. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Fedorenko E, Thompson-Schill SL. Reworking the language network. Trends in Cognitive Sciences. 2014;18:120–126. doi: 10.1016/j.tics.2013.12.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Fernández G, de Greiff A, von Oertzen J, Reuber M, Lun S, Klaver P, Elger CE. Language mapping in less than 15 minutes: Real-time functional MRI during routine clinical investigation. NeuroImage. 2001;14:585–594. doi: 10.1006/nimg.2001.0854. [DOI] [PubMed] [Google Scholar]
  34. Fernández G, Specht K, Weis S, Tendolkar I, Reuber M, Fell J, Elger CE. Intrasubject reproducibility of presurgical language lateralization and mapping using fMRI. Neurology. 2003;60:969–975. doi: 10.1212/01.wnl.0000049934.34209.2e. [DOI] [PubMed] [Google Scholar]
  35. Fesl G, Bruhns P, Rau S, Wiesmann M, Ilmberger J, Kegel G, Brueckmann H. Sensitivity and reliability of language laterality assessment with a free reversed association task—a fMRI study. European Radiology. 2010;20:683–695. doi: 10.1007/s00330-009-1602-4. [DOI] [PubMed] [Google Scholar]
  36. Folstein MF, Folstein SE, McHugh PR. “Mini-mental state” : A practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research. 1975;12:189–198. doi: 10.1016/0022-3956(75)90026-6. [DOI] [PubMed] [Google Scholar]
  37. Fridriksson J, Baker JM, Moser D. Cortical mapping of naming errors in aphasia. Human Brain Mapping. 2009;30:2487–2498. doi: 10.1002/hbm.20683. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Fridriksson J, Richardson JD, Fillmore P, Cai B. Left hemisphere plasticity and aphasia recovery. NeuroImage. 2012;60:854–863. doi: 10.1016/j.neuroimage.2011.12.057. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Gaillard WD, Berl MM, Moore EN, Ritzl EK, Rosenberger LR, Weinstein SL, Theodore WH. Atypical language in lesional and nonlesional complex partial epilepsy. Neurology. 2007;69:1761–1771. doi: 10.1212/01.wnl.0000289650.48830.1a. [DOI] [PubMed] [Google Scholar]
  40. García-Pérez MA. Forced-choice staircases with fixed step sizes: asymptotic and small-sample properties. Vision Research. 1998;38:1861–1881. doi: 10.1016/s0042-6989(97)00340-4. [DOI] [PubMed] [Google Scholar]
  41. Gelinas JN, Fitzpatrick KPV, Kim HC, Bjornson BH. Cerebellar language mapping and cerebral language dominance in pediatric epilepsy surgery patients. NeuroImage: Clinical. 2014;6:296–306. doi: 10.1016/j.nicl.2014.06.016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Geranmayeh F, Brownsett SLE, Wise RJS. Task-induced brain activity in aphasic stroke patients: What is driving recovery? Brain. 2014;137:2632–2648. doi: 10.1093/brain/awu163. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Gitelman DR, Nobre AC, Sonty S, Parrish TB, Mesulam MM. Language network specializations: An analysis with parallel task designs and functional magnetic resonance imaging. NeuroImage. 2005;26:975–985. doi: 10.1016/j.neuroimage.2005.03.014. [DOI] [PubMed] [Google Scholar]
  44. Giussani C, Roux FE, Ojemann J, Sganzerla EP, Pirillo D, Papagno C. Is preoperative functional magnetic resonance imaging reliable for language areas mapping in brain tumor surgery? Review of language functional magnetic resonance imaging and direct cortical stimulation correlation studies. Neurosurgery. 2010;66:113–120. doi: 10.1227/01.NEU.0000360392.15450.C9. [DOI] [PubMed] [Google Scholar]
  45. Graff-Radford J, Jones DT, Strand EA, Rabinstein AA, Duffy JR, Josephs KA. The neuroanatomy of pure apraxia of speech in stroke. Brain and Language. 2014;129:43–46. doi: 10.1016/j.bandl.2014.01.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Griffis JC, Nenert R, Allendorfer JB, Vannest J, Holland S, Dietz A, Szaflarski JP. The canonical semantic network supports residual language function in chronic post-stroke aphasia. Human Brain Mapping. 2017;38:1636–1658. doi: 10.1002/hbm.23476. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Gross WL, Binder JR. Alternative thresholding methods for fMRI data optimized for surgical planning. NeuroImage. 2014;84:554–561. doi: 10.1016/j.neuroimage.2013.08.066. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Harrington GS, Buonocore MH, Farias ST. Intrasubject reproducibility of functional MR imaging activation in language tasks. American Journal of Neuroradiology. 2006;27:938–944. [PMC free article] [PubMed] [Google Scholar]
  49. Heiss WD, Kessler J, Thiel A, Ghaemi M, Karbe H. Differential capacity of left and right hemispheric areas for compensation of poststroke aphasia. Annals of Neurology. 1999;45:430–438. doi: 10.1002/1531-8249(199904)45:4<430::aid-ana3>3.0.co;2-p. [DOI] [PubMed] [Google Scholar]
  50. Heiss WD, Thiel A. A proposed regional hierarchy in recovery of post-stroke aphasia. Brain and Language. 2006;98:118–123. doi: 10.1016/j.bandl.2006.02.002. [DOI] [PubMed] [Google Scholar]
  51. Hickok G, Poeppel D. The cortical organization of speech processing. Nature Reviews Neuroscience. 2007;8:393–402. doi: 10.1038/nrn2113. [DOI] [PubMed] [Google Scholar]
  52. Howard D, Patterson K. Pyramids and palm trees: A test of semantic access from pictures and words. Bury St. Edmunds; Thames Valley: 1992. [Google Scholar]
  53. Janecek JK, Swanson SJ, Sabsevitz DS, Hammeke TA, Raghavan ME, Rozman M, Binder JR. Language lateralization by fMRI and Wada testing in 229 patients with epilepsy: Rates and predictors of discordance. Epilepsia. 2013;54:314–322. doi: 10.1111/epi.12068. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Jansen A, Menke R, Sommer J, Förster AF, Bruchmann S, Hempleman J, Knecht S. The assessment of hemispheric lateralization in functional MRI—Robustness and reproducibility. NeuroImage. 2006;33:204–217. doi: 10.1016/j.neuroimage.2006.06.019. [DOI] [PubMed] [Google Scholar]
  55. Kelly RE, Jr, Alexopoulos GS, Wang Z, Gunning FM, Murphy CF, Morimoto SS, Hoptman MJ. Visual inspection of independent components: Defining a procedure for artifact removal from fMRI data. Journal of Neuroscience Methods. 2010;189:233–245. doi: 10.1016/j.jneumeth.2010.03.028. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Kertesz A. Western Aphasia Battery. New York: Grune and Stratton; 1982. [Google Scholar]
  57. Kertesz A, Lau WK, Polk M. The Structural determinants of recovery in Wernicke’s aphasia. Brain and Language. 1993;44:153–164. doi: 10.1006/brln.1993.1010. [DOI] [PubMed] [Google Scholar]
  58. Kertesz A, McCabe P. Recovery patterns and prognosis in aphasia. Brain. 1977;100:1–18. doi: 10.1093/brain/100.1.1. [DOI] [PubMed] [Google Scholar]
  59. Kim KK, Karunanayaka P, Privitera MD, Holland SK, Szaflarski JP. Semantic association investigated with functional MRI and independent component analysis. Epilepsy & Behavior. 2011;20:613–622. doi: 10.1016/j.yebeh.2010.11.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Kiran S, Ansaldo A, Bastiaanse R, Cherney LR, Howard D, Faroqi-Shah Y, Thompson CK. Neuroimaging in aphasia treatment research: Standards for establishing the effects of treatment. NeuroImage. 2013;76:428–435. doi: 10.1016/j.neuroimage.2012.10.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Kiran S, Meier EL, Kapse KJ, Glynn PA. Changes in task-based effective connectivity in language networks following rehabilitation in post-stroke patients with aphasia. Frontiers in Human Neuroscience. 2015;9:316. doi: 10.3389/fnhum.2015.00316. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Knecht S, Jansen A, Frank A, van Randenborgh J, Sommer J, Kanowski M, Heinze HJ. How atypical is atypical language dominance? NeuroImage. 2003;18:917–927. doi: 10.1016/s1053-8119(03)00039-9. [DOI] [PubMed] [Google Scholar]
  63. Kuperman V, Stadthagen-Gonzalez H, Brysbaert M. Age-of-acquisition ratings for 30,000 English words. Behavior Research Methods. 2012;44:978–990. doi: 10.3758/s13428-012-0210-4. [DOI] [PubMed] [Google Scholar]
  64. Leek MR. Adaptive procedures in psychophysical research. Perception & Psychophysics. 2001;63:1279–1292. doi: 10.3758/bf03194543. [DOI] [PubMed] [Google Scholar]
  65. Lund K, Burgess C. Producing high-dimensional semantic spaces from lexical co-occurrence. Behavior Research Methods, Instruments, & Computers. 1996;28:203–208. [Google Scholar]
  66. Maldjian JA, Laurienti PJ, Driskill L, Burdette JH. Multiple reproducibility indices for evaluation of cognitive functional MR imaging paradigms. American Journal of Neuroradiology. 2002;23:1030–1037. [PMC free article] [PubMed] [Google Scholar]
  67. Mandera P, Keuleers E, Brysbaert M. Explaining human performance in psycholinguistic tasks with models of semantic similarity based on prediction and counting: A review and empirical validation. Journal of Memory and Language. 2017;92:57–78. [Google Scholar]
  68. Mbwana J, Berl MM, Ritzl EK, Rosenberger L, Mayo J, Weinstein S, Gaillard WD. Limitations to plasticity of language network reorganization in localization related epilepsy. Brain. 2009;132:347–356. doi: 10.1093/brain/awn329. [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Meinzer M, Beeson PM, Cappa S, Crinion J, Kiran S, Saur D, Neuroimaging in Aphasia Treatment Research Workshop Neuroimaging in aphasia treatment research: consensus and practical guidelines for data analysis. NeuroImage. 2013;73:215–224. doi: 10.1016/j.neuroimage.2012.02.058. [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Meltzer JA, Postman-Caucheteux WA, McArdle JJ, Braun AR. Strategies for longitudinal neuroimaging studies of overt language production. NeuroImage. 2009;47:745–755. doi: 10.1016/j.neuroimage.2009.04.089. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Nadeau SE. Neuroplastic mechanisms of language recovery after stroke. In: Tracy JI, Hampstead BM, Sathian K, editors. Cognitive plasticity in neurologic disorders. Oxford: Oxford University Press; 2014. pp. 61–84. [Google Scholar]
  72. Naeser MA, Helm-Estabrooks N, Haas G, Auerbach S, Srinivasan M. Relationship between lesion extent in ‘Wernicke’s area’ on computed tomographic scan and predicting recovery of comprehension in Wernicke’s aphasia. Archives of Neurology. 1987;44:73–82. doi: 10.1001/archneur.1987.00520130057018. [DOI] [PubMed] [Google Scholar]
  73. Nelson D, McEvoy C, Schreiber T. The University of South Florida word association, rhyme, and word fragment norms. 1998 doi: 10.3758/bf03195588. http://w3.usf.edu/FreeAssociation. [DOI] [PubMed]
74. Patterson K, Nestor PJ, Rogers TT. Where do you know what you know? The representation of semantic knowledge in the human brain. Nature Reviews Neuroscience. 2007;8:976–987. doi: 10.1038/nrn2277.
75. Pelli DG. The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision. 1997;10:437–442.
76. Pillay SB, Stengel BC, Humphries C, Book DS, Binder JR. Cerebral localization of impaired phonological retrieval during rhyme judgment. Annals of Neurology. 2014;76:738–746. doi: 10.1002/ana.24266.
77. Postman-Caucheteux WA, Birn RM, Pursley RH, Butman JA, Solomon JM, Picchioni D, Braun AR. Single-trial fMRI shows contralesional activity linked to overt naming errors in chronic aphasic patients. Journal of Cognitive Neuroscience. 2010;22:1299–1318. doi: 10.1162/jocn.2009.21261.
78. Price CJ, Crinion J. The latest on functional imaging studies of aphasic stroke. Current Opinion in Neurology. 2005;18:429–434. doi: 10.1097/01.wco.0000168081.76859.c1.
79. Price CJ, Crinion J, Friston KJ. Design and analysis of fMRI studies with neurologically impaired patients. Journal of Magnetic Resonance Imaging. 2006;23:816–826. doi: 10.1002/jmri.20580.
80. Rau S, Fesl G, Bruhns P, Havel P, Braun B, Tonn JC, Ilmberger J. Reproducibility of activations in Broca area with two language tasks: A functional MR imaging study. American Journal of Neuroradiology. 2007;28:1346–1353. doi: 10.3174/ajnr.A0581.
81. Reppen R, Ide N, Suderman K. American National Corpus (ANC) second release. Philadelphia: Linguistic Data Consortium; 2005.
82. Rice GE, Hoffman P, Lambon Ralph MA. Graded specialization within and between the anterior temporal lobes. Annals of the New York Academy of Sciences. 2015;1359:84–97. doi: 10.1111/nyas.12951.
83. Robson H, Zahn R, Keidel JL, Binney RJ, Sage K, Lambon Ralph MA. The anterior temporal lobes support residual comprehension in Wernicke’s aphasia. Brain. 2014;137:931–943. doi: 10.1093/brain/awt373.
84. Rombouts SA, Barkhof F, Hoogenraad FG, Sprenger M, Valk J, Scheltens P. Test-retest analysis with functional MR of the activated area in the human visual cortex. American Journal of Neuroradiology. 1997;18:1317–1322.
85. Rossion B, Pourtois G. Revisiting Snodgrass and Vanderwart’s object pictorial set: The role of surface detail in basic-level object recognition. Perception. 2004;33:217–236. doi: 10.1068/p5117.
86. Rutten GJM, Ramsey NF, van Rijen PC, van Veelen CWM. Reproducibility of fMRI-determined language lateralization in individual subjects. Brain and Language. 2002;80:421–437. doi: 10.1006/brln.2001.2600.
87. Sanai N, Mirzadeh Z, Berger MS. Functional outcome after language mapping for glioma resection. New England Journal of Medicine. 2008;358:18–27. doi: 10.1056/NEJMoa067819.
88. Saur D, Hartwigsen G. Neurobiology of language recovery after stroke: lessons from neuroimaging studies. Archives of Physical Medicine and Rehabilitation. 2012;93:S15–S25. doi: 10.1016/j.apmr.2011.03.036.
89. Saur D, Lange R, Baumgaertner A, Schraknepper V, Willmes K, Rijntjes M, Weiller C. Dynamics of language reorganization after stroke. Brain. 2006;129:1371–1384. doi: 10.1093/brain/awl090.
90. Seghier ML, Kherif F, Josse G, Price CJ. Regional and hemispheric determinants of language laterality: Implications for preoperative fMRI. Human Brain Mapping. 2011;32:1602–1614. doi: 10.1002/hbm.21130.
91. Seghier ML, Lazeyras F, Pegna AJ, Annoni JM, Zimine I, Mayer E, Khateb A. Variability of fMRI activation during a phonological and semantic language task in healthy subjects. Human Brain Mapping. 2004;23:140–155. doi: 10.1002/hbm.20053.
92. Selnes OA, Knopman DS, Niccum N, Rubens AB, Larson D. Computed tomographic scan correlates of auditory comprehension deficits in aphasia: A prospective recovery study. Annals of Neurology. 1983;13:558–566. doi: 10.1002/ana.410130515.
93. Selnes OA, Niccum N, Knopman DS, Rubens AB. Recovery of single word comprehension: CT-scan correlates. Brain and Language. 1984;21:72–84. doi: 10.1016/0093-934x(84)90037-3.
94. Shah PP, Szaflarski JP, Allendorfer J, Hamilton RH. Induction of neuroplasticity and recovery in post-stroke aphasia by non-invasive brain stimulation. Frontiers in Human Neuroscience. 2013;7:888. doi: 10.3389/fnhum.2013.00888.
95. Sharp DJ, Scott SK, Wise RJS. Retrieving meaning after temporal lobe infarction: The role of the basal language area. Annals of Neurology. 2004;56:836–846. doi: 10.1002/ana.20294.
96. Snodgrass JG, Vanderwart M. A standardized set of 260 pictures: Norms for name agreement, image agreement, familiarity, and visual complexity. Journal of Experimental Psychology: Human Learning and Memory. 1980;6:174–215. doi: 10.1037//0278-7393.6.2.174.
97. Springer JA, Binder JR, Hammeke TA, Swanson SJ, Frost JA, Bellgowan PS, Mueller WM. Language dominance in neurologically normal and epilepsy subjects: A functional MRI study. Brain. 1999;122:2033–2046. doi: 10.1093/brain/122.11.2033.
98. Swinburn K, Porter G, Howard D. Comprehensive Aphasia Test. Hove: Psychology Press; 2004.
99. Szaflarski JP, Allendorfer JB, Banks C, Vannest J, Holland SK. Recovered vs. not-recovered from post-stroke aphasia: The contributions from the dominant and non-dominant hemispheres. Restorative Neurology and Neuroscience. 2013;31:347–360. doi: 10.3233/RNN-120267.
100. Szaflarski JP, Holland SK, Jacola LM, Lindsell C, Privitera MD, Szaflarski M. Comprehensive presurgical functional MRI language evaluation in adult patients with epilepsy. Epilepsy & Behavior. 2008;12:74–83. doi: 10.1016/j.yebeh.2007.07.015.
101. Thompson CK, den Ouden DB. Neuroimaging and recovery of language in aphasia. Current Neurology and Neuroscience Reports. 2008;8:475–483. doi: 10.1007/s11910-008-0076-0.
102. Turkeltaub PE, Messing S, Norise C, Hamilton RH. Are networks for residual language function and recovery consistent across aphasic patients? Neurology. 2011;76:1726–1734. doi: 10.1212/WNL.0b013e31821a44c1.
103. Tzourio-Mazoyer N, Landeau B, Papathanassiou D, Crivello F, Etard O, Delcroix N, Joliot M. Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. NeuroImage. 2002;15:273–289. doi: 10.1006/nimg.2001.0978.
104. Tzourio-Mazoyer N, Petit L, Razafimandimby A, Crivello F, Zago L, Jobard G, Mazoyer B. Left hemisphere lateralization for language in right-handers is controlled in part by familial sinistrality, manual preference strength, and head size. Journal of Neuroscience. 2010;30:13314–13318. doi: 10.1523/JNEUROSCI.2593-10.2010.
105. van Oers CAM, Vink M, van Zandvoort MJ, van der Worp HB, de Haan EH, Kappelle LJ, Dijkhuizen RM. Contribution of the left and right inferior frontal gyrus in recovery from aphasia: A functional MRI study in stroke patients with preserved hemodynamic responsiveness. NeuroImage. 2010;49:885–893. doi: 10.1016/j.neuroimage.2009.08.057.
106. Vannest JJ, Karunanayaka PR, Altaye M, Schmithorst VJ, Plante EM, Eaton KJ, Holland SK. Comparison of fMRI data from passive listening and active-response story processing tasks in children. Journal of Magnetic Resonance Imaging. 2009;29:971–976. doi: 10.1002/jmri.21694.
107. Voyvodic JT. Reproducibility of single-subject fMRI language mapping with AMPLE normalization. Journal of Magnetic Resonance Imaging. 2012;36:569–580. doi: 10.1002/jmri.23686.
108. Warren JE, Crinion JT, Lambon Ralph MA, Wise RJS. Anterior temporal lobe connectivity correlates with functional outcome after aphasic stroke. Brain. 2009;132:3428–3442. doi: 10.1093/brain/awp270.
109. Weiller C, Isensee C, Rijntjes M, Huber W, Müller S, Bier D, Diener HC. Recovery from Wernicke’s aphasia: A positron emission tomographic study. Annals of Neurology. 1995;37:723–732. doi: 10.1002/ana.410370605.
110. Wild CJ, Davis MH, Johnsrude IS. Human auditory cortex is sensitive to the perceived clarity of speech. NeuroImage. 2012;60:1490–1502. doi: 10.1016/j.neuroimage.2012.01.035.
111. Wilke M, Lidzba K. LI-tool: A new toolbox to assess lateralization in functional MR-data. Journal of Neuroscience Methods. 2007;163:128–136. doi: 10.1016/j.jneumeth.2007.01.026.
112. Wilson SM, Bautista A, Yen M, Lauderdale S, Eriksson DK. Validity and reliability of four language mapping paradigms. NeuroImage: Clinical. 2017;16:399–408. doi: 10.1016/j.nicl.2016.03.015.
113. Wilson SM, Eriksson DK, Schneck SM, Lucanie JM. A quick aphasia battery for efficient, reliable, and multidimensional assessment of language function. PLoS ONE. 2018;13:e0192773. doi: 10.1371/journal.pone.0192773.
114. Woermann FG, Jokeit H, Luerding R, Freitag H, Schulz R, Guertler S, Ebner A. Language lateralization by Wada test and fMRI in 100 patients with epilepsy. Neurology. 2003;61:699–701. doi: 10.1212/01.wnl.0000078815.03224.57.
115. Worsley KJ, Liao CH, Aston J, Petre V, Duncan GH, Morales F, Evans AC. A general statistical analysis for fMRI data. NeuroImage. 2002;15:1–15. doi: 10.1006/nimg.2001.0933.
116. Worsley KJ, Marrett S, Neelin P, Vandal AC, Friston KJ, Evans AC. A unified statistical approach for determining significant signals in images of cerebral activation. Human Brain Mapping. 1996;4:58–73. doi: 10.1002/(SICI)1097-0193(1996)4:1<58::AID-HBM4>3.0.CO;2-O.
117. Yagata SA, Yen M, McCarron A, Bautista A, Lamair-Orosco G, Wilson SM. Rapid recovery from aphasia after infarction of Wernicke’s area. Aphasiology. 2017;31:951–980. doi: 10.1080/02687038.2016.1225276.
118. You X, Adjouadi M, Guillen MR, Ayala M, Barreto A, Rishe N, Gaillard WD. Sub-patterns of language network reorganization in pediatric localization related epilepsy: A multisite study. Human Brain Mapping. 2011;32:784–799. doi: 10.1002/hbm.21066.
119. Zahn R, Drews E, Specht K, Kemeny S, Reith W, Willmes K, Huber W. Recovery of semantic word processing in global aphasia: A functional MRI study. Cognitive Brain Research. 2004;18:322–336. doi: 10.1016/j.cogbrainres.2003.10.021.
120. Zijdenbos AP, Dawant BM, Margolin RA, Palmer AC. Morphometric analysis of white matter lesions in MR images: method and validation. IEEE Transactions on Medical Imaging. 1994;13:716–724. doi: 10.1109/42.363096.

Supplementary Materials

Supp info