Abstract
Purpose:
The purpose of this project was to determine the feasibility of employing a functional magnetic resonance imaging (fMRI) task that captured activation associated with overt, unscripted (or free) discourse of people with aphasia (PWA), using a continuous scan paradigm.
Method:
Seven participants (six females, ages 48–70 years) with chronic poststroke aphasia underwent two fMRI scanning sessions that included a discourse fMRI paradigm consisting of five 1-min picture description tasks, using personally relevant photographs, interspersed with two 30-s control periods during which participants looked at a fixation cross. Audio was recorded during the continuous fMRI scan, marked with speaking times, and coded for correct information units. Activation maps from the fMRI data were generated for the contrast between speaking and control conditions. To show the effects of the multi-echo data analysis, we compared it to a single-echo analysis using only the middle echo (echo time of 30 ms).
Results:
Through the implementation of the free discourse fMRI task, we were able to elicit activation that included bilateral regions in the planum polare, central opercular cortex, precentral gyrus, superior temporal gyrus, middle temporal gyrus, and Crus I of the cerebellum, as well as bilateral occipital regions.
Conclusions:
We describe a new tool for assessing discourse recovery in PWA. By demonstrating the feasibility of a natural language paradigm in patients with chronic, poststroke aphasia, we open a new area for future research.
Aphasia rehabilitation often targets the improvement of discourse (Dipper et al., 2021). However, classically, most functional magnetic resonance imaging (fMRI) paradigms focus on single-word production (den Ouden et al., 2009; Dietz et al., 2018; Heath et al., 2015). This is done, in large part, to counter motion artifacts induced by overt speech during the scan. Given the nature of real-world communication in humans—in which we speak at the discourse level rather than in single words—these paradigms and their activation do not reflect the true neurobiology of overt, unscripted (or free) discourse. Moreover, traditional single-word language tasks require prescan teaching to ensure that people with aphasia (PWA) understand the task at hand—and may even exclude some from participation due to the language load of the task.
We predict that an in-scanner free discourse task would activate brain regions associated with language production, like those targeted during treatment. This could lead to better predictions of treatment response and generate novel information about the neurobiology of discourse beyond the word level. However, there are multiple challenges to successfully conducting an unscripted discourse task during an fMRI scan. First, the strong magnetic field limits the type of audio recording device. Second, the scanner noise (> 90 dB) obscures a participant's speech, making it difficult to monitor compliance and to collect spoken discourse samples. Finally, extended free discourse creates the concern of task-correlated head motion, disrupting the ability to accurately measure the fMRI BOLD signal. Previous attempts to overcome these challenges have involved sparse or hemodynamics unrelated to scanner hardware (HUSH) sequences (Allendorfer, Kissela, et al., 2012; Allendorfer, Lindsell, et al., 2012) that pause the scanning at set intervals, allowing the participant to speak. The HUSH technique allows for the detection of the spoken language–related activation that occurs approximately 6 s after the speech due to the lag in the hemodynamic response (Rajapakse et al., 1998). In our experience, some PWA struggle to generate even a few words in a HUSH paradigm; as such, it stands to reason that producing unscripted spoken discourse within a brief time window would be unattainable for them. One way to address some of these challenges is with a multi-echo fMRI acquisition. Acquiring multiple echoes allows for fitting a mono-exponential decay function from which one can compute the T2* and S0 maps that separate signal changes due to noise from those due to the hemodynamic response (Kundu et al., 2017). Because the multi-echo acquisition is a continuous scan, participants can complete the task at their own pace.
In summary, an fMRI paradigm that more closely parallels natural communication would be beneficial in terms of increasing the accuracy of modeling the neurobiology of human language and in identifying treatment responders. However, such an approach requires careful management of known challenges, namely scanner noise (to obtain a quality spoken discourse sample) and task-correlated motion. Therefore, the purpose of this project was to determine the feasibility of employing an fMRI task that captured activation associated with overt, unscripted (or free) discourse of PWA, using a continuous scan paradigm.
Method
This study was approved by the institutional review board at the University of Cincinnati; written informed consent was obtained from all participants.
Participants
As a part of a larger treatment study, the two initial scanning sessions of seven people with chronic poststroke aphasia (> 6 months; six females, ages 48.9–70.5 years; M = 59.4 years, SD = 8.6 years) were analyzed for this research note. Inclusion criteria required participants to have a left middle cerebral artery ischemic infarct, no history of major psychotic episodes or substance abuse, and at least a high school education; to not be currently enrolled in speech-language therapy; and to be native speakers of U.S. English. All participants passed an audiometric hearing screening at 50 dB HL in at least one ear at 1000, 2000, and 4000 Hz; reported normal or corrected vision; and passed a visual field/attention screening task. Moreover, all participants demonstrated the ability to intelligibly complete a picture description task and presented with no more than a mild to mild–moderate apraxia or dysarthria (Bunton et al., 2007; Haley et al., 2012). Table 1 summarizes the demographic data for the seven participants, and Table 2 contains information about the participants' lesion locations and volumes.
Table 1.
Patient demographic data.
| Participant | Age | Gender | Education level | Ethnicity | Months postonset | Handedness | Motor speech status | Aphasia type | Baseline aphasia quotient |
|---|---|---|---|---|---|---|---|---|---|
| 001 | 48.9 | Female | Undergraduate degree | White; non-Hispanic | 56.5 | Right | Mild–moderate apraxia | Anomic | 88.9 |
| 002 | 64.8 | Female | Undergraduate degree | White; non-Hispanic | 240.9 | Right | Mild apraxia | Conduction | 69.1 |
| 003 | 63.8 | Female | High school | African American; non-Hispanic | 175.2 | Right | Mild apraxia | Anomic | 85 |
| 004 | 70.5 | Female | Undergraduate degree | White; non-Hispanic | 76.9 | Right | Mild dysarthria | Anomic | 84.8 |
| 005 | 50.4 | Female | Some college | White; non-Hispanic | 25.9 | Right | Moderate apraxia | Broca's | 68.4 |
| 006 | 52.4 | Female | Undergraduate degree | White; non-Hispanic | 6.8 | Left | No impairment | Transcortical motor | 60.2 |
| 007 | 65.3 | Male | Master's degree | White; non-Hispanic | 140.7 | Right | No impairment | Broca's | 54.6 |
Note. Motor speech screening was based on Bunton et al. (2007) and Haley et al. (2012).
Table 2.
Participant lesion volume and location details.
| Participant | Lesion volume (cm³) | Lesion location |
|---|---|---|
| 001 | 49.39 | Left temporoparietal junction including ventral angular and supramarginal gyri, primary auditory cortex, and posterior insula, extending posteriorly into lateral occipital cortex and precuneus |
| 002 | 80.61 | Left superior and middle temporal gyri extending from temporal pole to temporoparietal junction, adjacent areas of inferior frontal and pre/postcentral gyri, insula, and planum temporale extending into adjacent white matter |
| 003a | 77.17 | Left temporoparietal junction including supramarginal gyrus, insula, and posterior superior temporal gyrus, extending anteriorly into postcentral gyrus; right temporoparietal junction, primarily supramarginal gyrus |
| 004 | 47.33 | Left inferior frontal gyrus, precentral gyrus with some extension into postcentral gyrus, anterior superior temporal gyrus |
| 005 | 106.53 | Left pre- and postcentral gyrus, posterior inferior and middle frontal gyri, central insula, supramarginal gyrus, superior parietal lobule, extending posteriorly into lateral occipital cortex |
| 006 | 46.15 | Left central operculum, frontal operculum, insula, primary auditory cortex, planum temporale, anterior/central superior and middle temporal gyri, supramarginal gyrus |
| 007 | 88.28 | Left temporal pole and anterior superior and middle temporal gyri, inferior, middle and superior frontal gyri, pre and post central gyri, insula and primary auditory cortex, adjacent white matter |
During the initial scan, we observed a second right-sided infarct for Participant 003; however, the participant was clearly aphasic during her behavioral testing, likely due to her left-sided infarct (see Table 1). As such, we included her in the study. All other participants had only one left-hemispheric infarct.
Imaging Parameters
MRI was acquired on a Philips 3.0T Ingenia MRI scanner (Philips Healthcare) using a 32-channel head coil in the Imaging Research Center at Cincinnati Children's Hospital. High-resolution, whole-brain, anatomical images were obtained with a 3D T1-weighted sequence using a 1-mm isotropic resolution and acquired with compressed sensing; FOV: 240 mm × 160 mm × 240 mm, TE = 2.938 ms, TR = 7.9 s, flip angle of 8°, and acquisition duration of 167 s. fMRI images were obtained using a multi-echo acquisition with TEs of 10, 30, and 50 ms, TR = 1,029 ms, 3-mm isotropic voxel resolution, 350 volumes, FOV: 240 mm × 126 mm × 240 mm, flip angle of 75°, and a SENSE factor of 3.
Prescan Instructions
Prior to entering the scanner, participants received explicit training on the spoken discourse elicitation task. Specifically, they were shown a novel image and told that, in the scanner, they would see photos they had provided us and should tell us a story about each picture. A trained research assistant demonstrated what this would look like and then asked the participant to do the same. Participants were also told that this would repeat several times. Next, they were shown a fixation cross and told that this meant to stop talking. Finally, they were reminded to be as still as possible. Aphasia-friendly principles were implemented during the instruction process, including a slow rate of speech, simplified language, and use of photographic supports (Rose et al., 2003). See Supplemental Material S1 for detailed instructions and materials.
In-Scanner Spoken Discourse
Recordings of in-scanner spoken discourse during the discourse paradigm were acquired using a Fibersound Fiber-Optic Microphone and amplifier (Micro-Optics Technologies, Inc.). In-scanner audio for the discourse was recorded using Audacity (https://www.audacityteam.org/, Version 1.3.0, 2021). The fiber-optic microphone was placed directly in front of the participant's mouth, positioned just below the head coil, and secured via a hook-and-loop fastener. Microphone placement was important for recording clear spoken discourse during the scan and was aided by a flexible microphone positioning arm created in-house. The flexible arm also helped to mechanically decouple the microphone from the scanner, reducing noise transfer due to scanner vibration. A foam cover was placed over the microphone to reduce popping and breath noises; a separate foam cover was used for each participant for hygienic purposes.
Discourse Paradigm
The discourse paradigm was designed to elicit multiple self-generated, autobiographical narratives (Stark et al., 2022) through the use of highly contextualized, personalized photographic supports, since these are known to elicit more robust language than nonpersonalized pictures or line drawings (Dietz et al., 2014; Griffith et al., 2014). More specifically, the paradigm included two runs, each with five 1-min periods during which the participant described, out loud, a picture displayed on the screen, and two 26-s control periods during which a fixation cross was displayed (see Figure 1). In between each trial, a drawing of either a mouth or an eye was displayed for 2 s to indicate the condition (speak or control, respectively) of the upcoming trial. The two control periods occurred after the first and second pictures. The 10 pictures (five for each run of the paradigm) that the participant described were personal pictures from the participants' lives. Participants were instructed to bring in action-oriented pictures for stories that they would feel comfortable sharing. The two runs were separated by 15–20 min of other scans not presented here, which allowed for adequate rest between discourse paradigm runs.
Figure 1.
Graphical depiction of the discourse task. On the timeline, blue indicates when the instruction pictures appear on the screen, green indicates when the participant's personalized pictures were displayed on the screen for 56 s, and red indicates when the participant was viewing a crosshair for 26 s.
Figure 1 depicts the paradigm sequence (total run time = 350 s). The paradigm was implemented in DirectRT (http://www.empirisoft.com, Version 2018) and viewed using an InroomViewingDevice (NordicNeuroLab; https://nordicneurolab.com). Participants were asked if they could clearly see the screen prior to scanning, and MRI-compatible corrective lenses were used when necessary. Supplemental Material S1 includes our exact instructions to elicit the discourse sample.
Analysis
Spoken discourse processing. Audio for the discourse paradigm was recorded in-scanner during a continuous fMRI acquisition protocol. This resulted in significant scanner-related noise, which was removed using a custom noise removal algorithm implemented in Python (https://www.python.org, Version 2.8). This process permitted two trained research assistants to independently orthographically transcribe the participants' discourse verbatim (including fillers and pause durations of ≥ 2 s) via InqScribe software (Inquirium, LLC, Version 2.2.4). Upon completion, 100% of transcriptions were cross-checked for interrater reliability; all disputes were resolved, and a single transcript was used for coding of correct information units (CIUs; Nicholas & Brookshire, 1993) via the Systematic Analysis of Language Transcripts program (Miller et al., 2015). The timing of the CIUs relative to the start of the fMRI scan was recorded for use in this study. There was no minimum number of CIUs required for the data to be usable. Rather, the talk time (not the duration of the scan time that the picture was displayed) was used as a regressor. Please see Supplemental Material S2 for a checklist of Best Practice Guidelines for Reporting Spoken Discourse in Aphasia and Communication Disorders (Stark et al., 2022).
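The custom noise removal algorithm is not described in detail in this note. The following is a minimal sketch of one common approach (spectral subtraction against a noise profile estimated from a speech-free segment) using numpy and scipy; the file names, the 5-s noise window, and the over-subtraction factor are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import stft, istft

fs, audio = wavfile.read("run1_in_scanner.wav")  # hypothetical recording
audio = audio.astype(np.float64)
if audio.ndim > 1:                               # keep a single channel if stereo
    audio = audio[:, 0]

# Short-time Fourier transform of the full recording.
f, t, Z = stft(audio, fs=fs, nperseg=2048)

# Estimate the scanner-noise magnitude spectrum from the first 5 s,
# assuming the participant is not yet speaking (fixation period).
noise_profile = np.mean(np.abs(Z[:, t < 5.0]), axis=1, keepdims=True)

# Spectral subtraction: remove the noise profile, keep the original phase.
clean_mag = np.maximum(np.abs(Z) - 1.5 * noise_profile, 0.0)
_, clean = istft(clean_mag * np.exp(1j * np.angle(Z)), fs=fs, nperseg=2048)

# Rescale to 16-bit range and write out the denoised audio for transcription.
clean = clean / np.max(np.abs(clean)) * 32767
wavfile.write("run1_denoised.wav", fs, clean.astype(np.int16))
```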
MRI preprocessing. The high-resolution T1-weighted anatomical images were processed using FSL (FMRIB's Software Library, http://www.fmrib.ox.ac.uk/fsl; Jenkinson et al., 2012). They were first reoriented to standard orientation using fslreorient2std and then bias field corrected using fast with bias field smoothing extent of 10-mm and five main-loop iterations (Zhang et al., 2001). Brain extraction was carried out using FSL's bet with a fractional intensity threshold of 0.3 (Smith, 2002). Finally, the images were normalized to the MNI brain template using flirt (Jenkinson et al., 2002; Jenkinson & Smith, 2001).
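As a minimal sketch, the same anatomical steps can be driven from Python via subprocess calls to the FSL command-line tools; the file names below are hypothetical, and the in-house wrapper scripts may differ.

```python
import subprocess

t1 = "sub-001_T1w.nii.gz"  # hypothetical input image

# Reorient to standard (MNI) orientation.
subprocess.run(["fslreorient2std", t1, "t1_reorient.nii.gz"], check=True)

# Bias-field correction with FAST (10-mm smoothing extent, 5 main-loop iterations).
subprocess.run(["fast", "-B", "-l", "10", "-I", "5",
                "-o", "t1_fast", "t1_reorient.nii.gz"], check=True)

# Brain extraction with a fractional intensity threshold of 0.3.
subprocess.run(["bet", "t1_fast_restore.nii.gz", "t1_brain.nii.gz",
                "-f", "0.3"], check=True)

# Affine registration to the MNI152 template with FLIRT.
subprocess.run(["flirt", "-in", "t1_brain.nii.gz",
                "-ref", "MNI152_T1_1mm_brain.nii.gz",
                "-out", "t1_mni.nii.gz", "-omat", "t1_to_mni.mat"], check=True)
```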
Multi-echo analysis. The fMRI multi-echo data were processed using FSL and Python scripts developed in-house. The R2* and S0 maps were computed at each voxel using a log-linear fit to the mono-exponential decay function (Kundu et al., 2017). The T2* map was then calculated from the inverse of the R2* map.
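As a worked sketch of this step, the mono-exponential model S(TE) = S0 · exp(−TE · R2*) becomes linear after a log transform, log S = log S0 − TE · R2*, so R2* and S0 can be estimated voxelwise by ordinary least squares across the three echoes. The array shapes and placeholder data below are illustrative; the released scripts implement the authors' actual version.

```python
import numpy as np

tes = np.array([0.010, 0.030, 0.050])            # echo times (s): 10, 30, 50 ms
# data: echoes x voxels x timepoints; placeholder values stand in for the real series
data = np.random.rand(3, 1000, 350) * 100.0 + 1.0

# Log-linear model: log S = log S0 - TE * R2*
X = np.column_stack([np.ones_like(tes), -tes])   # design matrix, shape (3, 2)
log_s = np.log(data).reshape(3, -1)              # flatten voxels x timepoints

coef, *_ = np.linalg.lstsq(X, log_s, rcond=None) # least squares over all columns at once
log_s0, r2star = coef

s0 = np.exp(log_s0).reshape(data.shape[1:])
r2star = r2star.reshape(data.shape[1:])
t2star = 1.0 / np.maximum(r2star, 1e-6)          # T2* (s); guard against non-decaying fits
```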
Outlier detection, motion correction, brain extraction, and T1 co-registration were run on a data set comprised of only the third echo (TE = 50 ms) due to its higher tissue contrast (Kundu et al., 2017). Outlying volumes were detected using FSL's fsl_motion_outliers with the RMS intensity difference metric and standard box plot cutoff threshold. Motion correction parameters were computed using FSL's mcflirt with all volumes being co-registered to the middle volume. A brain mask for the functional data was constructed with FSL's bet using the setting for four-dimensional (4D) fMRI data (Smith, 2002). The motion correction parameters and brain mask were applied to the computed T2* map using applyxfm4D. The motion-corrected T2* map was spatially and temporally smoothed using FSL's fslmaths with a spatial Gaussian kernel with a σ of 3 mm (mean filtered) and a high-pass temporal filter with a sigma of 60 s.
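A sketch of the corresponding functional preprocessing calls is shown below, again with hypothetical file names; in particular, the handling of the mcflirt matrix directory by applyxfm4D and the conversion of the 60-s high-pass sigma into volumes may need adjustment for a given FSL version.

```python
import subprocess

tr = 1.029
echo3 = "func_echo3.nii.gz"  # hypothetical third-echo (TE = 50 ms) series

# Detect outlying volumes using the RMS intensity-difference metric.
subprocess.run(["fsl_motion_outliers", "-i", echo3,
                "-o", "outlier_confounds.txt", "--refrms"], check=True)

# Motion correction to the middle volume, saving per-volume transforms.
subprocess.run(["mcflirt", "-in", echo3, "-out", "echo3_mc",
                "-mats", "-plots"], check=True)

# Brain mask from the 4D functional data (bet's -F option).
subprocess.run(["bet", echo3, "echo3_brain", "-F"], check=True)

# Apply the motion-correction transforms to the computed T2* map.
subprocess.run(["applyxfm4D", "t2star.nii.gz", echo3, "t2star_mc.nii.gz",
                "echo3_mc.mat", "-fourdigit"], check=True)

# Spatial smoothing (Gaussian kernel, sigma = 3 mm, mean filter) and
# temporal high-pass filtering (sigma of 60 s expressed in volumes).
hp_sigma_vols = f"{60.0 / tr:.2f}"
subprocess.run(["fslmaths", "t2star_mc.nii.gz",
                "-kernel", "gauss", "3", "-fmean",
                "-bptf", hp_sigma_vols, "-1",
                "t2star_preproc.nii.gz"], check=True)
```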
Single-echo analysis. A single-echo analysis was also performed using similar preprocessing steps, as listed above, but with a data set comprised only of the middle echo, TE = 30 ms, which is the same TE as most single-echo fMRI acquisitions. This allows for an assessment of the advantages of using the multi-echo acquisition compared to a typical single-echo acquisition.
Discourse activation. A discourse-related activation map was calculated using a general linear model, as implemented in FSL's fsl_glm, with the T2* map and a discourse design file. The spoken discourse design file consisted of two spoken discourse–related regressors, one modeling the timing of the CIUs and the other modeling the timing of all other spoken discourse; specifically, each word was time coded. Both regressors were convolved with a Gamma function (phase = 0 s, SD = 3 s, and M lag = 6 s) to account for the hemodynamic response function. The six motion parameters (x, y, z rotation and translation) and a separate regressor for each of the detected outlying volumes were also included in the design as regressors of no interest. The examined contrast was a linear combination of the CIU and non-CIU spoken discourse contrasts giving an activation profile of spoken discourse versus silence during the task. An FDR correction was used to correct for multiple comparisons using FSL's fdr, with activation threshold at an FDR-corrected p < .05. Scripts used to conduct this analysis can be found at this website (https://github.com/maloneytc/NAIL_Discourse).
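The released scripts wrap fsl_glm; a rough equivalent of the regressor construction and fit in pure Python might look like the sketch below. The Gamma parameters follow the text (mean lag of 6 s, SD of 3 s), but the word timings are hypothetical, and the motion and outlier nuisance regressors are omitted for brevity.

```python
import numpy as np
from scipy.stats import gamma

tr, n_vols = 1.029, 350
frame_times = np.arange(n_vols) * tr

# Gamma HRF with mean lag 6 s and SD 3 s (shape/scale derived from mean and SD).
mean_lag, sd = 6.0, 3.0
shape, scale = (mean_lag / sd) ** 2, sd ** 2 / mean_lag
hrf = gamma.pdf(np.arange(0.0, 30.0, tr), a=shape, scale=scale)

def boxcar(events):
    """TR-sampled boxcar built from (onset, duration) pairs in seconds."""
    reg = np.zeros(n_vols)
    for onset, dur in events:
        reg[(frame_times >= onset) & (frame_times < onset + dur)] = 1.0
    return reg

# Hypothetical word timings from the transcript coding (onset, duration).
ciu_events = [(62.0, 0.4), (63.1, 0.5)]
other_speech_events = [(64.0, 0.3)]

design = np.column_stack([
    np.convolve(boxcar(ciu_events), hrf)[:n_vols],            # CIU regressor
    np.convolve(boxcar(other_speech_events), hrf)[:n_vols],   # other-speech regressor
    np.ones(n_vols),                                          # intercept
])

# Fit the GLM at a single voxel's T2* time series (placeholder data here).
y = np.random.randn(n_vols)
beta, *_ = np.linalg.lstsq(design, y, rcond=None)
speech_vs_silence = beta[0] + beta[1]   # contrast: all speech vs. silence
```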
Results
All seven participants attended each of the two initial visits. The discourse task data for Participant 006 could not be analyzed for their first visit (two runs) due to technical issues with imaging data export. Participant 005 was not able to produce enough spoken discourse to analyze any of the four runs (two visits with two runs per visit; see Table 3). Finally, Participant 007 did not produce spoken discourse during either discourse task run of their first visit (two runs).
Table 3.
Summary of speech data during the in-scanner discourse task.
| Participant | Visit | Run 1: CIU count | Run 1: total CIU time (s) | Run 1: total other speech time (s) | Run 1: total speech time (s) | Run 2: CIU count | Run 2: total CIU time (s) | Run 2: total other speech time (s) | Run 2: total speech time (s) | Aphasia quotient |
|---|---|---|---|---|---|---|---|---|---|---|
| 001 | 1 | 165 | 139.4 | 121.6 | 260.9 | 209 | 139.9 | 139.1 | 279.0 | 88.9 |
| 001 | 2 | 216 | 130.2 | 133.3 | 263.5 | 227 | 144.6 | 128.2 | 272.8 | 87.6 |
| 002 | 1 | 59 | 34.4 | 228.5 | 263.0 | 76 | 47.5 | 203.7 | 251.2 | 69.1 |
| 002 | 2 | 52 | 32.8 | 225.2 | 258.0 | 89 | 64.2 | 196.1 | 260.3 | 70.7 |
| 003 | 1 | 36 | 40.0 | 123.0 | 163.0 | 149 | 74.6 | 85.0 | 159.6 | 85.0 |
| 003 | 2 | 73 | 41.2 | 36.7 | 77.9 | 153 | 84.2 | 90.7 | 174.9 | 91.1 |
| 004 | 1 | 226 | 104.5 | 137.4 | 241.9 | 258 | 130.6 | 99.6 | 230.2 | 84.8 |
| 004 | 2 | 282 | 134.8 | 106.8 | 241.6 | 250 | 129.3 | 42.5 | 171.7 | 87.5 |
| 005 | 1 | 0 | 0.0 | 0.0 | 0.0 | 0 | 1.0 | 0.0 | 1.0 | 68.4 |
| 005 | 2 | 1 | 2.0 | 0.0 | 2.0 | 7 | 7.2 | 0.0 | 7.2 | 68.2 |
| 006 | 1 | 0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 60.2 |
| 006 | 2 | 12 | 9.0 | 43.8 | 52.8 | 8 | 5.0 | 0.0 | 5.0 | 64.7 |
| 007 | 1 | 0 | 0.0 | 0.0 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | 54.6 |
| 007 | 2 | 18 | 16.0 | 0.0 | 16.0 | 24 | 16.0 | 0.0 | 16.0 | 63.2 |
Note. CIU = correct information unit.
Coding reliability. Interrater reliability checks on 100% (N = 100) of transcripts revealed very good agreement (Cohen, 1968; Fleiss et al., 2004), with a mean Cohen's κ of .92 for CIUs and .87 for speak time. This included 20 transcripts each (five discourse trials per run, two runs per visit, two visits) from 001, 002, 003, and 004, plus 10 transcripts each from Participants 006 and 007 (due to only one visit with spoken discourse).
Speak time. Excluding runs with no speech, speaking time ranged from 1 to 279.03 s with an Mdn (interquartile range [IQR]) of 172 s (16–258; see Table 3).
CIUs. The Mdn (IQR) CIU count for each run was 66 (9–198), with the Mdn (IQR) speaking time for CIUs being 41 s (7.65–123.08; see Table 3).
Motion. The median framewise displacement (FWD) over each run ranged from 0.20 to 0.84 mm, with a median (IQR), over all participants and runs, of 0.31 mm (0.26–0.37). There was a significant correlation between each run's median FWD and the total speaking time during the task (Spearman rs = .61, p = .006). However, this significance was mostly driven by Participant 001, who had both the highest FWD and the most speaking time of all participants. When Participant 001 was removed from the analysis, the correlation was no longer significant (rs = .20, p = .47). See Supplemental Material S3 for additional information on the head motion parameters.
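The note does not give an explicit FWD formula; the sketch below shows a common way to compute it from mcflirt's .par output (three rotations in radians followed by three translations in mm), converting rotations to millimeters of displacement on a 50-mm sphere. The file name and the head-radius assumption are illustrative.

```python
import numpy as np

params = np.loadtxt("echo3_mc.par")        # hypothetical mcflirt output, shape (n_vols, 6)
rotations, translations = params[:, :3], params[:, 3:]

# Convert rotations (radians) to millimeters of displacement on a 50-mm sphere.
rot_mm = rotations * 50.0

# FWD: sum of absolute frame-to-frame changes across all six components.
deltas = np.abs(np.diff(np.column_stack([rot_mm, translations]), axis=0))
fwd = deltas.sum(axis=1)                   # one value per volume transition

print(f"median FWD = {np.median(fwd):.2f} mm")
```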
Spoken discourse–correlated motion. In general, at the participant level, the spoken discourse time course was moderately correlated with the FWD. The Spearman correlation coefficient between the FWD and the spoken discourse time course ranged from −0.13 to 0.43, with a Mdn (IQR) of 0.13 (0.09–0.18; see Table 3). Of the six motion subcomponents (x, y, z translation and rotation), the largest absolute median motion–spoken discourse correlation was about the x-axis for both rotation (Mdn = 0.28, IQR = −0.10 to 0.40) and translation (Mdn = 0.06, IQR = −0.14 to 0.32).
Multi-echo activation. The general pattern of activation included bilateral regions in the planum polare, central opercular cortex, precentral gyrus, superior temporal gyrus, middle temporal gyrus, and Crus I of the cerebellum, as well as bilateral occipital regions. One participant, 007, showed minimal areas of activation, likely due to their minimal discourse production. The location of the participants' lesions affected the overall pattern of activation and either eliminated the activation in the region of the lesion or relegated the activation to perilesional areas of healthy tissue. The general pattern of activation in each participant was consistent across runs and visits. Figure 2 shows activation comparing spoken discourse versus silence using the multi-echo approach. Task-positive areas of activation at a relative threshold are shown in Supplemental Material S4; for each participant, task-positive voxels within the brain mask were ranked, and the top 10% of activated voxels are shown (Wilson et al., 2018). Using the relative thresholds, the activation patterns are similar to the absolute threshold activation but with a few additional areas of activation within the canonical language regions in Participants 003 and 004.
Figure 2.
Positive activation averaged over all initial scans for each participant for the contrast speech > silence using a multi-echo analysis. Activation was corrected for multiple comparisons using an FDR correction and is displayed with FDR-corrected p < .05 and a z threshold of z > 2.3. All images are displayed in subject space using the participant's T1 image as the underlay.
Single-echo versus multi-echo comparison. Many of the activation maps using a single-echo approach were uninterpretable due to large regions of activation that were most likely associated with motion. These activation patterns presented on the z statistic maps as alternating strips of positive and negative regions that did not correspond to anatomical boundaries. Figure 3 shows discourse activation analyzed with the middle echo for each participant. These are the same data sets displayed in Figure 2 but displayed at a lower uncorrected threshold (z > 1.96) in order to show regions of activation. As such, the following results are based on the multi-echo approach.
Figure 3.
Positive activation averaged over all initial scans for each participant for the contrast speech > silence using a single-echo (TE = 30 ms) analysis. Activation is uncorrected for multiple comparisons and is displayed at a z threshold of z > 1.96. All images are displayed in subject space using the participant's T1 image as the underlay.
Motion-correlated activation. In order to examine the effect of motion on activation, we took the average of the absolute z statistic from regions of significant activation (multiple comparison correction, p < .05 and z > 2.3) in each run from the normalized activation maps using the multi-echo analysis. This was performed using the nibabel (for loading image data), scipy (for statistical analysis), and numpy (for working with data arrays) packages in Python. The Spearman correlation between the activation and the median FWD trended toward significance, rs = .44, p = .06, when looking across all participants and visits. However, when looking only at the average positive activation, the correlation with median FWD became significant (rs = .62, p = .005). This seems to be driven mostly by Participant 001; when the data from Participant 001 were removed from this analysis, the correlation between average positive activation and median FWD decreased and was marginally significant (rs = .49, p = .06). Within each participant, there was not a consistent association between average activation and median FWD (see Supplemental Material S5).
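A compact sketch of this summary step, assuming per-run z-statistic maps and median FWD values are already available, could look like the following; file names and FWD values are placeholders.

```python
import nibabel as nib
import numpy as np
from scipy.stats import spearmanr

runs = ["run1_zstat_mni.nii.gz", "run2_zstat_mni.nii.gz"]  # hypothetical maps
median_fwd = [0.28, 0.35]                                   # placeholder FWD per run (mm)

mean_abs_z = []
for path in runs:
    z = nib.load(path).get_fdata()
    sig = np.abs(z) > 2.3                  # voxels surviving the z > 2.3 threshold
    mean_abs_z.append(np.abs(z[sig]).mean())

# Correlate per-run activation magnitude with per-run median head motion.
rho, p = spearmanr(mean_abs_z, median_fwd)
print(f"Spearman rs = {rho:.2f}, p = {p:.3f}")
```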
Discussion
The purpose of this project was to determine the feasibility of employing an fMRI task that captured brain activation associated with overt, unscripted (or free) discourse of PWA, using a continuous scan paradigm. Prior to moving into the discussion, it is important to operationalize feasibility. As outlined by Bowen et al. (2009), our primary question in this feasibility trial was, "Can it work?" (p. 454). This question requires the feasibility study to yield some evidence that the approach might work. Our preliminary findings and lessons learned are summarized in the following sections.
Feasibility
It is feasible to employ an fMRI task that captures activation associated with overt, free discourse produced by PWA—in other words, it "can work" (Bowen et al., 2009). Despite language impairments from stroke, six out of the seven participants produced spoken discourse during at least one run of the free discourse task. Furthermore, during the task instruction phase, participants demonstrated little to no difficulty understanding that they were to talk about their photograph and pause when looking at the crosshair. Moreover, the multi-echo fMRI comes at minimal cost to the scanning sequence and enables removal of signal not related to changes in BOLD. In turn, this increased activation in language-related regions of the brain compared to a single-echo analysis. Finally, the Mdn FWD remained below 0.5 mm for all but one participant, suggesting head motion can be controlled with adequate instruction.
Activation
Those with greater language impairment may require more mental effort despite producing less output, making the link between activation and total talk time difficult to generalize. Participants 001 and 006, for example, had similar amounts of activation despite Participant 001 having longer talk time over four runs (an average talk time of 175.23 s per run) compared to Participant 006's two runs (average talk time of 58.75 s per run).
For most participants, the task activation was as expected for spoken discourse generation. For some participants, activation was present in perilesional regions indicating that some healthy tissue may remain in the left-hemisphere language areas (Saur et al., 2006, 2010; Szaflarski et al., 2013). For some participants, it appears that the right hemisphere homologues may be more active, indicating a potential rightward shift during language recovery (Crosson et al., 2009; Zumbansen et al., 2014). However, some studies suggest canonical language areas are more important for poststroke language recovery than contralateral homologues, which may be related to the age when stroke occurred (Griffis et al., 2017). Activation in the visual regions of the occipital lobe is also seen in multiple participants owing to the visual nature of the task (Dietz et al., 2018; Harel et al., 2013a, 2013b; Peissig & Tarr, 2007; Vogel et al., 2014).
Limitations and Future Directions
The COVID-19 pandemic disrupted this project, leaving a small sample size for analysis (target enrollment was 20). Due to the heterogeneity of the participants' lesion locations and aphasia types, a group analysis of the activation maps did not provide any meaningful data. A larger cohort would permit group analyses and may permit recommendations such as a minimum number of spoken words per run needed to elicit activation. Even so, this paradigm will not be feasible for at least some PWA and will require stringent screening and careful prescanner instruction to identify successful candidates. Moreover, since this task employed a continuous scan paradigm, scanner noise made it difficult to determine, in real time, whether the participants were speaking throughout the scan; spoken language content could only be examined after the audio recordings were processed. This made task interpretation challenging during the analysis period because it was not possible to discern whether participants were disengaged or experiencing anomia during the silent periods. Monitoring task compliance could be aided by a real-time scanner noise cancellation system (if available), which would permit researchers to stop, reinstruct participants, and then restart the task to avoid losing data—as occurred in this study with Participants 005 and 007, who clearly demonstrated the ability to perform the task outside the scanner. Finally, since this was a feasibility study, we did not impose a cutoff for the number of words spoken, and as clarified above, each word was time coded so that time points associated with no spoken language (e.g., silence) were not included in the discourse regressors. However, future work should endeavor to identify what a minimum sample of in-scanner language production should include to ensure a robust discourse sample. Future work must also include two key steps to validate these findings. First, data on in-scanner language behavior, activation, and motion for healthy volunteers (age and education matched) are necessary. Second, interscan reliability among the healthy volunteers is essential to validate this paradigm as a tool that we can reliably add to our growing repertoire of fMRI pre- or posttreatment assessments.
Conclusions
There has been reluctance to employ discourse-level speaking tasks due to the possibility of correlation between head motion and the discourse time course (Amemiya et al., 2019). One of the key enablers of analyzing free discourse is the collection of multi-echo fMRI. Collecting the additional echoes comes at a minimal cost during the scanning as it only requires additional images to be acquired at times that do not interfere with the single-echo image acquisition. By collecting the multiple echoes, we can estimate the T2*, thereby removing much of the signal not related to changes in BOLD signal.
The results presented show that, despite the many challenges, it is possible to collect fMRI with PWA performing an unscripted overt discourse task. This has required a novel combination of technologies including a high-quality fiberoptic in-scanner microphone, noise reduction algorithms to reduce the scanner noise and allow for transcription, and multi-echo functional imaging collection to reduce the motion artifacts and scanner-related noise. The ability to analyze brain activation during free discourse gives us a novel tool to assess the changes in brain activation associated with language-related treatments.
Data Availability Statement
The data that support the findings of this study are available upon reasonable request from the corresponding author, T.M. The data are not publicly available due to their sensitive nature and because they contain information that could compromise the privacy of research participants.
Acknowledgments
This work was supported by the National Institute on Deafness and Other Communication Disorders Grant R15DC017280, “A Preliminary Study of the Neurobiology of AAC-Induced Language Recovery in Post-Stroke Aphasia,” awarded to Aimee Dietz. This work reflects the opinion of the authors and not necessarily that of the National Institutes of Health.
The authors are ever so grateful to the participants for their eagerness to participate in this study—especially through the early stages of the COVID-19 pandemic. The authors thank Delaney Turner for help with the conduct of the study.
References
- Allendorfer, J. B., Kissela, B. M., Holland, S. K., & Szaflarski, J. P. (2012). Different patterns of language activation in post-stroke aphasia are detected by overt and covert versions of the verb generation fMRI task. Medical Science Monitor, 18(3), CR135–CR137. 10.12659/msm.882518
- Allendorfer, J. B., Lindsell, C. J., Siegel, M., Banks, C. L., Vannest, J., Holland, S. K., & Szaflarski, J. P. (2012). Females and males are highly similar in language performance and cortical activation patterns during verb generation. Cortex, 48(9), 1218–1233. 10.1016/j.cortex.2011.05.014
- Amemiya, S., Yamashita, H., Takao, H., & Abe, O. (2019). Integrated multi-echo denoising strategy improves identification of inherent language laterality. Magnetic Resonance in Medicine, 81(5), 3262–3271. 10.1002/mrm.27620
- Audacity. (2021). [Homepage]. https://audacityteam.org/
- Bowen, D. J., Kreuter, M., Spring, B., Cofta-Woerpel, L., Linnan, L., Weiner, D., Bakken, S., Kaplan, C. P., Squiers, L., Fabrizio, C., & Fernandez, M. (2009). How we design feasibility studies. American Journal of Preventive Medicine, 36(5), 452–457. 10.1016/j.amepre.2009.02.002
- Bunton, K., Kent, R. D., Duffy, J. R., Rosenbek, J. C., & Kent, J. F. (2007). Listener agreement for auditory-perceptual ratings of dysarthria. Journal of Speech, Language, and Hearing Research, 50(6), 1481–1495. 10.1044/1092-4388(2007/102)
- Cohen, J. (1968). Weighted kappa: Nominal scale agreement provision for scaled disagreement or partial credit. Psychological Bulletin, 70(4), 213–220. 10.1037/h0026256
- Crosson, B., Moore, A. B., McGregor, K. M., Chang, Y. L., Benjamin, M., Gopinath, K., Sherod, M. E., Wierenga, C. E., Peck, K. K., Briggs, R. W., Rothi, L. J., & White, K. D. (2009). Regional changes in word-production laterality after a naming treatment designed to produce a rightward shift in frontal activity. Brain and Language, 111(2), 73–85. 10.1016/j.bandl.2009.08.001
- den Ouden, D. B., Fix, S., Parrish, T. B., & Thompson, C. K. (2009). Argument structure effects in action verb naming in static and dynamic conditions. Journal of Neurolinguistics, 22(2), 196–215. 10.1016/j.jneuroling.2008.10.004
- Dietz, A., Vannest, J., Maloney, T., Altaye, M., Holland, S., & Szaflarski, J. P. (2018). The feasibility of improving discourse in people with aphasia through AAC: Clinical and functional MRI correlates. Aphasiology, 32(6), 693–719. 10.1080/02687038.2018.1447641
- Dietz, A., Weissling, K., Griffith, J., McKelvey, M., & Macke, D. (2014). The impact of interface design during an initial high-technology AAC experience: A collective case study of people with aphasia. Augmentative and Alternative Communication, 30(4), 314–328. 10.3109/07434618.2014.966207
- Dipper, L., Marshall, J., Boyle, M., Botting, N., Hersh, D., Pritchard, M., & Cruice, M. (2021). Treatment for improving discourse in aphasia: A systematic review and synthesis of the evidence base. Aphasiology, 35(9), 1125–1167. 10.1080/02687038.2020.1765305
- Fleiss, J. L., Levin, B., & Paik, M. C. (2004). Statistical methods for rates and proportions (3rd ed.). Wiley.
- Griffis, J. C., Nenert, R., Allendorfer, J. B., Vannest, J., Holland, S., Dietz, A., & Szaflarski, J. P. (2017). The canonical semantic network supports residual language function in chronic post-stroke aphasia. Human Brain Mapping, 38(3), 1636–1658. 10.1002/hbm.23476
- Griffith, J., Dietz, A., & Weissling, K. (2014). Supporting narrative retells for people with aphasia using augmentative and alternative communication: Photographs or line drawings? Text or no text? American Journal of Speech-Language Pathology, 23(2), S213–S224. 10.1044/2014_AJSLP-13-0089
- Haley, K. L., Jacks, A., de Riesthal, M., Abou-Khalil, R., & Roth, H. L. (2012). Toward a quantitative basis for assessment and diagnosis of apraxia of speech. Journal of Speech, Language, and Hearing Research, 55(5), S1502–S1517. 10.1044/1092-4388(2012/11-0318)
- Harel, A., Kravitz, D., & Baker, C. I. (2013a). Beyond perceptual expertise: Revisiting the neural substrates of expert object recognition. Frontiers in Human Neuroscience, 7, Article 885. 10.3389/fnhum.2013.00885
- Harel, A., Kravitz, D. J., & Baker, C. I. (2013b). Deconstructing visual scenes in cortex: Gradients of object and spatial layout information. Cerebral Cortex, 23(4), 947–957. 10.1093/cercor/bhs091
- Heath, S., McMahon, K. L., Nickels, L. A., Angwin, A., MacDonald, A. D., van Hees, S., McKinnon, E., Johnson, K., & Copland, D. A. (2015). An fMRI investigation of the effects of attempted naming on word retrieval in aphasia. Frontiers in Human Neuroscience, 9, Article 291. 10.3389/fnhum.2015.00291
- Jenkinson, M., Bannister, P., Brady, M., & Smith, S. (2002). Improved optimization for the robust and accurate linear registration and motion correction of brain images. NeuroImage, 17(2), 825–841. 10.1016/s1053-8119(02)91132-8
- Jenkinson, M., Beckmann, C. F., Behrens, T. E., Woolrich, M. W., & Smith, S. M. (2012). FSL. NeuroImage, 62(2), 782–790. 10.1016/j.neuroimage.2011.09.015
- Jenkinson, M., & Smith, S. (2001). A global optimisation method for robust affine registration of brain images. Medical Image Analysis, 5(2), 143–156. 10.1016/s1361-8415(01)00036-6
- Kundu, P., Voon, V., Balchandani, P., Lombardo, M. V., Poser, B. A., & Bandettini, P. A. (2017). Multi-echo fMRI: A review of applications in fMRI denoising and analysis of BOLD signals. NeuroImage, 154, 59–80. 10.1016/j.neuroimage.2017.03.033
- Miller, J. F., Andriacchi, K., & Nockerts, A. (2015). Assessing language production using SALT software (2nd ed.). SALT Software LLC.
- Nicholas, L. E., & Brookshire, R. H. (1993). A system for quantifying the informativeness and efficiency of the connected speech of adults with aphasia. Journal of Speech and Hearing Research, 36(2), 338–350. 10.1044/jshr.3602.338
- Peissig, J. J., & Tarr, M. J. (2007). Visual object recognition: Do we know more now than we did 20 years ago? Annual Review of Psychology, 58(1), 75–96. 10.1146/annurev.psych.58.102904.190114
- Rajapakse, J. C., Kruggel, F., Maisog, J. M., & von Cramon, D. Y. (1998). Modeling hemodynamic response for analysis of functional MRI time-series. Human Brain Mapping, 6(4), 283–300.
- Rose, T., Worrall, L., & McKenna, K. (2003). The effectiveness of aphasia-friendly principles for printed health education materials for people with aphasia following stroke. Aphasiology, 17(10), 947–963. 10.1080/02687030344000319
- Saur, D., Lange, R., Baumgaertner, A., Schraknepper, V., Willmes, K., Rijntjes, M., & Weiller, C. (2006). Dynamics of language reorganization after stroke. Brain, 129(6), 1371–1384. 10.1093/brain/awl090
- Saur, D., Ronneberger, O., Kummerer, D., Mader, I., Weiller, C., & Kloppel, S. (2010). Early functional magnetic resonance imaging activations predict language outcome after stroke. Brain, 133(4), 1252–1264. 10.1093/brain/awq021
- Smith, S. M. (2002). Fast robust automated brain extraction. Human Brain Mapping, 17(3), 143–155. 10.1002/hbm.10062
- Stark, B. C., Bryant, L., Themistocleous, C., den Ouden, D.-B., & Roberts, A. C. (2022). Best practice guidelines for reporting spoken discourse in aphasia and neurogenic communication disorders. Aphasiology, 37(5), 761–784. 10.1080/02687038.2022.2039372
- Szaflarski, J. P., Allendorfer, J. B., Banks, C., Vannest, J., & Holland, S. K. (2013). Recovered vs. not-recovered from post-stroke aphasia: The contributions from the dominant and non-dominant hemispheres. Restorative Neurology and Neuroscience, 31(4), 347–360. 10.3233/RNN-120267
- Vogel, A. C., Petersen, S. E., & Schlaggar, B. L. (2014). The VWFA: It's not just for words anymore. Frontiers in Human Neuroscience, 8, Article 88. 10.3389/fnhum.2014.00088
- Wilson, S. M., Yen, M., & Eriksson, D. K. (2018). An adaptive semantic matching paradigm for reliable and valid language mapping in individuals with aphasia. Human Brain Mapping, 39(8), 3285–3307. 10.1002/hbm.24077
- Zhang, Y., Brady, M., & Smith, S. (2001). Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. IEEE Transactions on Medical Imaging, 20(1), 45–57. 10.1109/42.906424
- Zumbansen, A., Peretz, I., & Hebert, S. (2014). Melodic intonation therapy: Back to basics for future research. Frontiers in Neurology, 5, Article 7. 10.3389/fneur.2014.00007