Abstract
Background:
Speech entrainment (SE), the online mimicking of an audiovisual speech model, has been shown to increase speech fluency in individuals with non-fluent aphasia. One theory that may explain why SE improves speech output is that it synchronizes functional connectivity between anterior and posterior language regions to be more similar to that of neurotypical speakers.
Objectives:
The present study tested this by measuring functional connectivity between two regions shown to be necessary for speech production, and their right hemisphere homologues, in 24 persons with aphasia compared to 20 controls during both free (spontaneous) speech and SE.
Methods:
Functional connectivity values in each person with aphasia were normalized to the control data. Two analyses were carried out: 1) functional connectivity values were compared between persons with aphasia and controls during free speech and SE and 2) stepwise linear models with leave-one-out cross validation including normed functional connectivity during both tasks and proportion damage to the left hemisphere as independent variables were created for each language score.
Results:
Left hemisphere anterior-posterior functional connectivity and left hemisphere posterior to right hemisphere anterior functional connectivity were significantly more similar to connectivity of the control group during SE compared to free speech. Additionally, connectivity during free speech was more strongly associated with language measures than connectivity during SE.
Conclusions:
Overall, these results suggest that SE promotes normalization of functional connectivity (i.e., return to patterns observed in neurotypical controls), which may explain why individuals with aphasia produce more fluent speech during SE compared to spontaneous speech.
Keywords: Aphasia, Aphasia Recovery, Speech Entrainment, Functional Connectivity, Chronic Stroke
1. Introduction
Broca’s aphasia, one type of non-fluent aphasia, is a common result of damage to left hemisphere (LH) anterior and posterior speech regions1 and is associated with non-fluent speech characterized by short, effortful utterances. Prior work has shown that speech entrainment (SE) may be a useful method to facilitate fluent speech in persons with Broca’s aphasia2. During SE, individuals are instructed to mimic, in real time, an audio-visual speech model consisting of a video of a speaker’s mouth. Speech is then pulled along, or “entrained,” by the model.
Not all individuals with aphasia benefit from SE to the same degree. While our previous work shows that those with Broca’s aphasia tend to benefit from SE, other aphasia types do not exhibit similar increases in speech fluency, perhaps due to patterns of lesion damage associated with those aphasia types2. In the original study investigating the effect of SE on speakers with non-fluent aphasia, language production guided by SE was compared to free speech (FS) in a quiet testing room. Speech fluency (i.e., the variety and number of words produced) was significantly greater during SE compared to free speech2. This effect was replicated in a large sample of patients3, confirming that SE induces speech fluency that is far superior to what is expected during free speech. Fridriksson and colleagues also revealed that the location of brain injury can predict which individuals experience a significant increase in speech fluency during SE compared to free speech3. Specifically, benefits were seen for individuals with damage localized to the inferior frontal and middle frontal gyri, suggesting that SE compensates for damage to networks important for speech production in inferior frontal areas. Similarly, Bonilha and colleagues found that those who are responsive to SE had greater dorsal stream damage, but a relatively spared ventral stream4. Moreover, persons with damage to the left posterior middle temporal gyrus (pMTG) experienced fewer benefits from SE. Bilateral activity in the pMTG and motor cortex has been associated with audio-visual sensory integration5. Specifically, the pMTG has been found to be a critical component of audio-visual “entrainment,” and has been suggested to provide visual targets complementary to auditory targets5. These targets act as guides for speech production during entrainment, and when impaired, negatively affect the extent to which one can entrain to the speech model. Therefore, an intact left pMTG appears necessary for successful SE.
Neural structures supporting successful SE have been investigated via lesion analysis and white matter tractography2–4. Prior research from our group has demonstrated a largely bilateral pattern of activity during both SE and free, spontaneous speech production. One theory that may explain why SE improves fluency in individuals with non-fluent aphasia, such as Broca’s aphasia, is that SE induces greater connectivity between anterior and posterior cortical speech areas. This is supported by work that suggests functional connectivity (FC) between anterior-posterior language regions is correlated with language ability6–10. Mechanistically, greater anterior-posterior connectivity during speech production may be indicative of an intact efference copy11,12. An efference copy, or internal model, consists of motor plans that project from motor regions to the posterior auditory cortex, where auditory consequences are predicted and subsequently compared to one’s output11,13. An intact efference copy is integral to producing fluent speech, and when damaged, results in error-filled speech production with the inability to successfully self-correct. In a study investigating impaired efference copy in people with aphasia, Behroozmand and colleagues examined the effect of a brief perturbation to the pitch of a speaker’s verbal output and found that those with aphasia did not compensate for the perturbation to the same extent as neurotypicals14. Lesion patterns that predict the diminished compensation response were localized to the inferior frontal gyrus (IFG) and supramarginal gyrus (SMG) areas, areas associated with successful SE14. Therefore, it seems plausible that SE compensates for an impaired efference copy in those with non-fluent aphasia caused by damage to the left IFG.
Based on this evidence, we propose that FC may further explain the mechanisms supporting SE as a treatment approach. In a primary analysis, the present study tested this notion by measuring FC between LH regions of interest necessary for speech fluency, the left pars opercularis (IFGpo) and the left pMTG, to investigate whether SE elicits more ‘typical’ connectivity patterns compared to spontaneous, free speech. Because previous studies have reported bilateral activity during speech entrainment, we also investigated intra- and interhemispheric FC with these regions’ right hemisphere homologues. Utilizing control data to standardize FC in our participants with aphasia, we hypothesized that the SE condition would promote FC similar to what is observed in neurotypical controls. In a secondary aim, we investigated which language measures are associated with FC during each speech production task (SE and FS). We hypothesized that greater FC during the FS task, particularly between anterior-posterior language regions, would be associated with quality of speech production. Specifically, we hypothesized that spontaneous speech, information content, fluency, and repetition, as measured by a standardized aphasia assessment, would be more strongly related to FC during FS compared to SE. This finding would provide strong evidence that speaking with the aid of an audio-visual model normalizes FC in individuals who respond well to SE.
2. Methods
2.1. Participants
Twenty-four individuals (5 women) with chronic Broca’s aphasia due to a single-event, left hemisphere stroke were studied. Participant data came from two separate research studies, which included the same methods for collecting behavioral and neuroimaging data2,15. Two participants did not complete all neuroimaging scans for reasons discussed in the Analyses section below. Details on imaging and task procedures are described in section 2.2. The mean age at stroke onset was 53 years (SD = 11, range = 32 – 75), and all but one participant with aphasia was premorbidly right-handed. All participants were in the chronic stage of stroke (mean = 6 years post-stroke, SD = 6, range = 1 – 26). As a control group, 20 neurotypical, right-handed adults (14 women) were enrolled in the study (mean age = 53, SD = 7 years, range = 40 – 76).
Upon enrollment, participants’ aphasia type and severity were determined using the Western Aphasia Battery – Revised (WAB-R)16. The WAB-R is a comprehensive aphasia test, commonly used in clinical practice, which evaluates both receptive and expressive language ability. Scores from the WAB-R used in the analyses here included aphasia severity (Aphasia Quotient; AQ) and the following WAB-R sub-scores: Naming, Repetition, Comprehension, Information Content, and Fluency. Results from this assessment indicated that all participants with aphasia (PWA) presented with Broca’s aphasia, with a mean AQ of 48.6 (SD=21.3, range=15.7 – 80.7). Participant WAB-R sub-scores and proportion damage to the left hemisphere are presented in Table 1. Participants underwent high-resolution magnetic resonance imaging (MRI) upon enrollment to confirm left hemisphere damage. The region of greatest lesion overlap was the left longitudinal fasciculus and the middle portion of the left insula, where all but one participant had damage (Figure 1). Dice correlation coefficients were calculated between each individual’s binary lesion map and the lesion overlay map of all participants. The lesion overlay map consisted of values between 0 and 23, where 0 indicates voxels where no participants have damage, 1 indicates voxels where one participant has damage, and so on. These coefficients offer information about the general overlap between an individual’s lesion and group damage. These values are reported in Table 1.
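As an illustration of the overlap measure described above, the following is a minimal sketch in standard-library Python rather than the authors' implementation. It builds a lesion overlay by summing binary lesion maps and then computes a Dice coefficient between each individual map and a group mask. The text does not state how the count-valued overlay was compared with a binary map, so the thresholding step (`min_count`) is purely an assumption, as are the function names and the toy data.

```python
def lesion_overlay(lesion_maps):
    """Sum binary lesion maps voxel-wise.

    Each output value counts how many participants have damage at that voxel.
    """
    return [sum(voxels) for voxels in zip(*lesion_maps)]


def dice(a, b):
    """Dice coefficient 2|A and B| / (|A| + |B|) for two binary maps."""
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(1 for x in a if x) + sum(1 for y in b if y)
    return 2.0 * intersection / total if total else 0.0


# Toy data: three hypothetical participants, six flattened voxels each.
maps = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 0],
    [1, 1, 1, 0, 0, 0],
]

overlay = lesion_overlay(maps)

# ASSUMPTION: binarize the overlay at a hypothetical threshold so that it can
# be compared with a binary individual map; the source does not specify this.
min_count = 2
group_mask = [1 if count >= min_count else 0 for count in overlay]

dice_scores = [dice(m, group_mask) for m in maps]
```

A value near 1 would indicate that an individual's lesion closely matches the region commonly damaged in the group, and a value near 0 would indicate an atypical lesion location.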
Table 1.
Title: “Participant language scores and lesion volume.”
| Participant ID | Comp Subscore | Fluency | Information Cont. | Naming Subscore | Repetition Subscore | WAB-R AQ | Prop. LH Damage | Lesion overlap dice coef. |
|---|---|---|---|---|---|---|---|---|
| M2002 | 9.6 | 4 | 9 | 8.2 | 6.6 | 74.8 | .21 | 0.71 |
| M2025 | 7.6 | 4 | 7 | 7.6 | 6.8 | 66 | .21 | 0.50 |
| M2017 | 6.35 | 4 | 7 | 4.1 | 2.8 | 48.5 | .26 | 0.70 |
| M2030 | 5.4 | 3 | 5 | 2.7 | 6.2 | 44.6 | .21 | 0.69 |
| M2042 | 9.05 | 4 | 7 | 8.5 | 8.2 | 73.5 | .27 | 0.60 |
| M2024 | 8.25 | 2 | 3 | 5.1 | 4.4 | 45.5 | .17 | 0.75 |
| M2040 | 9.25 | 4 | 8 | 5.1 | 2 | 56.7 | .32 | 0.66 |
| M2044 | 9.25 | 5 | 8 | 9.5 | 7.6 | 80.7 | .21 | 0.09 |
| M2036 | 8.4 | 5 | 8 | 8.8 | 7.8 | 76 | .20 | 0.85 |
| M2084 | 10 | 4 | 7 | 6.6 | 6.2 | 67.6 | .08 | 0.55 |
| M2005 | 7.2 | 4 | 7 | 5.3 | 4.8 | 56.6 | .08 | 0.48 |
| M2101 | 5.85 | 0 | 2 | 0 | 0 | 15.7 | .28 | 0.66 |
| M2007 | 3.75 | 2 | 2 | 0.6 | 2 | 20.7 | .41 | 0.48 |
| M2118 | 4.3 | 3 | 2 | 2.2 | 3.8 | 30.6 | .18 | 0.78 |
| M2087 | 6.15 | 4 | 3 | 4.2 | 5.2 | 45.8 | .32 | 0.65 |
| M2115 | 6.45 | 3 | 3 | 2.1 | 2.4 | 33.9 | .28 | 0.50 |
| M2029 | 7.4 | 4 | 5 | 4.9 | 2.6 | 47.8 | .26 | 0.50 |
| M2137 | 4.2 | 1 | 3 | 1.2 | 1.6 | 22 | .28 | 0.60 |
| M2131 | 3.45 | 1 | 3 | 1.2 | 2.1 | 21.5 | .56 | 0.47 |
| M2181 | 10 | 4 | 8 | 7.8 | 6.4 | 72.4 | .07 | 0.55 |
| M2197 | 8 | 0 | 0 | 0.6 | 0.3 | 17.8 | .22 | 0.51 |
| M2094 | 6.8 | 4 | 8 | 6.4 | 4.6 | 59.6 | .41 | 0.60 |
| M2178 | 5.7 | 1 | 3 | 0.6 | 0.7 | 22 | .05 | 0.33 |
| M2168 | 9.2 | 4 | 8 | 8.1 | 4 | 66.6 | .16 | 0.66 |
Abbreviations: Comp=Comprehension; Cont=Content; WAB-R AQ=Western Aphasia Battery-Revised Aphasia Quotient
Figure 1:
Lesion Overlay Map.
Lesion overlay map for PWA group. Red indicates 100% overlap.
All participants provided written informed consent for inclusion in this study. The institutional review board at the University of South Carolina approved the study protocol, and all study procedures adhered to the principles set forth in the Declaration of Helsinki.
2.2. Neuroimaging
Functional neuroimaging
MRI data were acquired using a Siemens 3T system equipped with a 12-channel head-coil. Participants completed three functional MRI runs, with the following conditions: (i) speech entrainment (SE); (ii) free, spontaneous speech (FS); (iii) perception only (P). During the SE condition, participants were instructed to mimic an audio-visual speech model in real time where only the mouth of the model was visible. The scripts used were segmented into sentences and phrases that were presented during the 8 seconds of silent intervals between sparse MRI image collection. The scripts were read by the model at a comfortable speaking rate and the content of the words in each script was controlled for number of words, number of different words, and word class and frequency. Functional MRI (fMRI) scanning was accomplished using sparse imaging, allowing for 8 seconds of silence to be presented while participants listened to stimuli and produced speech. In addition to the benefit of silence (which improves auditory comprehension), this method minimizes head/jaw motion during image acquisition, and leverages the fact that the hemodynamic response measured using fMRI lags the neural processing by several seconds. During the SE condition, participants mimicked the speech model that started immediately after the collection of each volume and ended at least 1 second before the start of the next collection. During the FS condition, participants were instructed to speak about the events of their current day while viewing a model’s mouth movements and listening to backwards speech. This condition controlled for low-level perceptual information while minimizing neural activity that would reflect inhibition of real language activation while participants produced FS2. 
It is important to note that the FS task included audio-visual presentation of speech, which was consistent with the SE condition, although the auditory stimuli were presented as backwards speech to minimize the need to inhibit listening to normal language while simultaneously producing speech. Thus, the auditory stimulus in the FS task was created from the same auditory information used in the SE task, but reversed, and therefore lacked similar linguistic structure. Finally, in the P condition, participants were presented with the same stimuli as in the SE condition, but were instructed not to perform any overt verbal response. An audio recording was used to verify that participants were not speaking during the P condition. Additionally, while there are differences in the cognitive demands between the SE and FS tasks, the fMRI paradigm attempted to reflect the clinical importance of the tasks and their comparisons. All overt responses were collected via an MRI-compatible microphone, allowing for on-line confirmation that participants were attempting the tasks. Specific parameters for the fMRI sequence were as follows: voxel size = 3.25 × 3.25 × 3.20 mm, TR = 10 s, TE = 35 ms, TA = 2 s, 64 × 64 axial matrix with 33 slices of 3.2 mm thickness (no gap), FOV = 208 × 208 mm, 60 volumes, and a total acquisition time of 10 min. A total of 30 stimulus events per condition were presented during each fMRI sequence.
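The timing of the sparse acquisition can be checked with a few lines of arithmetic. The sketch below uses only the sequence parameters reported above; the assumption that each volume is acquired at the start of its TR, and the helper name `speech_window`, are illustrative and not taken from the study protocol.

```python
TR_S = 10.0      # repetition time in seconds
TA_S = 2.0       # acquisition time per volume in seconds
N_VOLUMES = 60   # volumes per run
FOV_MM = 208.0   # in-plane field of view in millimetres
MATRIX = 64      # in-plane matrix size

silent_gap_s = TR_S - TA_S             # silence available for stimuli and speech
run_minutes = N_VOLUMES * TR_S / 60.0  # total run duration
in_plane_mm = FOV_MM / MATRIX          # in-plane voxel size


def speech_window(volume_index, margin_s=1.0):
    """Return (start, end) of the speaking window after a given volume.

    ASSUMPTION: each volume is acquired during the first TA_S seconds of its
    TR. The window opens when acquisition ends and closes margin_s seconds
    before the next acquisition begins.
    """
    start = volume_index * TR_S + TA_S
    end = (volume_index + 1) * TR_S - margin_s
    return start, end
```

Under these assumptions the silent gap is 8 s, the run lasts 10 min, the in-plane voxel size is 3.25 mm, and the first speaking window runs from 2 s to 9 s after the start of the run.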
Structural neuroimaging
Participants underwent high-resolution structural T1 and T2 MRI. T1-weighted images were acquired using an MP-RAGE sequence with the following parameters: voxel size = 1 mm³, FOV = 256 × 256 mm, 192 sagittal slices, 9-degree flip angle, parallel imaging (GRAPPA = 2, 80 reference lines), TR = 2250 ms, TI = 925 ms, and TE = 4.15 ms. T2-weighted images were acquired using a sampling perfection with application optimized contrasts using a varying flip angle evolution (3D SPACE) sequence with the following parameters: voxel size = 1 mm³, FOV = 256 × 256 mm, 160 sagittal slices, TR = 3200 ms, TE = 352 ms, and no slice acceleration. Lesions were demarcated by a collaborating neurologist (LB) or trained expert (RNN) in MRIcroGL17 using individual T2 FLAIR MRI images in native space. T1 MRI images were used for qualitative reference of lesion boundaries.
2.3. Analyses
MRI preprocessing and data analyses
Image preprocessing and analyses were completed using SPM12 (Wellcome Institute of Cognitive Neurology, London, UK). Standard preprocessing steps were implemented, including slice time correction, rigid body motion correction, coregistration of functional images to respective T1 structural images, and normalization to the Montreal Neurological Institute (MNI) template using 12 degrees of freedom (see Fridriksson et al., 2012). Lesion-masked cost function weighting was used to minimize the effect of damage on the normalization procedure18. Normalized functional images were then smoothed using a 6mm Gaussian filter. Individual subject analyses were carried out by constructing a general linear model for each condition of interest (SE and FS) to create mean cortical activation maps associated with each task.
To create activation maps for the speech entrainment and free speech conditions, activation from the perception-only task was subtracted from both the SE and FS activation maps, and regressors were convolved with a synthetic hemodynamic response (gamma). Next, group-level activation maps were created using a fixed effects model incorporating individual contrasts generated by the first-level analysis. Significant clusters were identified using a voxel-wise FWE threshold of p<0.05 and are presented in Figure 2.
Figure 2:
“Mean Cortical Activity by Condition.”
Separate mean cortical activation for the contrasts ‘speech entrainment > audio/visual speech perception’ (red); ‘free speech > audio/visual speech perception’ (blue); and where the conditions overlap (purple) for the control group (A) and aphasia group (B).
Individual timeseries maps for SE and FS conditions followed the same preprocessing parameters as described above, but were not corrected for perception-only activation or convolved with the hemodynamic response function. Rather, region of interest (ROI) FC analysis was conducted using the raw, sparse timeseries from the preprocessed imaging scans (with 30 stimulus presentations and 30 control presentations). All participants were presented with stimuli in the same order.
Regions of interest for the FC analysis were selected based on the literature suggesting an auditory-motor integration network that is recruited during speech motor control tasks19,20. Specifically, Venezia and colleagues (2016) identify left pMTG as a critical component of a potential visuomotor pathway for speech motor control5. Results from our previous studies show greater pMTG recruitment during SE compared to FS and further reinforce these hypotheses regarding an auditory-motor integration network2–4. Regions in the Johns Hopkins University (JHU) atlas21 that most closely matched the regions implicated in the aforementioned studies were utilized in the present study. Given that speech entrainment elicits greater bilateral activation compared to free (spontaneous) speech2, right hemisphere homologues, as defined in the JHU atlas, were used in analyses to investigate pairwise FC between four ROIs: 1) L IFGpo; 2) L pMTG; 3) R IFGpo; 4) R pMTG. Time courses of the mean blood oxygen-level dependent (BOLD) activity during the SE and FS conditions were extracted from each individual’s cortical activation map for all four ROIs. A total of 24 participants were enrolled in the study, but one participant (M2030) was unable to complete the FS fMRI task. Therefore, our primary FC analysis included all 24 participants, with FS data missing for participant M2030. In our secondary analysis, only participants who completed both tasks were included (N=23).
Functional connectivity analysis
Functional connectivity (FC) was measured by correlating (Pearson) the mean BOLD signal time series between all ROI pairs during each condition (SE and FS) separately (Supplementary Table I). For connectivity values involving any LH ROI, the proportion of damage to the ROI was regressed out of the FC value, and the adjusted value was used in the subsequent analysis. Given that our primary hypothesis was that FC observed during SE would be more similar (vs FS) to FC observed in neurotypical participants, we utilized control data to compare FC between individuals with aphasia and controls. To do this, we standardized FC values for the participants with aphasia by Z-transforming to the mean FC in the control group (see formula below), where values closer to zero indicate FC values more comparable to the neurotypical control group.
$$Z_{PWA} = \frac{R_{PWA} - \bar{x}_{R_{control}}}{SD_{control}}$$
where $R_{PWA}$ is the individual FC value, $\bar{x}_{R_{control}}$ is the mean FC value for the control group, and $SD_{control}$ is the standard deviation for the control group. This method was used to normalize the data among persons with aphasia. Normalizing FC values based on the control group aimed to mitigate any concern about SE and FS task differences.
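The two steps described in this section, pairwise Pearson correlation of ROI time series and standardization against the control group, can be sketched as follows. This is a minimal standard-library illustration, not the authors' pipeline. The function names and toy time series are hypothetical, the use of the sample standard deviation (n − 1) is an assumption, and the step that regresses ROI damage out of the FC values is omitted.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev

ROIS = ["LH IFGpo", "LH pMTG", "RH IFGpo", "RH pMTG"]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pairwise_fc(timeseries):
    """Correlate the mean BOLD time series of every ROI pair (six pairs)."""
    return {
        (a, b): pearson(timeseries[a], timeseries[b])
        for a, b in combinations(ROIS, 2)
    }


def normalize_to_controls(r_pwa, control_values):
    """Z-transform one FC value against the control group's mean and SD.

    ASSUMPTION: sample standard deviation (n - 1 denominator).
    """
    return (r_pwa - mean(control_values)) / stdev(control_values)


# Toy time series for one hypothetical participant (four volumes per ROI).
toy_ts = {
    "LH IFGpo": [1, 2, 3, 4],
    "LH pMTG": [2, 4, 6, 8],
    "RH IFGpo": [4, 3, 2, 1],
    "RH pMTG": [1, 3, 2, 4],
}
toy_fc = pairwise_fc(toy_ts)

# Toy control FC values for a single connection.
toy_z = normalize_to_controls(0.2, [0.4, 0.5, 0.6])
```

A normalized value near zero means the participant's connectivity for that connection is close to the control mean; the sign shows whether it falls below or above it.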
To explore the relationship between WAB-R sub-scores and FC during both the SE and FS tasks, we performed stepwise linear regressions with leave-one-out (LOO) cross validation (using both forward and backward iterations), with the score as the dependent variable, to create two predictive models for each score: one including SE FC values and proportion LH damage, and the other including FS FC values and proportion LH damage. Investigating the relationship between behavior and FC for SE and FS separately allows us to examine whether, when speaking along with an audio-visual model (which has been shown to increase speech fluency in persons with non-fluent aphasia), FC values predict behavior to the same extent as FC during spontaneous free speech. Each model included participants who completed both FS and SE functional scans (N=23); therefore, 23 iterations were run for each regression model. Regression results inform us of which functional connections predict language scores and whether these results are specific to FC during a particular task (SE vs. FS). Although exploratory and not the primary aim of the present study, this analysis aims to investigate how differences in FC relate to language scores, a topic that could inform clinicians of who may be an ideal candidate for using SE as a potential treatment approach in clinical practice.
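The leave-one-out logic can be illustrated with a deliberately simplified sketch. For brevity it substitutes a single-predictor ordinary least squares fit for the stepwise, multi-predictor procedure used in the study, so it shows only the cross-validation loop, not the variable selection. All names and data are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return slope, my - slope * mx


def loo_predictions(x, y):
    """Predict each participant's score from a model fit to all the others.

    With N participants the loop runs N times (23 in the study's secondary
    analysis), holding out one participant per iteration.
    """
    predictions = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        predictions.append(intercept + slope * x[i])
    return predictions


# Toy data: normalized FC values and scores with an exact linear relation.
toy_fc_z = [-2.0, -1.0, 0.0, 1.0, 2.0]
toy_scores = [2.0 * v + 1.0 for v in toy_fc_z]
toy_predictions = loo_predictions(toy_fc_z, toy_scores)
```

Because each prediction comes from a model that never saw the held-out participant, comparing predicted with actual scores gives a less optimistic picture of model performance than an in-sample fit.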
Regression analyses were run using the leave-one-out toolbox for MATLAB (https://github.com/grigori-yourganov/leave_one_out). Variables that were significant in at least 5% of iterations are reported, and p-values < 0.025 (Bonferroni correction, given that two models were created for each behavioral measure) were considered statistically significant. The toolbox outputs results both from a stepwise linear model and from the LOO cross-validation; therefore, t-statistics are provided for both. Results from these models should be interpreted by first considering the percent of iterations in which a predictor was significant and entered into a model, and then the LOO t-statistic, which is the mean weight of the predictor across all iterations.
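To make the two reported quantities concrete, the sketch below summarizes a set of LOO iterations into a percent-of-iterations value and a mean t-statistic per predictor, and computes the Bonferroni-adjusted threshold. The input structure is hypothetical and does not reflect the toolbox's actual output format. Averaging the t-statistic only over the iterations in which a predictor was selected is likewise an assumption.

```python
from statistics import mean

ALPHA = 0.05
N_MODELS_PER_MEASURE = 2                          # one SE model and one FS model
bonferroni_alpha = ALPHA / N_MODELS_PER_MEASURE   # 0.025
MIN_SHARE = 0.05                                  # report predictors in >= 5% of iterations


def summarize_iterations(iterations):
    """Summarize predictor selection across LOO iterations.

    iterations: list of dicts mapping predictor name -> t-statistic, with one
    dict per iteration containing only the predictors selected in it.
    """
    n = len(iterations)
    names = sorted({name for it in iterations for name in it})
    summary = {}
    for name in names:
        t_values = [it[name] for it in iterations if name in it]
        share = len(t_values) / n
        if share >= MIN_SHARE:
            summary[name] = {
                "pct_iterations": 100.0 * share,
                # ASSUMPTION: mean over iterations where the predictor entered.
                "mean_t": mean(t_values),
            }
    return summary


# Toy example: four iterations with hypothetical predictors "A" and "B".
toy_iterations = [
    {"A": 4.0, "B": 2.0},
    {"A": 5.0},
    {"A": 3.0},
    {"A": 4.0, "B": 3.0},
]
toy_summary = summarize_iterations(toy_iterations)
```

In this toy example predictor "A" enters every iteration and "B" enters half of them, which mirrors how the "% iterat." column in the regression tables separates stable predictors from unstable ones.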
3. Results
3.1. Activation Maps for Speech Entrainment and Free Speech
Cortical activation maps for the control group (Figure 2A) revealed largely bilateral spatial overlap across both conditions (free speech > perception only; speech entrainment > perception only). In the cohort of PWA (Figure 2B), the two conditions likewise elicited similar patterns of activity; however, activation in left frontal cortex did not survive thresholding (FWE; p<0.05). Consistent with previous work from our group, the regions of greatest overlap across both conditions occurred mostly in the basal ganglia, superior/middle temporal gyri, motor cortex, and anterior cingulate cortex2.
3.2. Functional Connectivity for Speech Entrainment vs Free Speech
To investigate FC differences between SE and FS conditions, the normed FC values were compared for each of the six connections between the four ROIs. A Wilcoxon signed-rank test was conducted for every connection to determine if there was a significant difference in FC between SE and FS (Figure 3). We report both Bonferroni-corrected and uncorrected p-values for ease of interpretation. Two connections differed between the SE and FS tasks at the uncorrected threshold, and one of these survived correction for multiple comparisons: LH pMTG-LH pars opercularis (uncorrected p=0.033; Bonferroni-corrected p=0.2) and LH pMTG-RH pars opercularis (uncorrected p=0.0002; Bonferroni-corrected p=0.001). No significant differences were revealed between the two conditions across the remaining ROI connections.
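The relationship between the uncorrected and corrected p-values can be reproduced with a short calculation. The sketch below enumerates the six connections among the four ROIs and applies a Bonferroni correction. Treating the number of comparisons as six is an inference from the six connections tested and is consistent with the reported values; it is not stated explicitly in the text. The Wilcoxon test itself is not reimplemented here.

```python
from itertools import combinations

ROIS = ["LH IFGpo", "LH pMTG", "RH IFGpo", "RH pMTG"]
connections = list(combinations(ROIS, 2))  # six unordered ROI pairs


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Uncorrected p-values reported for the two connections that differed.
reported = {
    ("LH IFGpo", "LH pMTG"): 0.033,
    ("LH pMTG", "RH IFGpo"): 0.0002,
}

# ASSUMPTION: correction across all six connections.
corrected = {
    pair: bonferroni(p, len(connections)) for pair, p in reported.items()
}
```

With six comparisons, 0.033 becomes 0.198 (reported as 0.2) and 0.0002 becomes 0.0012 (reported as 0.001), so only the LH pMTG to RH pars opercularis difference remains below 0.05.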
Figure 3:
“Connections and their corresponding SE and FS FC values”
Boxplots illustrating normalized FC values for both SE and FS conditions for all connections for persons with aphasia. Gray dashed line (x=0) indicates mean connectivity among control participants. Each colored bar in the boxplot represents the respective neural connection depicted on the left.
FC=functional connectivity; FS=free speech; SE=speech entrainment; LH=left hemisphere; RH=right hemisphere; pMTG=posterior middle temporal gyrus; PO=pars opercularis.
3.3. Functional Connectivity and Language Measures
Normed FC values were included as independent variables in leave-one-out stepwise regression (LOOSWR) models for each WAB-R subscore to investigate the relationship between these language measures and FC during each task. Right hemisphere anterior-posterior connectivity (RH pMTG – RH IFGpo) during FS and SE was included in the most iterations (≥96%) for all significant (p<0.05) models (WAB-R AQ, Information Content, Spontaneous Speech, Comprehension, Repetition, and Naming) compared to any other connection. Overall, FC values during FS were included in more models to predict language scores compared to FC during SE. No significant FS or SE connections were found to predict WAB-R Fluency scores, and no significant SE connections were found to predict WAB-R Repetition scores. Results from the FS regression models are presented in Table 3, and results from SE regression models are presented in Table 4. Scatterplots that show predicted scores versus actual scores from each model can be found in Supplementary material (Supplementary Figure I and II), along with a correlation (Pearson) matrix between FC values and language scores (Supplementary Figure III).
Table 3.
Title: “Stepwise linear regression with leave-one-out cross validation results with free speech functional connectivity and language performance on a standardized assessment”
| Dependent Variable | Model | Linear regression: Est. | Linear regression: Std. Error | Linear regression: tStat | Linear regression: pValue | LOO stepwise regression: T | LOO stepwise regression: % iterat. |
|---|---|---|---|---|---|---|---|
| WAB-R AQ | (Intercept) | 77.4 | 4.7 | 16.6 | <.001 | | |
| | FS RH pMTG–RH IFGpo | 13.2 | 1.3 | 9.9 | <.001 | 9.5 | 96% |
| | FS LH IFGpo–RH pMTG | −8.8 | 2.0 | −4.4 | <.001 | −4.4 | 91% |
| | FS LH pMTG–RH pMTG | −5.2 | 1.3 | −4.0 | <.001 | −4.0 | 91% |
| | FS LH IFGpo–RH IFGpo | 9.5 | 1.4 | 6.9 | <.001 | 6.7 | 96% |
| | Total Lesion Volume | −106.6 | 16.0 | −6.7 | <.001 | −6.1 | 96% |
| Info. Content | (Intercept) | 7.3 | 0.7 | 10.4 | <.001 | | |
| | FS RH pMTG–RH IFGpo | 1.1 | 0.3 | 4.2 | <.001 | 3.7 | 100% |
| | FS LH IFGpo–RH IFGpo | 0.7 | 0.3 | 2.1 | .05 | 2.5 | 32% |
| Fluency | n.s. | n.s. | n.s. | n.s. | n.s. | n.s. | n.s. |
| Spont. Speech | (Intercept) | 11.6 | 1.1 | 11.0 | <.001 | | |
| | FS RH pMTG–RH IFGpo | 1.6 | 0.4 | 4.0 | <.001 | 4.5 | 100% |
| | FS LH IFGpo–RH IFGpo | 1.3 | 0.5 | 2.7 | .01 | 3.2 | 100% |
| Comp. | (Intercept) | 8.8 | 0.6 | 13.8 | <.001 | | |
| | FS LH pMTG–RH IFGpo | 0.7 | 0.2 | 3.3 | .003 | 3.3 | 91% |
| Repetition | (Intercept) | 6.6 | 0.8 | 8.5 | <.001 | | |
| | FS RH pMTG–RH IFGpo | 1.3 | 0.2 | 5.9 | <.001 | 3.4 | 77% |
| | FS LH IFGpo–RH pMTG | −1.0 | 0.3 | −2.9 | .01 | −2.8 | 18% |
| | FS LH pMTG–RH pMTG | −0.8 | 0.2 | −3.9 | .001 | −3.7 | 23% |
| | FS LH IFGpo–RH IFGpo | 1.1 | 0.2 | 5.0 | <.001 | 3.9 | 36% |
| | Total Lesion Volume | −10.7 | 2.6 | −4.0 | <.001 | −4.0 | 18% |
| Naming | (Intercept) | 8.6 | 0.8 | 11.4 | <.001 | | |
| | FS RH pMTG–RH IFGpo | 1.8 | 0.2 | 8.4 | <.001 | 8.2 | 96% |
| | FS LH IFGpo–RH pMTG | −1.0 | 0.3 | −3.1 | .006 | −3.0 | 96% |
| | FS LH pMTG–RH pMTG | −0.8 | 0.2 | −4.0 | .001 | −3.9 | 96% |
| | FS LH IFGpo–RH IFGpo | 1.3 | 0.2 | 6.0 | <.001 | 5.8 | 96% |
| | Total Lesion Volume | −13.9 | 2.6 | −5.4 | <.001 | −5.3 | 96% |
Abbreviations: LOO=leave-one-out; FS=free speech; SE=speech entrainment; LH=left hemisphere; RH=right hemisphere; pMTG=posterior middle temporal gyrus; IFGpo=pars opercularis; n.s.=not significant.
Table 4.
Title: “Stepwise linear regression with leave-one-out cross validation results with speech entrainment functional connectivity and language performance on a standardized assessment”
| Dependent Variable | Model | Linear regression: Est. | Linear regression: Std. Error | Linear regression: tStat | Linear regression: pValue | LOO stepwise regression: T | LOO stepwise regression: % iterat. |
|---|---|---|---|---|---|---|---|
| WAB-R AQ | (Intercept) | 70.8 | 7.6 | 9.3 | <.001 | | |
| | SE RH pMTG–RH IFGpo | 9.2 | 2.2 | 4.1 | <.001 | 4.0 | 96% |
| | Total Lesion Volume | −81.5 | 27.8 | −3.0 | .009 | −2.9 | 86% |
| Info. Content | (Intercept) | 7.8 | 0.9 | 8.5 | <.001 | | |
| | SE RH pMTG–RH IFGpo | 1.3 | 0.3 | 4.9 | <.001 | 4.6 | 96% |
| | Total Lesion Volume | −8.3 | 3.3 | −2.5 | .02 | −2.5 | 68% |
| Fluency | n.s. | n.s. | n.s. | n.s. | n.s. | n.s. | n.s. |
| Spont. Speech | (Intercept) | 10.1 | 0.8 | 12.2 | <.001 | | |
| | SE RH pMTG–RH IFGpo | 1.5 | 0.4 | 3.4 | .003 | 3.4 | 100% |
| | SE LH pMTG–RH IFGpo | 1.2 | 0.5 | 2.4 | .03 | 2.4 | 77% |
| Comp. | (Intercept) | 9.6 | 0.6 | 15.7 | <.001 | | |
| | SE RH pMTG–RH IFGpo | 0.7 | 0.2 | 4.1 | <.001 | 4.0 | 96% |
| | SE LH pMTG–RH IFGpo | 0.5 | 0.2 | 2.1 | .05 | 2.4 | 82% |
| | Total Lesion Volume | −7.5 | 2.6 | −2.9 | .01 | −3.2 | 91% |
| Repetition | n.s. | n.s. | n.s. | n.s. | n.s. | n.s. | n.s. |
| Naming | (Intercept) | 7.6 | 1.1 | 6.8 | <.001 | | |
| | SE RH pMTG–RH IFGpo | 1.3 | 0.3 | 4.0 | <.001 | 3.8 | 100% |
| | Total Lesion Volume | −10.9 | 4.1 | −2.7 | .02 | −2.6 | 91% |
Abbreviations: LOO=leave-one-out; FS=free speech; SE=speech entrainment; LH=left hemisphere; RH=right hemisphere; pMTG=posterior middle temporal gyrus; IFGpo=pars opercularis; n.s.=not significant.
4. Discussion
4.1. Summary of Findings
The primary purpose of this study was to compare functional connectivity during SE to functional connectivity during FS in a group of people with aphasia and a cohort of neurotypical controls. Group activation maps were created to compare BOLD response during FS and SE after subtracting out activity during our perception-only condition (Figure 2). Consistent with prior literature, cortical activation during FS and SE (relative to perception-only) conditions was very similar, suggesting both recruited the same network despite task differences between the two conditions2,3. We utilized control data from neurotypical adults to standardize the connectivity measures in PWA and to determine which task elicited FC more similar to controls. We provide evidence that FC in individuals with Broca’s aphasia is more similar to neurotypical individuals during SE compared to during FS. This difference is primarily seen in LH anterior-posterior connectivity (LH pMTG—LH IFGop) and LH posterior to RH anterior connectivity (LH pMTG—RH IFGop). A secondary, exploratory analysis revealed that connectivity during SE and FS tasks is related to language scores. Further, more connections were included in regression models predicting language scores during FS compared to SE, suggesting that FS connectivity is more likely to be a significant predictor of language scores than is SE connectivity. Right hemisphere anterior-posterior connectivity was included most frequently across all subscores, indicating the importance of connectivity in the intact hemisphere. Interestingly, no connectivity variables significantly predicted Fluency scores. We suggest that this might be due to the nature of the recruitment criteria: Fluency, as measured by the WAB-R, is scored on a 10-point scale and the present study recruited only participants diagnosed with non-fluent aphasia. 
Per the WAB-R, a non-fluent aphasia classification is consistent with a Fluency score < 5, which restricts the range of Fluency sub-scores in this sample. Because the analysis relating FC to off-line behavioral measures was exploratory and the sample was small, these results should be interpreted with caution, and replication in a larger sample is needed to validate them. Nevertheless, the observed relationships between connectivity and behavior offer promising evidence of a measurable improvement in LH anterior-posterior synchrony during an audiovisual task, SE. The results suggest that during SE, individuals with non-fluent aphasia demonstrate not only behavioral gains, as measured by improved speech fluency, but also neurophysiological changes, as indicated by neural synchrony that aligns more closely with that of a neurotypical control group. Improved synchrony (as measured by FC) suggests that practice with an audiovisual aid may facilitate improved communication across regions with aberrant post-stroke connectivity, particularly between residual LH ROIs.
Our results are consistent with previously implicated neural mechanisms of successful SE from lesion symptom mapping studies3,4 and offer a complementary measure of task-based fMRI to further delineate the neural underpinnings that contribute to fluent speech in individuals with non-fluent aphasia.
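As an illustration of the standardization to control data summarized in this section, the following minimal sketch expresses each participant's FC value as a z-score relative to a control group and then compares how far each task sits from the control mean. It assumes a simple z-score normalization, which this section does not confirm as the exact procedure used; the function names and all numeric values are hypothetical.

```python
from statistics import mean, stdev


def normalize_to_controls(patient_fc, control_fc):
    """Return patient FC values as z-scores relative to the control group.

    Assumption: normalization is a plain z-score using the control mean
    and the control sample standard deviation (n - 1 denominator).
    """
    mu = mean(control_fc)
    sigma = stdev(control_fc)
    return [(x - mu) / sigma for x in patient_fc]


def mean_abs_deviation(z_scores):
    """Average distance from the control mean, in control SD units.

    Smaller values mean the group looks more like the controls.
    """
    return mean(abs(z) for z in z_scores)


# Hypothetical FC values for one connection (not data from the study).
controls = [0.50, 0.60, 0.55, 0.65, 0.45]
free_speech = [0.10, 0.20, 0.15]
entrainment = [0.40, 0.50, 0.45]

z_fs = normalize_to_controls(free_speech, controls)
z_se = normalize_to_controls(entrainment, controls)

print("FS distance from controls:", round(mean_abs_deviation(z_fs), 2))
print("SE distance from controls:", round(mean_abs_deviation(z_se), 2))
```

In this toy example, a smaller mean absolute z-score for SE than for FS corresponds to FC that lies closer to the control pattern, which is the sense of "more similar to controls" used in this section.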
4.2. Mechanistic Properties of SE
For individuals with non-fluent aphasia, particularly those with damage to LH pars opercularis, SE facilitates an improved rate of speech2–4. Initial investigations of the mechanisms underlying SE attributed successful entrainment to greater bilateral activation in the posterior temporal lobe (BA 37), IFG (BA 47), left pMTG, and Broca’s area, compared to FS2. More recently, Bonilha et al. expanded upon these findings, suggesting that preservation of ventral stream regions (specifically, the pMTG) and associated white matter tracts, such as the uncinate fasciculus and inferior fronto-occipital fasciculus, is associated with improved speech fluency during SE4. In addition to work from Fridriksson and colleagues, evidence for the importance of the pMTG for SE success has also been shown in studies investigating neurotypical individuals and those with primary progressive aphasia (PPA)22. In controls, Venezia and colleagues found that the audio-visual (AV) condition yielded more accurate responses compared to auditory-only and visual-only modalities5. Conditions including the visual modality activated the left pMTG, which suggests that the pMTG establishes complementary visual targets for motor control, a process integral to speech production. In a study investigating SE ability in individuals with PPA, Henry and colleagues (2018) found that deterioration of the pMTG leads to poorer AV SE. This converges with previous work from Fridriksson and colleagues investigating the effects of SE in chronic aphasia, which showed that preservation of the pMTG, as well as activity changes in the pMTG after SE training, correlate with entrainment success. Therefore, it seems likely that the pMTG is a critical area for successful AV entrainment.
Also critical for speech production is the coordination between posterior temporal (pMTG and pSTG) and anterior frontal (pars opercularis) language regions. According to the Hierarchical State Feedback Control (HSFC) model23, the feedforward process of speech production activates anterior and posterior regions by way of area Spt, a region located at the temporo-parietal junction. Consistent with Guenther and colleagues’ Directions into Velocities of Articulators (DIVA) model, auditory targets, which are generated in posterior regions, rely on communication with anterior motor programming regions in order to provide feedforward information for successful speech production11,13. Taken together, the DIVA and HSFC models suggest that anterior-posterior connectivity is necessary, and perhaps critical, to recruit the speech efference copy. In the context of SE, previous work suggests that damage to anterior or posterior language regions, or to the connections between the two, could negatively impact speech motor programming, which relies upon intact feedforward and feedback projections3,24. With regard to clinical disorders such as aphasia that result from brain injury such as a stroke, participants with non-fluent aphasia are hypothesized to have an impaired efference copy due to left IFG damage and, subsequently, aberrant oscillatory function. The literature suggests that, to compensate for the absence of an efference copy, an AV SE model provides an external efference copy by recruiting residual posterior regions (i.e., visual speech units in the pMTG) to induce fluent speech production.
In a recent study investigating sensorimotor feedback in PWA, Behroozmand and colleagues provide evidence that may have theoretical implications for understanding the role of the efference copy in speech production and for SE in aphasia14. Using altered auditory feedback (AAF) during speech production, this study perturbed the acoustic speech signal so that the perceived output (feedback information) did not match the expected output (feedforward information). In healthy participants, AAF resulted in real-time modification of one’s output to account for the perceived output error (i.e., the mismatch between expected and perceived output). Individuals with aphasia did not compensate to the same degree as neurotypical controls, and damage to the STG and MTG predicted a poorer AAF response. This suggests that damage to these temporal lobe regions may lead to impaired error detection and, more importantly, a failure of communication between sensory and motor units, indicative of an impaired efference copy.
The aforementioned “communication” between sensory and motor units can be operationalized as FC between critical language regions. Prior research investigating resting state and task-based FC measures has emphasized the importance of greater fronto-temporal connectivity during successful language processing8 and production7,10,25. In a study by Ewald and colleagues, connectivity between occipito-temporal and frontal areas was observed in healthy participants during overt language production on a naming task26. The present study reinforces prior findings by showing that the greater speech output and improved fluency seen during SE are accompanied by greater connectivity between critical language regions previously identified in speech production models (pars opercularis and pMTG). Specifically, the current results suggest that task-based functional connectivity reflects improved connectivity across anterior and posterior regions during successful entrainment. For participants with aphasia, we observed increased anterior-posterior FC in the SE condition, which may suggest that the audiovisual model provides an external efference copy.
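To make this operationalization concrete, the sketch below computes FC as the Pearson correlation between two ROI time series and applies a Fisher z-transform so that values can be averaged or compared across participants. This is one common way to define FC and is offered as an assumption, not as the pipeline used in the present study; the time series, variable names, and function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length time series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("time series must have equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def fisher_z(r):
    """Fisher z-transform of a correlation (defined for -1 < r < 1)."""
    return math.atanh(r)


# Hypothetical BOLD signal per volume for a posterior and an anterior ROI.
lh_pmtg = [0.1, 0.4, 0.3, 0.8, 0.6, 0.2]
lh_ifg_coupled = [0.2, 0.5, 0.2, 0.9, 0.5, 0.3]    # tracks the pMTG signal
lh_ifg_uncoupled = [0.5, 0.1, 0.6, 0.3, 0.2, 0.7]  # does not track it

r_coupled = pearson_r(lh_pmtg, lh_ifg_coupled)
r_uncoupled = pearson_r(lh_pmtg, lh_ifg_uncoupled)

print("coupled FC (r, z):", round(r_coupled, 3), round(fisher_z(r_coupled), 3))
print("uncoupled FC (r):", round(r_uncoupled, 3))
```

Under this definition, two regions whose signals rise and fall together yield a high FC value, whereas regions that fluctuate independently or in opposition yield a low or negative value.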
4.3. Treating the Mechanism
The aim of the present study was to investigate the neural mechanisms that support successful entrainment to an AV speech model. While this study did not aim to evaluate the therapeutic response to SE, our results suggest that SE training may be a viable approach not only to induce fluent speech but also to treat non-fluent speech production. Fluent speech production appears to depend upon connectivity between anterior and posterior speech regions10,25,27, and this connectivity is negatively impacted by stroke. Considerable evidence suggests that maladaptive brain changes following stroke can negatively influence recovery28,29. These negative influences of neural plasticity can be explained by the “learned non-use” phenomenon, in which a stroke survivor, once impaired, may avoid a given function or adopt maladaptive compensatory strategies for it29,30 and may therefore require extensive rehabilitation to mitigate these effects31. Using SE, individuals with aphasia are able to practice scripts that have a similar rhythm, sentence structure, and rate, with the intention of alleviating learned non-use. We suggest that repeated training with an external AV model, such as SE, elicits FC more similar to that of neurotypical individuals and that this external mechanism may reinforce anterior-posterior cohesion in the left hemisphere.
4.4. Limitations
While this study provides convincing evidence that SE elicits more typical FC patterns compared to FS, there are a few study limitations that should be addressed. First, we utilized sparse sampling fMRI to calculate connectivity between anterior and posterior language regions. Compared to traditional approaches to measuring connectivity (i.e., via a continuous fMRI paradigm), sparse sampling has rather poor temporal resolution, which precludes fine-grained analyses of functional connectivity. Although connectivity analyses using sparsely acquired scans are not ideal, they have been found to identify patterns of connectivity similar to those found during continuous scanning procedures32. Second, the present study utilizes FC measures as an indirect measure of typical performance (compared to neurotypical controls) at a given task. While the literature on FC during language tasks suggests that greater FC values reflect better task performance, the present study is unable to relate behavioral SE success to FC during SE. However, a previous study that relied on a subset of the data used here showed that individuals with non-fluent aphasia produce, on average, more than twice as many words during SE compared to spontaneous speech2. We can reasonably suggest that the cohort tested in the current study performed similarly. Further, though the presentation of backwards speech during the FS task was used to better control for auditory input during the SE condition, the extent to which this could contribute to the differences in functional connectivity between tasks is unclear. However, given the observed association between connectivity during FS and off-line behavioral measures, it appears that differences in connectivity between SE and FS are not merely driven by non-linguistic task demands (i.e., inhibiting perception of backwards speech during overt spontaneous speech).
We propose that speech entrainment provides external support to mitigate stroke-induced disruptions that may result in a degraded or obsolete efference copy in participants with non-fluent aphasia. While this theory aligns well with prior literature and the results presented in the current study, we did not directly test the intactness of an efference copy. Therefore, it is possible that the presentation of an external speech entrainment model could improve feedforward processes for speech production via an alternative mechanism, such as the recruitment of an alternative neural route. Measures to assess the integrity of a speech efference copy have not been well established in the literature; however, recent work investigating sensorimotor error detection during an AAF task has attempted to address this issue. Future studies are needed to determine if this mechanism is exclusive to SE.
4.5. Conclusions
We investigated connectivity between regions of interest previously implicated in successful speech entrainment and their right hemisphere homologues. We observed that individuals with non-fluent aphasia demonstrate ‘normalized’ functional connectivity when speaking with an SE model compared to an FS condition. The present study suggests that speaking along with an audio-visual speech model not only leads to improved speech fluency, but also promotes neural coherence between anterior and posterior cortical regions. Notably, these improvements in neural coherence approximate the functional connectivity values observed in neurotypical controls. In summary, our results suggest that speaking in an SE task is associated with increased anterior-posterior connectivity within language regions and RH homologues, and that this brain-behavior relationship may explain why individuals with non-fluent aphasia demonstrate greater speech output during SE as compared to spontaneous speech.
Acknowledgements
This work was supported by the National Institute on Deafness and Other Communication Disorders (NIDCD) to Julius Fridriksson (R21 DC014170; P50 DC014664).
Footnotes
Declaration of Conflicting Interests: The authors declare that there is no conflict of interest.
Citations
- 1. Fridriksson J, Fillmore P, Guo D, Rorden C. Chronic Broca’s aphasia is caused by damage to Broca’s and Wernicke’s areas. Cereb. Cortex. 2015;25:4689–4696.
- 2. Fridriksson J, Hubbard HI, Hudspeth SG, et al. Speech entrainment enables patients with Broca’s aphasia to produce fluent speech. Brain. 2012:3815–3829.
- 3. Fridriksson J, Basilakos A, Hickok G, Bonilha L, Rorden C. Speech Entrainment Compensates for Broca’s Area Damage. Cortex. 2015:68–75.
- 4. Bonilha L, Hillis AE, Wilmskoetter J, et al. Neural structures supporting spontaneous and assisted (entrained) speech fluency. Brain. 2019:1–12.
- 5. Venezia JH, Fillmore P, Matchin W, et al. Perception drives production across sensory modalities: a network for sensorimotor integration of visual speech. Neuroimage. 2016:196–207.
- 6. Fox MD, Raichle ME. Spontaneous fluctuations in brain activity observed with functional magnetic resonance imaging. Nat. Rev. Neurosci. 2007;8:700–711.
- 7. Vlooswijk MCG, Jansen JFA, Krom MCFTM De, et al. Functional MRI in chronic epilepsy: associations with cognitive impairment. Neurology. 2010;9.
- 8. Baldassarre A, Lewis CM, Committeri G, Snyder AZ. Individual variability in functional connectivity predicts performance of a perceptual task. Proc. Natl. Acad. Sci. 2012;109.
- 9. Knollman-Porter K, Wallace SE, Hux K, et al. Reading experiences and use of supports by people with chronic aphasia. Aphasiology. 2015;29:1448–1472.
- 10. Chai XJ, Berken JA, Barbeau EB, et al. Intrinsic Functional Connectivity in the Adult Brain and Success in Second-Language Learning. J. Neurosci. 2016;36:755–761.
- 11. Guenther FH, Hampson M, Johnson D. A theoretical investigation of reference frames for the planning of speech movements. Psychol. Rev. 1998;105:611–633.
- 12. Houde JF, Nagarajan SS. Speech production as state feedback control. Front. Hum. Neurosci. 2011;5:1–14.
- 13. Guenther FH, Ghosh SS, Tourville JA. Neural modeling and imaging of the cortical interactions underlying syllable production. Brain Lang. 2006;96:280–301.
- 14. Behroozmand R, Phillip L, Johari K, Bonilha L, Hickok G, Fridriksson J. Sensorimotor impairment of speech auditory feedback processing in aphasia. Neuroimage. 2018:102–111.
- 15. Thors H, Yourganov G, Rorden C, Bonilha L, Fridriksson J. Speech Entrainment to Improve Spontaneous Speech in Broca’s Aphasia. Prep.
- 16. Kertesz A. Western aphasia battery-R. New York, NY, US: Grune & Stratton; 2007.
- 17. Rorden C, Brett M. Stereotaxic display of brain lesions. Behav. Neurol. 2000;12:191–200.
- 18. Brett M, Leff AP, Rorden C, Ashburner J. Spatial normalization of brain images with focal lesions using cost function masking. Neuroimage. 2001;14:486–500.
- 19. Buchsbaum B, Hickok G, Humphries C. Role of left posterior superior temporal gyrus in phonological processing for speech perception and production. Cogn. Sci. 2001;25:663–678.
- 20. Okada K, Hickok G. Left posterior auditory-related cortices participate both in speech perception and speech production: Neural overlap revealed by fMRI. Brain Lang. 2006;98:112–117.
- 21. Faria A, Joel S, Zhang Y, et al. Atlas-Based Analysis of Resting-State Functional Connectivity: Evaluation for Reproducibility and Multi-Modal Anatomy-Function Correlation Studies. Neuroimage. 2012;61:613–621.
- 22. Henry ML, Hubbard HI, Grasso SM, et al. Retraining speech production and fluency in non-fluent/agrammatic primary progressive aphasia. Brain. 2018;141:1799–1814.
- 23. Hickok G. Computational neuroanatomy of speech production. Nat. Rev. Neurosci. 2012;13:135–145.
- 24. Feenaughty L, Basilakos A, Bonilha L, et al. Non-fluent speech following stroke is caused by impaired efference copy. Cogn. Neuropsychol. 2017;34:333–346.
- 25. Kell CA, Neumann K, Behrens M, Gudenberg AW Von, Giraud A. Speaking-related changes in cortical functional connectivity associated with assisted and spontaneous recovery from developmental stuttering. J. Fluency Disord. 2018;55:135–144.
- 26. Ewald A, Aristei S, Nolte G, Rahman RA. Brain oscillations and functional connectivity during overt language production. Front. Psychol. 2012;3:1–12.
- 27. Vlooswijk MCG, Jansen JFA, Majoie HJM, et al. Functional connectivity and language impairment in cryptogenic localization-related epilepsy. Neurology. 2010;75:395–402.
- 28. Mark VW, Taub E. Constraint-induced movement therapy for chronic stroke hemiparesis and other disabilities. Restor. Neurol. Neurosci. 2004;22:317–336.
- 29. Taub E, Mark W, Morris D. The learned nonuse phenomenon: implications for rehabilitation. Eura Medicophys. 2006;42:241–255.
- 30. Wolf SL, Lecraw DE, Barton LA, Jann BB. Forced use of hemiplegic upper extremities to reverse the effect of learned nonuse among chronic stroke and head-injured patients. Exp. Neurol. 1989;104:125–132.
- 31. Kleim JA, Jones TA. Principles of experience-dependent neural plasticity: Implications for rehabilitation after brain damage. J. Speech, Lang. Hear. Res. 2008;51:225–239.
- 32. Yakunina N, Kim TS, Tae WS, Kim SS, Nam EC. Applicability of the Sparse Temporal Acquisition Technique in Resting-State Brain Network Analysis. Am. J. Neuroradiol. 2016;37:515–520. Available at: http://www.ajnr.org/content/37/3/515.abstract.