Author manuscript; available in PMC: 2022 Apr 1.
Published in final edited form as: Epilepsy Behav. 2021 Mar 16;117:107909. doi: 10.1016/j.yebeh.2021.107909

Prediction of baseline expressive and receptive language function in children with focal epilepsy using diffusion tractography-based deep learning network

Jeong-won Jeong 1,2,4,5, Min-Hee Lee 1,5, Nolan O’Hara 4,5, Csaba Juhász 1,2,3,4,5, Eishi Asano 1,2,4
PMCID: PMC8035310  NIHMSID: NIHMS1680811  PMID: 33740493

Abstract

Purpose:

Focal epilepsy is a risk factor for language impairment in children. We investigated whether the current state-of-the-art deep learning network on diffusion tractography connectome can accurately predict expressive and receptive language scores of children with epilepsy.

Methods:

We studied 37 children with a diagnosis of drug-resistant focal epilepsy (age: 11.8±3.1 years) using 3T MRI and diffusion tractography connectome: G=(S, Ω), where S is an adjacency matrix of edges representing the connectivity strength (number of white-matter tract streamlines) between each pair of brain regions, and Ω reflects a set of brain regions. A convolutional neural network (CNN) was trained to learn the nonlinear relationship between ‘S (input)’ and ‘language score (output)’. Repeated hold-out validation was then employed to measure the Pearson correlation and mean absolute error (MAE) between CNN-predicted and actual language scores.

Results:

We found that CNN-predicted and actual scores were significantly correlated (i.e., Pearson’s R/p-value: 0.82/<0.001 and 0.75/<0.001), yielding MAE: 7.77 and 7.40 for expressive and receptive scores, respectively. Specifically, sparse connectivity not only within the left cortico-cortical network but also involving the right subcortical structures was predictive of language impairment of expressive or receptive domain. Subsequent subgroup analyses inferred that the effectiveness of diffusion tractography-based prediction of language outcome was independent of clinical variables. Intrinsic diffusion tractography connectome properties may be useful for predicting the severity of baseline language dysfunction and possibly provide a better understanding of the biological mechanisms of epilepsy-related language impairment in children.

Keywords: diffusion-weighted imaging (DWI) tractography, language prediction, deep learning network, pediatric epilepsy

1. Introduction

Focal epilepsy may disrupt brain functions supporting neurocognitive development [1]. The high incidence (20–30%) of language impairment in pediatric epilepsy is in part attributed to the consequence of repeated seizures and also to the underlying etiologies that per se lead to significant intellectual disability [2–6]. Given the importance of age-appropriate language development for educational and social welfare, it is clinically imperative that impairment of language function be readily and reliably identified and that children and caregivers be provided with accurate prognostic information and appropriately targeted intervention [7]. Direct psychometric assessment of language functions in young children and children with severe comorbidities is often challenging and unreliable due to motivational and behavioral concerns [8–10]. Thus, the etiologic yield of such evaluations varies widely, from 17% to 34% [11]. Clinical brain MRI in this age group has also often been unremarkable and of limited value in evaluating language impairment except to rule out lesional etiology. Thus, there is an urgent need to develop an imaging-based biomarker approach capable of accurately and objectively predicting language functions in pediatric epilepsy patients with severe internalizing or externalizing behavioral concerns. We expect that a neuroimaging tool quantifying the severity of alteration of anatomical brain networks has an outstanding chance to serve as an objective biomarker of language impairment in pediatric epilepsy, because neural dynamics supporting auditory or visual language function involve extensive brain networks not limited to a single region [12–14]. Our central hypothesis is that early identification of language impairment will improve the current diagnostic workup by providing prognostic information and directing language interventions to mitigate the severity of expressive and receptive language impairment.

The overall purpose of this study was to determine how well diffusion-weighted imaging (DWI) tractography-based connectome analysis would predict expressive and receptive language scores in individual children with drug-resistant focal epilepsy. The first analytic step in this study was to model the whole brain as a large distributed network, represented by a collection of nodes (i.e., cortical and subcortical regions) and edges (i.e., pair-wise connections between nodes). By tracking a diffusion signal between two given nodes, we quantified pair-wise connectivity for each edge and then investigated the relationship of DWI connectivity with expressive and receptive language scores rated by the Clinical Evaluation of Language Fundamentals (CELF) [15]. CELF is a clinical standard to accurately identify and diagnose language disorders and provides clinicians with a streamlined, flexible battery to assess semantics, morphology, syntax, and pragmatics. The present study utilized CELF expressive and receptive summary index scores as the ground truth to quantify the impairment in the expressive and receptive language domains.

Our group previously demonstrated the potential utility of DWI connectome analysis by first reporting that, compared with age- and gender-matched controls, a group of children with language delay showed decreased connectivity between left frontotemporal neocortical regions, especially involving the middle-frontal and superior-temporal gyri [16]. We subsequently reported that postoperative increase in DWI connectivity between the insular and inferior-frontal operculum regions, the superior-frontal and orbitofrontal regions, as well as the middle-frontal and orbitofrontal regions was associated with postoperative preservation or improvement in visual memory and planning in children who underwent epilepsy surgery [17]. These preliminary observations motivated our hypothesis that language impairment in children with epilepsy would be at least in part attributed to abnormalities in pair-wise white matter connectivity. We expected that the current study using DWI-based network analysis would be capable of segregating the connectivities whose abnormalities lead to expressive and receptive language impairments, respectively.

To optimize the prediction of baseline expressive and receptive language functions based on DWI connectome data, the present study employed a deep learning approach referred to as a convolutional neural network (CNN) [18–21]. The CNN has recently been highlighted as the most powerful prediction model when important features are too complex and heterogeneous to be summarized in large data. CNN analysis is suited to DWI connectome analysis because it can typically consider >5,000 pair-wise connections, depending on the number of network nodes. The CNN analysis performed in this study was driven by the hypothesis that multiple layers of a CNN should be capable of automatically learning nonlinear relationships between abnormal patterns of pair-wise connection and language scores on the CELF-Fourth Edition (CELF-4) and of optimally learning subtle differences of pair-wise connections to minimize the magnitude of prediction error. We further hypothesized that specific pair-wise connection patterns would be predictive of expressive and receptive language scores, suggesting that the proposed CNN prediction is not a black box that merely provides the best fit between the predicted and actual scores but an effective tool to improve our understanding of the neuroanatomical substrates of language impairment in children with drug-resistant focal epilepsy.

2. Methods

2.1. Subjects

The study included 37 children clinically diagnosed with focal epilepsy (age: 11.8±3.1 years, 19 boys, Supplementary Table 1). These patients were selected by using the following inclusion criteria: (1) age: 5 to 18 years; (2) a history of medically refractory focal epilepsy scheduled for extraoperative electrocorticography (ECoG) recording as a part of the pre-surgical evaluation at the Children’s Hospital of Michigan, Detroit; (3) native English speaker; and (4) preoperative MRI and neuropsychological language assessments were successfully completed. Exclusion criteria consisted of the following: (1) presence of massive brain malformations (such as large perisylvian polymicrogyria or hemimegalencephaly), which could cause early functional and structural reorganization of language functions; (2) history of previous neurologic surgery and (3) autism spectrum disorder based on caregiver report and direct observations.

For clinical diagnosis, the language ability of a given patient was assessed using a battery of age- and developmentally-appropriate neuropsychological testing. It included child-specific combinations of the Mullen Scales of Early Learning, the Wechsler Scales of Intelligence for children aged 5–6 years and the Wechsler Intelligence Scale for children older than 6 years, the Token Test for Children, and the Vineland Adaptive Behavior Scales. Also, the baseline language ability using the CELF-Preschool (CELF-P) for children aged 5–6 years, and CELF-4 for children aged six years and above [22] was assessed for this research study. Composite measures derived from CELF testing yielded expressive and receptive language subscores that were standardized to a mean of 100 and standard deviation of 15, with a mean±standard deviation of 76±24 and 81±24 in expressive and receptive language ability, respectively. These scores were used as the ground-truth to generate our CNN-based predictive models using DWI connectome data. The time interval between neuropsychological assessment and DWI tractography data (i.e., Δt=date of neuropsychological assessment – date of MRI) was 2.1±0.5 months.

The study was approved by Wayne State University’s Institutional Review Board, and written informed consent was obtained from all parents/guardians.

2.2. Data analysis

All DWI data were acquired using a 3T GE Signa scanner (GE Healthcare, Milwaukee, WI) with 55 encoding directions at b=1000 s/mm². Whole brain tractography was constructed using probabilistic SIFT tractography with second-order integration over fiber orientation distributions (iFOD2) [23] to sample up to three FODs at every voxel. At every voxel of the gray/white matter boundary identified by the FSL FAST package (https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FAST), dynamically randomized seeding points were applied in the framework of anatomically constrained tractography (ACT) [24] to reconstruct biologically realistic streamlines existing in SIFT models. Automated anatomical labeling (AAL) parcellation (http://www.gin.cnrs.fr/en/tools/aal-aal2/) was then applied to whole brain tractography in order to construct a DWI-based brain connectome, $G=(S, \Omega)$, where $\Omega=\{\Omega_i\}_{i=1}^{116}$ is a set of 116 nodes representing AAL regions in the brain and $S=[S_{ij}]$, $i,j=1,\dots,116$, is an adjacency matrix whose element (or edge) $S_{ij}$ represents the strength of the pair-wise connection between two brain nodes, $\Omega_i$ and $\Omega_j$: the total count of white-matter streamlines connecting $\Omega_i$ and $\Omega_j$, normalized by the average length of streamlines and the average volume of $\Omega_i$ and $\Omega_j$ to stabilize intersubject variability (due to intracranial volume and age differences).
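The edge-strength definition above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `connectome_matrix`, the input layout (streamline endpoint labels, lengths, node volumes), and the choice to divide the streamline count by the product of mean streamline length and mean node volume are all assumptions made for illustration, since the text does not state how the two normalizations are combined.

```python
import numpy as np

def connectome_matrix(endpoints, lengths, volumes):
    """Build a symmetric edge-strength matrix from streamline endpoints.

    endpoints: (n_streamlines, 2) node indices (0-based) hit by each streamline
    lengths:   (n_streamlines,) streamline lengths
    volumes:   (n_nodes,) node volumes
    Assumption: strength = count / (mean length * mean volume of the two nodes).
    """
    volumes = np.asarray(volumes, dtype=float)
    n = volumes.size
    count = np.zeros((n, n))
    length_sum = np.zeros((n, n))
    for (i, j), length in zip(endpoints, lengths):
        if i == j:
            continue  # ignore streamlines that start and end in the same node
        count[i, j] += 1
        count[j, i] += 1
        length_sum[i, j] += length
        length_sum[j, i] += length
    # Mean streamline length per edge (zero where no streamline exists)
    mean_len = np.divide(length_sum, count, out=np.zeros_like(count), where=count > 0)
    # Mean volume of the two nodes forming each edge
    mean_vol = (volumes[:, None] + volumes[None, :]) / 2.0
    denom = mean_len * mean_vol
    return np.divide(count, denom, out=np.zeros_like(count), where=denom > 0)
```

With 116 AAL nodes the result would be a 116×116 symmetric matrix with a zero diagonal; edges without any streamline stay at zero.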

To reduce the effect of false positives on the performance of CNN prediction, we first identified true positive connections of S as pair-wise connections whose strength values exceed the quantile threshold corresponding to a cumulative probability q of the element values in S. At each element of S, we counted the number of patients whose strength values were greater than zero. If all patients had non-zero values, we assumed that the element was a true positive connection, yielding a binary matrix with the same size as S. This binary matrix was then multiplied element-wise by S in order to mask out the false positive connections in S. A 1×6670 vector consisting of the upper triangular values of the masked S was finally scaled by its maximal value and resized into a 116×58 matrix, $\bar{S}$, after padding 58 zeros at the end of the vector.
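The masking and reshaping steps can be sketched as follows. This is a hedged illustration rather than the authors' code: `backbone_mask` and `to_cnn_input` are hypothetical names, row-major reshaping is an assumption, and the quantile threshold q is left out because the text does not spell out how it is combined with the all-patients mask.

```python
import numpy as np

def backbone_mask(S_all):
    """S_all: (n_subjects, n, n). True where every subject has a non-zero edge."""
    return np.all(np.asarray(S_all) > 0, axis=0)

def to_cnn_input(S, mask, n_cols=58):
    """Mask one subject's matrix, vectorize, scale, zero-pad, and reshape."""
    n = S.shape[0]
    masked = S * mask                      # zero out edges outside the backbone
    vec = masked[np.triu_indices(n, k=1)]  # 6670 values when n = 116
    peak = vec.max()
    if peak > 0:
        vec = vec / peak                   # scale by the maximal value
    pad = n * n_cols - vec.size            # 58 zeros when n = 116, n_cols = 58
    vec = np.concatenate([vec, np.zeros(pad)])
    return vec.reshape(n, n_cols)          # row-major reshape (assumption)
```

For n = 116 the upper triangle holds 116·115/2 = 6670 values, and 6670 + 58 = 6728 = 116·58, which is why exactly 58 padding zeros are needed.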

A CNN [21] was trained and tested for learning a nonlinear relationship between an input, $\bar{S}$, and an output, a predicted score $y$ (Fig. 1). Figure 1 shows the architecture of the proposed CNN, a residual network with 18 layers consisting of 17 convolutional/batch normalization layers along with rectified linear units (ReLU) and 1 fully connected layer [21]. On the output layer, we applied a softmax function instead of ReLU to compute the final prediction probability. An adaptive learning rate approach for stochastic gradient descent [25] was utilized to update all layer parameters. Briefly, for a given $[\bar{S}_i, y_i]$ of the N training instances, our CNN model applies 2D convolution-batch normalization (BN)-ReLU operations and iteratively updates a series of 4 residual network layers with the following fully connected layer to minimize the root mean squared error, $\mathrm{RMSE}=\sqrt{\sum_{i=1}^{N}(t_i-y_i)^2/N}$, in the regression layer, where $t_i$ and $y_i$ are the measured target and predicted model scores, respectively. To minimize the RMSE, a stochastic gradient descent with momentum optimizer was used, where the initial values of the weights of the convolutional and fully connected layers were randomly generated from a Gaussian distribution with zero mean and 0.01 standard deviation. The initial biases were set at zero. All network weights were then updated in the direction of the negative gradient. The size of the mini-batch was 128, with 50 epochs. The learning rate η was modified every 10 epochs by multiplying the current learning rate by a factor of 0.1 to shorten the training time.
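As a small worked illustration of two quantities in this paragraph, the sketch below computes the RMSE loss and the step-wise learning-rate decay (multiply by 0.1 every 10 epochs). The function names `rmse` and `step_decay` are illustrative, and the snippet covers only these two formulas, not the network or the optimizer.

```python
import numpy as np

def rmse(t, y):
    """Root mean squared error between target scores t and predicted scores y."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sqrt(np.mean((t - y) ** 2)))

def step_decay(eta0, epoch, drop=0.1, every=10):
    """Learning rate at a given 0-based epoch: eta0 * drop^(epoch // every)."""
    return eta0 * drop ** (epoch // every)

# Learning rate at the start of each 10-epoch stage of a 50-epoch run
schedule = [step_decay(0.001, e) for e in range(0, 50, 10)]
```

Starting from η = 0.001, the rate falls by a factor of 10 at epochs 10, 20, 30, and 40, so the final stage runs at roughly 10⁻⁷.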

Figure 1.


Schematic representation of the proposed convolutional neural network (CNN) architecture, where each colored square represents a specific network layer. The brain connectome matrix, $\bar{S}_i$, of the ith patient is entered at the first stage, which is composed of two types of layers: a convolution layer and a batch normalization (BN) layer. The response of this stage is passed through a rectified linear unit (ReLU) layer. Then, the maxima of local patches are extracted by a max pool layer. Four blocks of convolution, BN, and ReLU layers are applied to learn high-level fine features from the low-level features of the connectome matrix $\bar{S}_i$ (i.e., the response of the first stage). For each residual unit, its input is added to the output before the ReLU layer. The basic idea is that, rather than expecting blocks to approximate the regression relationship directly, we explicitly let these layers approximate a residual function, which is easier to optimize. Finally, fully connected and regression layers are applied to obtain the predicted score $y_i$. An average pooling layer is also applied to help prevent overfitting.

2.3. Statistical analysis

For each of the two CNN systems predicting expressive and receptive scores, the optimal learning rate (η) and the connectivity matrix binary operation quantile (q) were found by incorporating a grid search in the repeated hold-out validation procedure. Thereby, the data of 36 subjects were randomly assigned to train and test the convergence of the CNN (i.e., 25 and 11 subjects for training and testing, respectively). The converged CNN was then used to predict the score of the remaining 37th subject for validation. Briefly, to prevent overfitting of the CNN in a small training cohort, we applied the synthetic minority over-sampling technique (SMOTE) [26], which enlarges the training and testing dataset by up to 1000 augmentations per patient. That is, each $[\bar{S}_i, t_i]$ of the 36 study patients was stacked in a 36×6729 matrix, where the ith row is a 1×6729 vector consisting of the 6728 elements of $\bar{S}_i$ and the scalar $t_i$. For each ith row vector, this procedure found 6 nearest neighbors (i.e., instances) and interpolated 1000 new instances randomly along lines connecting neighbors, resulting in an augmented dataset including 36036 instances of $[\bar{S}_i, t_i]$ for both training and testing the CNN (Supplementary Figure 1 shows representative examples of CNN convergence obtained from the repeated hold-out validation procedure).
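The interpolation step of this augmentation can be sketched in a few lines of NumPy. This is a simplified SMOTE-style resampler written for illustration, not the implementation used in the study: the function name `smote_augment`, the Euclidean distance, and the uniform random choice among the 6 neighbors are assumptions.

```python
import numpy as np

def smote_augment(X, k=6, n_new=1000, seed=0):
    """SMOTE-style augmentation of rows [features..., target].

    For every row, draw n_new synthetic rows on the line segments joining it
    to randomly chosen members of its k nearest neighbors (Euclidean distance).
    Returns the original rows followed by all synthetic rows.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)          # a row is not its own neighbor
    k = min(k, n - 1)
    neighbors = np.argsort(dist, axis=1)[:, :k]
    synthetic = []
    for i in range(n):
        picks = rng.choice(neighbors[i], size=n_new)   # neighbor per new row
        gaps = rng.random((n_new, 1))                  # position along the line
        synthetic.append(X[i] + gaps * (X[picks] - X[i]))
    return np.vstack([X] + synthetic)
```

Applied to a 36×6729 matrix with `n_new=1000`, the output has 36 + 36·1000 = 36036 rows, matching the instance count given above. Because the target score is stored in the last column, it is interpolated together with the connectivity features.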

At each grid coordinate (η, q), where 0.0005 ≤ η ≤ 0.002 and 0.25 ≤ q ≤ 0.65, the mean absolute error (MAE) and the standard deviation of the absolute error (SDAE) were measured between the actual score, $t_i$, and the predicted score, $y_i$. An optimal grid coordinate was selected to minimize MAE. Finally, (η, q) of the optimal grid coordinate was applied to train and test the CNN system. The probability of MAE less than one standard deviation of the normative CELF score (i.e., 15), P(MAE≤15), was measured in validation trials to evaluate the accuracy of the optimized prediction.
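The three performance metrics and the grid search can be sketched as below. Several details are assumptions made for illustration: SDAE is taken as the sample standard deviation of the absolute errors, P(MAE≤15) as the fraction of predictions whose absolute error is at most 15, and `evaluate` is a hypothetical callback standing in for a full train/validate run that returns the MAE at one grid coordinate.

```python
import numpy as np

def error_metrics(t, y, tol=15.0):
    """Return (MAE, SDAE, fraction of absolute errors <= tol)."""
    ae = np.abs(np.asarray(t, dtype=float) - np.asarray(y, dtype=float))
    return float(ae.mean()), float(ae.std(ddof=1)), float(np.mean(ae <= tol))

def grid_search(evaluate, etas, qs):
    """Return (best MAE, eta, q) over all grid coordinates.

    evaluate(eta, q) is a user-supplied function returning the validation MAE.
    """
    best = None
    for eta in etas:
        for q in qs:
            mae = evaluate(eta, q)
            if best is None or mae < best[0]:
                best = (mae, eta, q)
    return best
```

In practice `evaluate` would wrap the repeated hold-out procedure, which makes the search expensive: every grid coordinate requires retraining the network.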

To uncover which pair-wise connections were learned by the CNN to be predictive of language score, we used the method of saliency visualization [27], which computes the partial derivatives of the output of a CNN, $y_i$, with respect to the input, $\bar{S}_i$. For each type of language score $y_i$ (i.e., either expressive or receptive score), this method computes $\partial y_i / \partial \bar{S}_i(m,n)$ for every input edge (m,n). This partial derivative was averaged over all instances. Nodes with a larger sum of positive partial derivatives were determined to be more predictive of the language score $y_i$; these nodes have stronger pair-wise connections contributing to better prediction. Pearson’s correlation analysis and ANOVA group comparison between the actual score, $t_i$, and the predicted score, $y_i$, were performed at a significance threshold of p-value < 0.05 using SPSS 23 (https://www.ibm.com/analytics/spss-statistics-software).
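The saliency computation can be illustrated with a toy example. The sketch below swaps in a central finite-difference estimate for the backpropagated gradient used in saliency visualization, and it operates on a square edge matrix rather than the reshaped 116×58 input; `predict` is a hypothetical stand-in for a trained model, and both function names are illustrative.

```python
import numpy as np

def saliency(predict, S, eps=1e-4):
    """Central-difference estimate of d predict / d S[m, n] for every edge."""
    S = np.asarray(S, dtype=float)
    grad = np.zeros_like(S)
    for m in range(S.shape[0]):
        for n in range(S.shape[1]):
            up = S.copy()
            up[m, n] += eps
            down = S.copy()
            down[m, n] -= eps
            grad[m, n] = (predict(up) - predict(down)) / (2 * eps)
    return grad

def node_importance(grads):
    """grads: (n_instances, n, n) saliency maps.

    Average over instances, keep only positive derivatives, and sum the
    derivatives of all edges attached to each node.
    """
    mean_grad = np.asarray(grads).mean(axis=0)
    return np.clip(mean_grad, 0, None).sum(axis=1)

# Toy linear "model": the score is a weighted sum of edge strengths
W = np.array([[0.0, 2.0, -1.0], [2.0, 0.0, 0.5], [-1.0, 0.5, 0.0]])
toy_predict = lambda S: float((W * S).sum())
toy_grad = saliency(toy_predict, np.ones((3, 3)))
toy_nodes = node_importance(toy_grad[None])
```

For the linear toy model the estimated gradient equals `W`, so the node scores are the row sums of its positive entries. A deep network would normally obtain the same derivatives in one backward pass instead of two forward passes per edge.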

3. Results

3.1. Correlation and subgroup difference of patient variables

The expressive language score was strongly correlated with the receptive language score of the same patient (R=0.87, p-value < 0.001). No significant correlation was found between age and the two language scores (R=0.21/0.08, p-value=0.20/0.62 for an expressive/receptive score, respectively). No significant group effect of seizure onset side (left vs. right) was found on the two language scores (p-value=0.15/0.10 for an expressive/receptive score, respectively). However, we found a significant group effect by lobe: the presence of the epileptic focus (based on subsequent intracranial EEG recordings) in the temporal lobe (yes vs. no) was significantly associated with lower language scores (p-value=0.05 and 0.006; yes: 74.05±24.06 and 66±22.04, no: 89.25±21.88 and 87.38±22.15 for expressive and receptive score, respectively). No significant group effect of handedness (left vs. right) was found on either expressive or receptive language scores (p-value=0.80 and 0.80, respectively).

3.2. Prediction of CELF language score using the clinical variable

The expressive language scores were correlated with two clinical variables, including epilepsy duration (R=−0.36, p-value=0.029) and age of epilepsy onset (R=0.53, p-value<0.001). The epilepsy duration was significantly correlated with age of epilepsy onset (R = −0.69, p-value < 0.001). The receptive language scores were correlated with the total number of lobes involved by the seizure onset zone defined by subsequent intracranial EEG recording (R=−0.40, p-value=0.014) and age of epilepsy onset (R=0.56, p-value<0.001). Such significant correlations of the clinical variables provided a good-fit of linear regression to predict expressive language score (R=0.59, p-value=0.01) and receptive language score (R=0.59, p-value=0.01), respectively. The linear regression analysis yielded MAE=14.35/15.54, SDAE=11.32/10.93 and P(MAE≤15) = 0.60/0.51 for prediction of expressive/receptive scores (Table 2).

Table 2.

Comparison of the prediction performance obtained from linear regression of clinical variables, FCN and CNN of the DWI connectome data. Three metrics including MAE, SDAE and P(MAE≤15) were evaluated via the repeated hold-out validation procedure (n= 37).

| Model | Expressive MAE | Expressive SDAE | Expressive P(MAE≤15) | Receptive MAE | Receptive SDAE | Receptive P(MAE≤15) |
|---|---|---|---|---|---|---|
| Linear regression | 14.35 | 11.32 | 0.60 | 11.32 | 10.93 | 0.51 |
| FCN | 12.30±7.67 | 6.19±4.11 | 0.71±0.27 | 11.68±7.03 | 6.31±3.53 | 0.72±0.29 |
| CNN | 7.77±6.38 | 4.41±3.31 | 0.90±0.20 | 7.40±7.10 | 4.43±4.13 | 0.88±0.24 |

3.3. Optimization, comparison, and subgroup analysis of the CNN

Figure 2 presents the results of the grid search used to optimize two key parameters of the proposed CNN prediction systems, η and q. For comparison, the same dataset was analyzed by a fully connected network (FCN) that consists of the same fully connected layer and regression layer presented in Fig. 1. Both CNN and FCN were trained and tested with the same number of epochs (i.e., 50). At the global minima of MAE for prediction of expressive and receptive scores, CNN outperformed FCN (7.77 and 7.40 from CNN, 12.30 and 11.68 from FCN), indicating that high-level features extracted from convolution layers are more effective at minimizing prediction error in both expressive and receptive language scores.

Figure 2.


Quantile q and learning rate η (red circle) were optimized by grid search algorithm which minimizes mean absolute error (MAE) of the proposed CNN and fully connected network (FCN) to predict expressive and receptive language scores across 37 children with focal epilepsy.

In addition, compared to FCN, CNN provided a stronger linear correlation between actual and predicted scores (Fig. 3; R/p-value of CNN = 0.82/<0.001 and 0.75/<0.001 for expressive and receptive scores; R/p-value of FCN = 0.36/0.031 and 0.44/0.007 for expressive and receptive scores), yielding a much higher P(MAE≤15) to predict expressive and receptive scores within the normative standard deviation (Table 2; 0.90 and 0.88 from CNN, 0.71 and 0.72 from FCN).

Figure 3.


Significant linear correlations between the measured and predicted language scores obtained from 37 children with focal epilepsy. To predict the score of each child as a validation trial, we first applied the synthetic minority over-sampling technique (SMOTE) to augment the training data, $[\bar{S}_i, t_i]$, of the remaining 36 children. The trained CNN was then used to predict the score of the validation child, $y_i$, plotted on the y-axis.

Subsequent subgroup analyses (Table 3) found no significant interference of onset age, seizure frequency, epilepsy duration, the number of AED trials, handedness, and the presence of cortical dysplasia (on postoperative histopathology) on overall prediction performance of the proposed CNN using DWI connectome data (p-value > 0.08 for all subgroup comparisons), suggesting robust and stable performance of the CNN-based prediction across heterogeneous patient profiles.

Table 3.

Subgroup analysis obtained from the proposed CNN of DWI connectome data. The prediction performance metrics, including MAE, SDAE and P(MAE≤15), were assessed in different subgroups via the repeated hold-out validation procedure. None of the subgroup comparisons below showed a significant group effect on the performance metrics of expressive and receptive language scores (p-value > 0.08*).

| Subgroup | Expressive MAE | Expressive SDAE | Expressive P(MAE≤15) | Receptive MAE | Receptive SDAE | Receptive P(MAE≤15) |
|---|---|---|---|---|---|---|
| Onset age ≤ 5 y.o. (n=21) | 7.16±5.03 | 3.99±2.73 | 0.93±0.16 | 7.00±6.88 | 4.14±3.36 | 0.90±0.24 |
| Onset age > 5 y.o. (n=16) | 8.58±7.93 | 4.96±3.99 | 0.86±0.24 | 7.93±7.57 | 4.81±5.06 | 0.85±0.26 |
| Weekly or monthly seizure frequency (n=25) | 7.87±4.95 | 4.37±2.83 | 0.90±0.17 | 8.13±8.04 | 4.81±4.77 | 0.87±0.26 |
| Daily seizure frequency (n=12) | 7.56±8.94 | 4.50±4.23 | 0.90±0.25 | 5.89±4.48 | 3.64±2.23 | 0.90±0.22 |
| Duration ≤ 5 years (n=14) | 8.17±8.37 | 4.87±4.22 | 0.87±0.25 | 8.33±7.98 | 5.13±5.31 | 0.83±0.27 |
| Duration > 5 years (n=23) | 7.52±5.01 | 4.13±2.68 | 0.91±0.16 | 6.84±6.62 | 3.99±3.26 | 0.90±0.22 |
| Number of AED ≤ 2 (n=25) | 6.50±3.35 | 4.00±2.11 | 0.92±0.13 | 7.22±6.82 | 4.40±4.40 | 0.89±0.23 |
| Number of AED > 2 (n=12) | 10.41±9.88 | 5.25±2.21 | 0.84±0.29 | 7.79±7.95 | 4.50±3.65 | 0.85±0.28 |
| Right handedness (n=31) | 7.96±6.75 | 4.53±3.53 | 0.90±0.21 | 7.77±7.65 | 4.69±4.44 | 0.86±0.27 |
| Left handedness (n=6) | 6.76±4.22 | 3.78±1.99 | 0.90±0.15 | 5.50±2.55 | 3.10±1.35 | 0.98±0.03 |
| Without dysplasia (n=28) | 8.56±7.21 | 4.87±3.79 | 0.87±0.21 | 7.90±7.67 | 4.70±4.54 | 0.85±0.25* |
| With dysplasia (n=9) | 6.09±3.47 | 3.54±1.37 | 0.95±0.13 | 4.66±1.19 | 2.99±0.63 | 0.99±0.00* |

3.4. Mapping crucial hubs for prediction of expressive and receptive language function

In Figure 4, the sums of partial derivatives predictive of expressive and receptive language scores, averaged over the entire dataset (n=37), are plotted with corresponding edges (i.e., line segments) connecting nodes of AAL atlas regions in a 2-D circular connectogram. Here, a greater partial derivative sum indicates that thicker edges (i.e., higher streamline counts of $\bar{S}_i(m,n)$) contribute to a better score. For the prediction of the expressive language score, the right putamen and frontal regions appeared to be prominent hubs of important connections. That is, connection edges across the right thalamus, right putamen, and right caudate were found to be the most predictive of positive language outcomes. In contrast, the left superior parietal and right superior temporal regions appeared to be prominent hubs of important connections for the receptive language score. Edges connecting the left superior parietal gyrus-left postcentral gyrus, as well as the right superior temporal gyrus-right hippocampus, were the most predictive of language outcomes. In other words, weaker connectivities across these regions were predictive of worse language scores, suggesting language impairments. There were noticeable overlaps between the edges that are most predictive of expressive and receptive scores, including bilateral middle frontal regions and subcortical areas such as the right thalamus and right putamen.

Figure 4.


AAL brain regions learned by the CNN to be most predictive of the expressive language (top) and receptive language scores (bottom). Each 2D circular connectome presents a Circos ideogram available at http://mkweb.bcgsc.ca/tableviewer/. It shows AAL regions and their pair-wise edge strength, $\bar{S}_i(m,n)$, quantified by the thickness of individual strips, which are most predictive of the CNN-determined score, $y_i$. Very small magnitudes of positive derivatives (i.e., less than 5% of the maximum value) were omitted for clarity. On the 3D surface images (right panels), the size of each sphere indicates the sum of the positive partial derivatives of the score output, $y_i$, with respect to the edge strength, $\bar{S}_i(m,n)$. Here, a larger sphere indicates a region whose edge strengths are more predictive of the score. Note that edge strengths of the right putamen (PUT.R) were found to be the most predictive of expressive language scores, while edge strengths of the left superior parietal gyrus (SPG.L) were the most predictive of receptive language scores. A complete list of region names corresponding to the region labels is available in Table 1.

4. Discussion

This study investigated the clinical utility of deep learning technology to predict baseline language impairment in expressive and receptive domains in children with focal epilepsy. The CNN analysis accurately predicted expressive and receptive scores. The main observations support our hypothesis that diffusion tractography-based CNN analysis has an outstanding potential to serve as a biomarker of language function. Furthermore, concerning accuracy measures, our deep convolution models outperformed the fully connected network without relying on the large fully connected layer. This increased accuracy was found for real connectome data of independent study cohorts that were not included in the training and testing procedures of the proposed CNN systems. These results validate that our novel CNN systems are practically effective to learn key white matter connections associated with expressive and receptive language functions based on a relatively small number of pair-wise axonal connections.

Due to the oversimplification of the fiber orientation distribution modeling with a relatively low number of encodings [28], our fiber tracking may generate many spurious connections that result in false edges in $S_{ij}$ of the DWI connectome network. Thus, the present study first created a whole-brain backbone network, $S_{ij}$, with true positive edges satisfying $S_{ij} > 0$ [29]. Any connection having $S_{ij} > 0$ across all patients was assumed to belong to the whole-brain backbone network (i.e., true positive connections exist in all patients). Finally, the present study applied a statistical threshold, q, to control the density of the constructed backbone network and optimized it to maximize the accuracy of the proposed CNN (see Fig. 2). This statistical thresholding was originally proposed in the context of a z-test prior to any network analysis [30]. However, this thresholding is still challenging due to the lack of a proper null hypothesis [31]. Interestingly, in our recent work [32], which applied a statistical threshold, q, to control the network density in the same dataset, we found no significant effect on the accuracy of predicting language scores in the conventional framework of CNN. This observation indicates that the proposed CNN can provide a reliable means to predict language impairment without depending on a prefixed network density.

Language function is supported by widespread areas of the cortex and has multiple spatiotemporally distributed modules that serve a variety of linguistic functions. A recent study [33] reported that significant information sharing was identified by language-associated electrocorticography high-gamma modulation “within” frontal and temporo-parietal language cortices, and “between” classical language areas in the dominant hemisphere (i.e., Broca’s and Wernicke’s areas), suggesting that expressive and receptive language functions may communicate and share neural activity across multiple language regions. The present study supported this notion by showing the tight correlation between expressive and receptive language scores (i.e., R=0.87, p < 0.001). This correlation might be attributed to the information sharing between Broca’s and Wernicke’s areas. Our results also provide neurobiological evidence of spatial redundancy [34] between multiple language regions determined by our CNN to be most predictive of expressive and receptive scores (e.g., bilateral middle frontal gyri). However, the present study did not confirm the information sharing between CNN-defined expressive and receptive language regions using an electrophysiological diagnostic tool. Further studies combining cortico-cortical evoked potentials and diffusion tractography may provide a valuable opportunity to better understand both specializations of the association cortices [35] and axonal connections between CNN-defined hubs. In the future, we plan to incorporate DWI connectome-based analysis to optimally predict postoperative language outcomes.

Epilepsy likely causes a slower, progressive injury to the brain when compared with acute underlying disease etiologies (e.g., stroke or traumatic brain injury). Interictal epileptiform discharges may cause transient disruptions in local cortical function and likely also affect the function of more widespread networks [36,37]. Children with early epileptogenic lesions involving the left neocortical regions can reorganize expressive and receptive language function to the right hemisphere [38–41]. Chronic seizures, as well as interictal epileptiform discharges, account for the development of an “atypical” language distribution pattern on functional MRI [42,43]. The present study found a positive correlation between language scores and connections of networks involving regions that were not considered to be classic language areas (e.g., “right putamen” and “left superior parietal gyrus”). An explanation for this finding is that these regions may function as new or additional hubs to compensate for the altered language function. The putamen has traditionally been associated with reinforcement of learning and motor control, including speech articulation [44]. A recent meta-analytic connectivity study reported that right putamen activations would encompass regions involved in broader semantic language processes, such as memory and visual imagery [45], including the left middle occipital gyrus, left fusiform gyrus, left insula, left superior temporal gyrus, left precentral gyrus, left cerebellum and left lingual gyrus. Similarly, the left superior parietal area is suggested to be a part of the perisylvian language network [46]. Although Wernicke’s original description was of a temporal lobe language area, the term Wernicke’s area subsequently has been used to include inferior parietal areas, as well as posterior temporal areas encompassing the superior parietal lobe (BA 40) [47]. Thus, the increased axonal connectivity in the left superior parietal gyrus may reflect a compensatory mechanism in an attempt to preserve receptive language in children with focal epilepsy. This, however, could only be proven by longitudinal studies.

Despite our efforts to ensure methodological rigor, the present study has unavoidable limitations. First, this retrospective, observational work has a small sample size (n=37), which can make it difficult for an individual CNN model to learn the heterogeneous nature of high-dimensional connectome data. In particular, the effect of the number and location of foci could not be systematically evaluated due to the limited sample size. Although data augmentation was applied to alleviate this limitation, it was based on a randomly interpolated resampling procedure across the nearest neighbors, which may cause overfitting of an individual CNN. Thus, the reproducibility of each CNN should be evaluated on a new independent dataset obtained with more realistic augmentation strategies [48]. For example, synthetic instances can be sampled from an approximate distribution of positive instances that is pre-learned using conventional kernel-based density estimation. The dominant modes of variation for positive training instances can also be pre-learned using principal component analysis, then sampled to generate new instances. Our hold-out validation analysis, which treats a randomly selected patient as the validation cohort, showed the feasibility of the proposed CNN approach, but the accuracy will ultimately need to be tested and refined in a larger independent population including data sets from other institutions. In addition, this study found that the number of AED trials did not significantly confound the accuracy of the CNN in predicting language scores (i.e., the correlation coefficient between this variable and neuropsychological language score was non-significant, R = −0.30/−0.30, p-value = 0.10/0.10 for expressive and receptive score, respectively). This non-significant correlation may imply that the proposed CNN can accurately predict language scores even in children with focal epilepsy who are not candidates for epilepsy surgery. Future studies will explore the impact of AEDs on the DWI-based prediction model in a larger cohort of patients with variable severity of seizure burden. Also, the specific pattern of interhemispheric connectivity at CNN-determined language hubs, which may reflect the effect of epilepsy duration on axonal reorganization and functional crowding in the contralateral hemisphere [49], was not examined in the present study. Further studies of this effect are warranted to generalize and integrate our CNN prediction technology into clinical practice. Finally, the present study did not consider combining clinical variables with connectome features to improve the prediction analyses, especially longer duration of epilepsy and earlier epilepsy onset, given their significant association with language scores. It is anticipated that features from both sources could be combined to further improve the prediction accuracies.
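The augmentation concern raised in the limitations above can be made concrete with a small sketch. The following is a hypothetical, simplified illustration of nearest-neighbor interpolation in the style of SMOTE [26]; it is not the implementation used in this study. By assumption, it interpolates the language score together with the connectivity features. The function name `interpolate_neighbors`, the parameter `k`, and the toy data are all illustrative.

```python
import math
import random


def interpolate_neighbors(features, scores, n_new, k=3, seed=0):
    """Create synthetic (features, score) pairs by interpolating between a
    randomly chosen sample and one of its k nearest neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(features))
        # Rank all other samples by Euclidean distance to sample i.
        others = sorted(
            (j for j in range(len(features)) if j != i),
            key=lambda j: math.dist(features[i], features[j]),
        )
        j = rng.choice(others[:k])
        t = rng.random()  # interpolation weight in [0, 1)
        new_x = [a + t * (b - a) for a, b in zip(features[i], features[j])]
        new_y = scores[i] + t * (scores[j] - scores[i])
        synthetic.append((new_x, new_y))
    return synthetic


# Hypothetical toy data: four "patients", two connectivity features each.
toy_features = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [3.0, 1.0]]
toy_scores = [80.0, 90.0, 100.0, 110.0]
for x, y in interpolate_neighbors(toy_features, toy_scores, n_new=3, k=2):
    print([round(v, 2) for v in x], round(y, 1))
```

Every synthetic instance in this sketch lies on a line segment between two observed samples, so the augmented set adds no variation beyond the original cohort. This may be one reason such resampling could encourage overfitting, and why density-based or PCA-based sampling is suggested above as a more realistic alternative.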

In summary, the present study provides an anatomical blueprint underlying atypically developed language networks in children with focal epilepsy. This approach may lead to the refinement of imaging and neuropsychological phenotype relationships in a detailed manner and can have important clinical implications to optimize individualized therapeutic interventions for the treatment of language dysfunction in children with focal epilepsy.

Supplementary Material

1

Table 1.

Brain regions that significantly correlated with neuropsychological language scores and comprise the language-related subnetwork.

| Anatomical regions correlated with expressive language score | Label | Anatomical regions correlated with receptive language score | Label |
|---|---|---|---|
| Left temporal pole: superior temporal gyrus | TPOsup.L | Left temporal pole: superior temporal gyrus | TPOsup.L |
| Right thalamus | THA.R | Right thalamus | THA.R |
| Left thalamus | THA.L | Right superior temporal gyrus | STG.R |
| Right superior temporal gyrus | STG.R | Left superior parietal gyrus | SPG.L |
| Right superior parietal gyrus | SPG.R | Left supplementary motor area | SMA.L |
| Right superior frontal gyrus, medial | SFGmed.R | Right superior frontal gyrus, medial | SFGmed.R |
| Left superior frontal gyrus, medial | SFGmed.L | Left superior frontal gyrus, dorsolateral | SFGdor.L |
| Left superior frontal gyrus, dorsolateral | SFGdor.L | Left postcentral gyrus | PoCG.L |
| Right rolandic operculum | ROL.R | Right putamen | PUT.R |
| Right gyrus rectus | REC.R | Left precuneus | PCUN.L |
| Left postcentral gyrus | PoCG.L | Left paracentral lobule | PCL.L |
| Right lenticular nucleus, putamen | PUT.R | Left lenticular nucleus, pallidum | PAL.L |
| Left precuneus | PCUN.L | Left superior frontal gyrus, orbital part | ORBsup.L |
| Left posterior cingulate gyrus | PCG.L | Right middle temporal gyrus | MTG.R |
| Left olfactory cortex | OLF.L | Left middle temporal gyrus | MTG.L |
| Right middle temporal gyrus | MTG.R | Right middle occipital gyrus | MOG.R |
| Left middle temporal gyrus | MTG.L | Right middle frontal gyrus | MFG.R |
| Right middle occipital gyrus | MOG.R | Left middle frontal gyrus | MFG.L |
| Left middle occipital gyrus | MOG.L | Right inferior frontal gyrus, triangular part | IFGtriang.R |
| Right middle frontal gyrus | MFG.R | Left inferior frontal gyrus, triangular part | IFGtriang.L |
| Left middle frontal gyrus | MFG.L | Left inferior frontal gyrus, opercular part | IFGoperc.L |
| Right inferior temporal gyrus | ITG.R | Right hippocampus | HIP.R |
| Left inferior parietal gyrus | IPL.L | Right fusiform gyrus | FFG.R |
| Right insula | INS.R | Right calcarine fissure | CAL.R |
| Left median cingulate gyrus | DCG.L | | |
| Right caudate nucleus | CAU.R | | |
| Left caudate nucleus | CAU.L | | |
| Left calcarine fissure | CAL.L | | |
| Right anterior cingulate gyrus | ACG.R | | |
| Left anterior cingulate gyrus | ACG.L | | |

Highlights

  • Deep learning of DWI connectome predicts baseline language scores

  • The predicted and actual scores are significantly correlated at p-value < 0.001

  • Right putamen and left superior parietal cortex are predictive of language scores

  • Our method benefits cases in which neuropsychology fails to assess language scores

Acknowledgments

The authors would like to thank all participants and their families for their time and interest in this study. This study was funded by grants from the National Institutes of Health (R01-NS089659 to J.J. and R01-NS064033 to E.A.).

Funding source:

This work was supported by the National Institutes of Health, [NS089659 and NS064033, 2020].

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Declaration of competing interest

The authors declare no conflicts of interest in relation to this study.

References

  • [1] Hermann BP, Bell B, Seidenberg M, Woodard A. Learning disabilities and language function in epilepsy. Epilepsia 2001;42:21–23.
  • [2] Robinson RJ. The causes of language disorder: introduction and overview. In: Martin J, Fletcher P, Grunwell P, Hall D, editors. Proceedings of the first international symposium on specific speech and language disorders in children. Volume 29. London: Association for All Speech Impaired Children; 1987. p. 1–19.
  • [3] Robinson RJ. Causes and associations of severe and persistent specific speech and language disorders in children. Dev Med Child Neurol 1991;33:943–962.
  • [4] Parkinson GM. High incidence of language disorder in children with focal epilepsies. Dev Med Child Neurol 2002;44:533–537.
  • [5] Baumer FM, Cardon AL, Porter BE. Language dysfunction in pediatric epilepsy. J Ped 2018;194:13–21.
  • [6] Caplan R, Siddarth P, Vona P, et al. Language in pediatric epilepsy. Epilepsia 2009;50:2397–2407.
  • [7] Hustad KC, Sakash A, Broman AT, Rathouz PJ. Longitudinal growth of receptive language in children with cerebral palsy between 18 months and 54 months of age. Dev Med Child Neurol 2018;60:1156–1164.
  • [8] Downing JE, Perino DM. Functional versus standardized assessment procedures: implications for educational programming. Ment Retard 1992;30:289–295.
  • [9] Sattler JM. Assessment of children: Cognitive applications (4th ed. Vol. 1). Jerome M Sattler publisher, San Diego, CA, 2001.
  • [10] Bishop DVM, Snowling MJ, Thompson PA, Greenhalgh T, and the CATALISE-2 consortium. Phase 2 of CATALISE: a multinational and multidisciplinary Delphi consensus study of problems with language development: Terminology. J Child Psychol Psychiatry 2017;58(10):1068–1080.
  • [11] Shevell M, Ashwal S, Donley D, et al. Practice parameter: evaluation of the child with global developmental delay: report of the Quality Standards Subcommittee of the American Academy of Neurology and The Practice Committee of the Child Neurology Society. Neurology 2003;60:367–380.
  • [12] Hickok G, Poeppel D. The cortical organization of speech processing. Nat Rev Neurosci 2007;8:393–402.
  • [13] Dick AS, Bernal B, Tremblay P. The language connectome: new pathways, new concepts. Neuroscientist 2014;20:453–467.
  • [14] Nakai Y, Sugiura A, Brown EC, et al. Four-dimensional functional cortical maps of visual and auditory language: Intracranial recording. Epilepsia 2019;60:255–267.
  • [15] Paslawski T. The clinical evaluation of language fundamentals, (CELF-4) a review. Can J Sch Psychol 2005;20:129–134.
  • [16] Jeong JW, Sundaram S, Behen ME, Chugani HT. Differentiation of speech delay and global developmental delay in children using DTI tractography-based connectome. Am J Neuroradiol 2016;37:1170–1177.
  • [17] Jeong JW, Asano E, Juhász C, Behen ME, Chugani HT. Postoperative axonal changes in the contralateral hemisphere in children with medically refractory epilepsy: A longitudinal diffusion tensor imaging connectome analysis. Hum Brain Mapp 2016;37:3946–3956.
  • [18] Ciresan D, Giusti A, Gambardella LM, Schmidhuber J. Deep neural networks segment neuronal membranes in electron microscopy images. Adv Neural Inf Process Syst 2012:2843–2851.
  • [19] Ciresan D, Giusti A, Gambardella LM, Schmidhuber J. Mitosis detection in breast cancer histology images with deep neural networks. In: International Conference on Medical Image Computing and Computer-assisted Intervention: Springer, Berlin, Heidelberg; 2013. p. 411–418.
  • [20] Roth HR, Lu L, Farag A, et al. DeepOrgan: Multi-level deep convolutional networks for automated pancreas segmentation. In: International Conference on Medical Image Computing and Computer-assisted Intervention: Springer, Cham; 2015. p. 556–564.
  • [21] Xu H, Dong M, Lee MH, O’Hara N, Asano E, Jeong JW. Objective detection of eloquent axonal pathways to minimize postoperative deficits in pediatric epilepsy surgery using diffusion tractography and convolutional neural networks. IEEE Trans Med Imaging 2019; in press. doi: 10.1109/TMI.2019.2902073
  • [22] Semel E, Wiig EH, Secord WA. Clinical evaluation of language fundamentals. San Antonio, TX: Psychological Corporation; 2003.
  • [23] Tournier J-D, Calamante F, Connelly A. Improved probabilistic streamlines tractography by 2nd order integration over fibre orientation distributions. In: Proceedings of the international society for magnetic resonance in medicine. Volume 18. Stockholm, Sweden; 2010. p. 1670.
  • [24] Smith RE, Tournier J-D, Calamante F, Connelly A. Anatomically-constrained tractography: improved diffusion MRI streamlines tractography through effective use of anatomical information. Neuroimage 2012;62:1924–1938.
  • [25] Kingma DP, Ba J. Adam: A method for stochastic optimization. 2014. https://arxiv.org/abs/1412.6980
  • [26] Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: Synthetic Minority Over-sampling Technique. J Artif Intell Res 2002;16:321–357.
  • [27] Simonyan K, Vedaldi A, Zisserman A. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034, 2013.
  • [28] Tournier JD, Calamante F, Connelly A. Determination of the appropriate b value and number of gradient directions for high-angular-resolution diffusion-weighted imaging. NMR Biomed 2013;26(12):1775–1786.
  • [29] Gong G, He Y, Concha L, et al. Mapping anatomical connectivity patterns of human cerebral cortex using in vivo diffusion tensor imaging tractography. Cereb Cortex 2009;19:524–536.
  • [30] Cheng H, Wang Y, Sheng J, Kronenberger WG, Mathews VP. Characteristics and variability of structural networks derived from diffusion tensor imaging. Neuroimage 2012;61:1153–1164.
  • [31] Jbabdi S, Johansen-Berg H. Tractography: where do we go from here? Brain Connect 2011;1:169–182.
  • [32] Banerjee S, Dong M, Lee MH, et al. Deep relational reasoning for the prediction of language impairment and postoperative seizure outcome using preoperative DWI connectome data of children with focal epilepsy. IEEE Trans Med Imaging 2020. doi: 10.1109/TMI.2020.3036933
  • [33] Arya R, Ervin B, Wilson JA, et al. Development of information sharing in language neocortex in childhood-onset drug-resistant epilepsy. Epilepsia 2019;60:393–405.
  • [34] Schomers MR, Garagnani M, Pulvermüller F. Neurocomputational consequences of evolutionary connectivity changes in perisylvian language cortex. Journal of Neuroscience 2017;37:3045–3055.
  • [35] Catani M, ffytche DH. The rises and falls of disconnection syndromes. Brain 2005;128:2224–2239.
  • [36] Brown EC, Matsuzaki N, Asano E. The transient effect of interictal spikes from a frontal focus on language-related gamma activity. Epilepsy Behav 2012;24:497–502.
  • [37] Federico P, Archer JS, Abbott DF, Jackson GD. Cortical/subcortical BOLD changes associated with epileptic discharges: an EEG-fMRI study at 3T. Neurology 2005;64:1125–1130.
  • [38] Yuan W, Szaflarski JP, Schmithorst VJ, et al. fMRI shows atypical language lateralization in pediatric epilepsy patients. Epilepsia 2006;47:593–600.
  • [39] Rasmussen T, Milner B. The role of early left-brain injury in determining lateralization of cerebral speech functions. Ann N Y Acad Sci 1977;299:355–369.
  • [40] Akanuma N, Alarcón G, Lum F, et al. Lateralising value of neuropsychological protocols for presurgical assessment of temporal lobe epilepsy. Epilepsia 2003;44:408–418.
  • [41] Möddel G, Lineweaver T, Schuele SU, Reinholz J, Loddenkemper T. Atypical language lateralization in epilepsy patients. Epilepsia 2009;50:1505–1516.
  • [42] Adcock JE, Wise RG, Oxbury JM, Oxbury SM, Matthews PM. Quantitative fMRI assessment of the differences in lateralization of language-related brain activation in patients with temporal lobe epilepsy. Neuroimage 2003;18:423–438.
  • [43] Brázdil M, Chlebus P, Mikl M, Pazourková M, Krupa P, Rektor I. Reorganization of language-related neuronal networks in patients with left temporal lobe epilepsy - an fMRI study. Eur J Neurol 2005;12:268–275.
  • [44] Abutalebi J, Green D. Neuroimaging of language control in bilinguals: Neural adaptation and reserve. Biling-Lang Cogn 2016;19:689–698.
  • [45] Viñas-Guasch N, Wu YJ. The role of the putamen in language: a meta-analytic connectivity modeling study. Brain Struct and Funct 2017;222:3991–4004.
  • [46] Catani M, Jones DK, ffytche DH. Perisylvian language networks of the human brain. Ann Neurol 2005;57:8–16.
  • [47] Aboitiz F, García VR. The evolutionary origin of the language areas in the human brain. A neuroanatomical perspective. Brain Res Rev 1997;25:381–396.
  • [48] Brown CJ, Miller SP, Booth BG, et al. Prediction of motor function in very preterm infants using connectome features and local synthetic instances. In: International Conference on Medical Image Computing and Computer-Assisted Intervention: Springer, Cham; 2015.
  • [49] Kim JA, Jeong JW, Behen ME, et al. Metabolic correlates of cognitive function in children with unilateral Sturge-Weber syndrome: Evidence for regional functional reorganization and crowding. Hum Brain Mapp 2018;39:1596–1606.
