Abstract
Objective
To determine the predictive power of white matter neuronal networks (i.e., structural connectomes [SCs]) in discriminating memory-impaired patients with temporal lobe epilepsy (TLE) from those with normal memory.
Methods
T1- and diffusion MRI (dMRI), clinical variables, and neuropsychological measures of verbal memory were available for 81 patients with TLE. Prediction of memory impairment was performed with a tree-based classifier (XGBoost) for 4 models: (1) a clinical model including demographic and clinical features, (2) a hippocampal volume (HCV) model, (3) a tract model including 5 temporal lobe white matter association tracts derived from a dMRI atlas, and (4) an SC model based on dMRI. SCs were derived by extracting cortical-cortical connections from a temporal lobe subnetwork with probabilistic tractography. Principal component (PC) analysis was then applied to reduce the dimensionality of the SC, yielding 10 PCs. Multimodal models were also tested combining SCs and tracts with HCV. Each model was trained on 48 patients from 1 epilepsy center and tested on 33 patients from a different center.
Results
The multimodal SC + HCV model yielded the highest classification accuracy (81%; 0.90 sensitivity; 0.67 specificity), outperforming the clinical model (61%; p < 0.001) and HCV model (66%; p < 0.001). In addition, the unimodal SC model (76% accuracy) and tract model (73% accuracy) outperformed the clinical model (p < 0.001) and HCV model (p < 0.001) for classifying patients with TLE with and without memory impairment. Furthermore, the SC model identified short-range temporal-temporal connections as important contributors to memory performance.
Conclusion
SCs and tract-based models are stronger predictors of memory impairment in TLE than HCVs and clinical variables. However, SCs may provide additional information about local cortical-cortical connectivity contributing to memory that is not captured in large association tracts.
Verbal memory impairment is the most frequent and disabling cognitive comorbidity of temporal lobe epilepsy (TLE), affecting up to 50% of patients.1–3 In addition, patients who undergo epilepsy surgery are at further risk of memory decline.4 Although hippocampal volume (HCV) loss has traditionally been considered a surrogate marker for memory impairment in TLE, there is significant variability in the nature and severity of impairments observed, with many patients with a sclerotic hippocampus demonstrating intact memory.5
There is emerging evidence that memory impairment in TLE is also determined by damage to both long-range6 and short-range7 white matter connections within the medial temporal lobe (MTL). In fact, temporal lobe white matter integrity may provide a stronger marker of memory impairment compared with HCV loss.6,8–10 For this reason, a comprehensive and individualized map of brain structural connectivity—the structural connectome (SC)—may enable more detailed mapping of the neural architecture involved in memory and reveal how subtle network alterations lead to memory impairments.11,12 To date, SCs have been used to predict seizure outcomes,13–15 but no studies have used SCs to predict memory impairments in TLE.
In this study, we evaluated whether a cortical-cortical white matter SC could identify aberrant network connections that predict memory impairment in TLE. We compared the performance of this SC to performance obtained with clinical variables, HCV, and conventional MTL association tracts. We hypothesized that the SC would lead to better classification accuracy compared to these other measures and reveal the anatomic topology of neuronal networks associated with memory performance in TLE.
Methods
Standard protocol approvals, registrations, and patient consents
This study was approved by the Institutional Review Boards at the University of California, San Diego (UCSD) and University of California, San Francisco (UCSF). All participants provided informed consent according to the Declaration of Helsinki.
Subjects
We recruited 81 patients with medically refractory TLE who met inclusion criteria for the study (n = 48 from UCSD; n = 33 from UCSF). Inclusion criteria for patients included a TLE diagnosis by a board-certified neurologist with expertise in epileptology, in accordance with the criteria defined by the International League Against Epilepsy and based on video-EEG telemetry, seizure semiology, and neuroimaging evaluation. MRIs were visually inspected by a board-certified neuroradiologist for detection of mesial temporal sclerosis (MTS) and the exclusion of contralateral temporal lobe structural abnormalities. MRI findings suggested the presence of ipsilateral MTS in 41 patients. Patients were excluded if there was evidence of large structural lesions or visible extrahippocampal pathology on clinical MRI. Sixty-one healthy controls, sex- and age-matched to the patient population, were also recruited. Healthy controls were excluded if they self-reported any history of neurologic or psychiatric conditions.
Neuropsychological testing
Neuropsychological data were available for all patients and healthy controls. Verbal memory was assessed with the California Verbal Learning Test–Second Edition16 Long Delay Free Recall trial and the Wechsler Memory Scale–Third Edition Logical Memory Delayed Recall and Verbal Paired Associates17 Delayed Recall trials. For each test, patients' raw scores were converted into z scores on the basis of the distribution of healthy controls. Impairment on an individual test was defined as ≥1.5 SDs below the mean of healthy controls, comparable to the cutoff used in most clinical settings. As described elsewhere,7 patients were classified as memory impaired if they were impaired on at least 2 of 3 tests. Of UCSD patients, 54% were classified as memory impaired, while 64% of UCSF patients were classified as impaired. All other patients were classified as not impaired. Demographic and clinical characteristics of the memory impaired and not impaired patient groups are provided in table 1.
Table 1.
Demographics and clinical variables
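The impairment rule described under Neuropsychological testing (z scores against healthy controls, a cutoff of 1.5 SDs below the control mean, impairment on at least 2 of 3 tests) can be sketched as follows. This is a minimal illustration, not the authors' code: the test keys and all raw scores are hypothetical, and the use of the sample SD of controls is an assumption, since the text does not say which SD estimator was used.

```python
from statistics import mean, stdev

CUTOFF_SD = -1.5         # impaired on a test if z <= -1.5 (from the text)
MIN_IMPAIRED_TESTS = 2   # impaired overall if impaired on >= 2 of 3 tests


def z_score(raw, control_scores):
    """Standardize one raw score against the healthy-control distribution.

    Assumption: sample SD (n - 1 denominator) of the controls.
    """
    return (raw - mean(control_scores)) / stdev(control_scores)


def is_memory_impaired(patient_raw, controls_raw):
    """Return True if the patient is impaired on at least 2 of the tests."""
    n_impaired = sum(
        z_score(patient_raw[test], controls_raw[test]) <= CUTOFF_SD
        for test in patient_raw
    )
    return n_impaired >= MIN_IMPAIRED_TESTS


# Hypothetical control and patient raw scores (illustration only).
controls = {
    "cvlt_ldfr": [10, 12, 14, 11, 13],
    "lm_delayed": [20, 24, 28, 22, 26],
    "vpa_delayed": [5, 6, 7, 6, 6],
}
patient_a = {"cvlt_ldfr": 8, "lm_delayed": 15, "vpa_delayed": 6}   # low on 2 tests
patient_b = {"cvlt_ldfr": 11, "lm_delayed": 15, "vpa_delayed": 6}  # low on 1 test

print(is_memory_impaired(patient_a, controls))
print(is_memory_impaired(patient_b, controls))
```

Requiring impairment on two tests rather than one makes the label less sensitive to a single poor test score, which is the rationale the authors give in the Discussion.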
Image acquisition
Brain imaging for all patients was performed on a General Electric Discovery MR750 3T scanner (General Electric, Fairfield, CT) with an 8-channel phased-array head coil at the Center for Functional MRI at UCSD or the Surbeck Laboratory for Advanced Imaging at UCSF. Image acquisitions were identical at both centers and included a conventional 3-plane localizer, GE calibration scan, a T1-weighted 3D structural scan (repetition time 8.08 milliseconds, echo time 3.16 milliseconds, inversion time 600 milliseconds, flip angle 8°, field of view 256 mm, matrix 256 × 192, slice thickness 1 mm isotropic), and a single-shot pulsed-field gradient spin-echo echo planar imaging sequence (echo time/repetition time 96 milliseconds/17 seconds; field of view 24 cm, matrix 128 × 128 × 48; axial). Diffusion images were acquired with b = 0 and b = 1,000 s/mm2 with 30 diffusion gradient directions. Two additional b = 0 volumes were acquired with either forward or reverse phase-encode polarity for use in nonlinear B0 correction.
Image processing
Structural MRI processing
Automatic segmentation of the right and left HCV was performed with FreeSurfer (version 5.3) using the structural T1-weighted images. The segmentations were visually inspected to ensure correct labeling of the hippocampus. To control for differences in brain size, HCV was represented as a ratio to total intracranial volume.
Diffusion tensor imaging processing
Preprocessing of the diffusion data included corrections for distortions due to magnetic susceptibility (B0), eddy currents, and gradient nonlinearities; head motion correction; and registration to the T1-weighted structural image. For B0 distortion correction, a reverse gradient method was used.18 A detailed description of the image processing is provided elsewhere.19 Diffusion tensor imaging–derived fractional anisotropy was calculated on the basis of a tensor fit to the b = 1,000 data.
Fiber tract calculations
Fiber tract values were derived with a probabilistic diffusion tensor atlas (i.e., AtlasTrack). AtlasTrack is a fully automated method for labeling fiber tracts in individual participants based on diffusion-weighted images, T1-weighted images, and a probabilistic atlas of fiber tract locations and orientations. For each participant, the T1-weighted structural images were nonlinearly registered to a common space, and the respective diffusion tensor orientation estimates were compared to the atlas. This resulted in a map of the relative probability that a voxel belongs to a particular tract given its location and similarity of diffusion orientation. Voxels identified with FreeSurfer 5.3.0 as CSF or gray matter were excluded from the fiber regions of interest (ROIs). Average fractional anisotropy was calculated for each fiber ROI, weighted by fiber probability, so that voxels with low probability of belonging to a given fiber contributed minimally to average values. A full description of the atlas and detailed steps used to create the atlas are provided elsewhere.20 Specific tracts included in the current analyses are described below and visualized in figure 1A.
Figure 1. Deep white matter tracts and cortical regions of interest.
(A) Sagittal and axial renderings of the fornix (FX), uncinate fasciculus (UNC), inferior longitudinal fasciculus (ILF), inferior frontal occipital fasciculus (IFOF), and parahippocampal cingulum (PHC) from AtlasTrack projected onto a T1-weighted image for a single individual. The corpus callosum is portrayed in light gray to provide additional spatial information. (B) Surface parcellations of the 49 bihemispheric regions of interest based on the Desikan-Killiany atlas that were used to generate the structural connectomes. (C) An example raw data connectivity matrix and an example of principal component analysis weights matrix of a patient in the study.
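The probability-weighted averaging of fractional anisotropy described for the fiber ROIs can be illustrated with a short sketch. The function name and the voxel values are hypothetical; the sketch assumes that CSF and gray matter voxels have already been excluded and shows only the weighting step.

```python
def weighted_mean_fa(fa_values, tract_probs):
    """Average FA over voxels, weighting each voxel by its tract probability.

    Voxels with a low probability of belonging to the tract contribute
    little to the result.
    """
    total_prob = sum(tract_probs)
    if total_prob == 0:
        raise ValueError("tract probabilities sum to zero")
    weighted_sum = sum(fa * p for fa, p in zip(fa_values, tract_probs))
    return weighted_sum / total_prob


# Hypothetical voxels: two core-tract voxels and one low-probability voxel.
fa = [0.60, 0.50, 0.20]
prob = [0.90, 0.80, 0.05]

weighted = weighted_mean_fa(fa, prob)
unweighted = sum(fa) / len(fa)
print(round(weighted, 4), round(unweighted, 4))
```

In this toy example the low-FA voxel at the tract margin pulls the unweighted mean down much more than the weighted mean, which is the behavior the weighting is meant to achieve.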
SC generation
The FMRIB Diffusion Toolbox (FDT), part of the FMRIB Software Library, was used for the connectome-based tractography.21,22 This method differs from AtlasTrack in that it does not require the diffusion estimates to conform to a predetermined atlas, instead calculating connection strength values between ROIs using probabilistic fiber tracking.21,22 This was performed using PROBTRACKX2 in FDT with the following parameters: 5,000 samples, 2,000 steps per sample, 0.5-mm step length, 0.2 curvature threshold, and loop checking enabled on paths. Path distributions were also corrected for the inherent linear bias toward longer pathways in tractography algorithms.23 A full description of the FDT tractography implementation is available elsewhere.21,22
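For orientation, the stated tracking parameters map onto PROBTRACKX2 command-line options roughly as sketched below. This only assembles an argument list and does not run anything. The file paths are hypothetical placeholders, the use of `--network` for ROI-to-ROI tracking is an assumption, and the authors' exact invocation is not given in the text.

```python
def probtrackx2_args(samples_basename, mask, seeds_file, out_dir):
    """Build a probtrackx2 argument list matching the parameters in the text.

    All paths are supplied by the caller; the values used below are
    hypothetical placeholders.
    """
    return [
        "probtrackx2",
        f"--samples={samples_basename}",  # basename of the fitted diffusion samples
        f"--mask={mask}",                 # brain mask in diffusion space
        f"--seed={seeds_file}",           # text file listing the seed ROIs
        "--network",                      # assumption: ROI-by-ROI network mode
        "--nsamples=5000",                # 5,000 samples
        "--nsteps=2000",                  # 2,000 steps per sample
        "--steplength=0.5",               # 0.5-mm step length
        "--cthr=0.2",                     # 0.2 curvature threshold
        "--loopcheck",                    # loop checking on paths
        "--pd",                           # correct path distribution for pathway length
        f"--dir={out_dir}",
    ]


args = probtrackx2_args(
    "subj01.bedpostX/merged",            # hypothetical
    "subj01.bedpostX/nodif_brain_mask",  # hypothetical
    "subj01_rois.txt",                   # hypothetical
    "subj01_probtrackx",                 # hypothetical
)
print(" ".join(args))
```

The sampling, step, and curvature values reported in the text coincide with the PROBTRACKX2 defaults, so listing them explicitly mainly documents the configuration.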
Cortical seed region generation
PROBTRACKX2 generates connectivity distributions from user-specified seed ROIs, in which voxels have values representing the number of streamlines (i.e., connection strength values) passing through them from the specified seed ROIs. The cortical ROIs fed to PROBTRACKX2 were acquired from the FreeSurfer automatic cortical parcellation process applied to T1-weighted images.24 The initial parcellation was performed with the Desikan-Killiany (DK)25 atlas in each participant’s own diffusion space. Because the DK atlas contains multiple ROIs that are single gyri oriented in 1 direction in the temporal and frontal lobes, which are important for memory, the following DK ROIs were split orthogonally to the long axis of their parcellation in each individual participant using the FreeSurfer mris_divide_parcellation: middle temporal, superior temporal, inferior temporal, fusiform, postcentral, precentral, middle frontal, and superior frontal. The resulting atlas contained 98 ROIs (49 for each hemisphere; figure 1B). To create the initial complete 98 × 98 symmetric connectivity matrix, the connectivity between each pair of source and destination ROIs was averaged. These connectivity values were also normalized by the sum of the number of voxels of the source and destination ROIs to account for differences in head size between participants. In summary, the structural connectivity is defined as the number of probabilistic streamlines reaching ROI A when ROI B was seeded, averaged with the opposite direction, divided by the number of voxels in ROIs A and B, and corrected by the distance traveled by the fibers.
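The summary definition of structural connectivity above can be written out as a small sketch: average the two seeding directions, then divide by the summed voxel counts of the two ROIs. The streamline counts and ROI sizes are hypothetical, and the sketch assumes that the distance correction has already been applied by the tractography step.

```python
def structural_connectivity(streamlines, n_voxels):
    """Build a symmetric, size-normalized connectivity matrix.

    streamlines[a][b]: streamline count reaching ROI b when ROI a was seeded
    (assumed already distance-corrected). n_voxels[a]: voxel count of ROI a.
    """
    n = len(n_voxels)
    sc = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            # Average the A->B and B->A directions.
            averaged = (streamlines[a][b] + streamlines[b][a]) / 2
            # Normalize by the summed size of the two ROIs.
            sc[a][b] = sc[b][a] = averaged / (n_voxels[a] + n_voxels[b])
    return sc


# Hypothetical 3-ROI example.
streamlines = [
    [0, 120, 10],
    [80, 0, 40],
    [30, 20, 0],
]
n_voxels = [100, 150, 50]

sc = structural_connectivity(streamlines, n_voxels)
print(sc[0][1], sc[1][2])
```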
Connectome feature reduction
The initial connectivity matrices were symmetric 98 × 98 matrices; taking the upper triangle (excluding the diagonal) yielded 4,753 values per patient. To reduce the dimensionality of the data, we first restricted the analysis to connections including at least 1 ROI in the left or right temporal lobe (figure 1C). This temporal lobe subnetwork was selected for analysis because TLE has been shown to affect connectivity both within the temporal lobe and between temporal and extratemporal regions,26 and these connections are frequently implicated in memory.27,28 This left 2,737 values per patient. Second, principal component analysis (PCA) was performed on this subnetwork of connections for further dimensionality reduction. PCA is an unsupervised learning algorithm that finds a set of eigenvectors onto which the original data can be mapped with minimal loss of variability. These directions are known as the principal components (PCs). The number of PCs that yielded maximum accuracy was determined by systematically tuning the internal parameters of the machine learning model to arrive at a set of parameters that yielded maximum performance (i.e., hyperparameter optimization). This optimization led to a choice of 10 PCs.
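The feature counts in this paragraph can be checked with simple combinatorics. The sketch below assumes that self-connections are excluded and that "at least 1 temporal ROI" is the only filter applied. The number of temporal ROIs is not stated in the text; the code only searches for the value that reproduces the reported count of 2,737.

```python
from math import comb

N_ROIS = 98
REPORTED_SUBNETWORK_EDGES = 2737  # value reported in the text

# Unique ROI pairs in the upper triangle of a 98 x 98 symmetric matrix.
full_upper = comb(N_ROIS, 2)


def n_subnetwork_edges(n_rois, n_temporal):
    """Pairs with at least one temporal ROI = all pairs minus
    pairs in which both ROIs are non-temporal."""
    return comb(n_rois, 2) - comb(n_rois - n_temporal, 2)


# Which temporal ROI counts are consistent with the reported edge count?
matching = [
    t for t in range(N_ROIS + 1)
    if n_subnetwork_edges(N_ROIS, t) == REPORTED_SUBNETWORK_EDGES
]
print(full_upper, matching)
```

Under these assumptions the reported numbers are internally consistent with 34 temporal ROIs in total, which would be 17 per hemisphere; this is an inference from the arithmetic rather than a figure given by the authors.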
Memory impairment prediction models
XGBoost is a type of decision tree algorithm that is widely used in many machine learning tasks.29 Similar to other decision tree models such as random forest, XGBoost constructs many shallow trees (i.e., weak learners) that each by itself does not provide optimal classification results. However, by ensembling these weak learners, it is possible to achieve good classification performance. XGBoost improves on the random forest algorithm by using gradient boosting to minimize the training error, thereby focusing on the mistakes made by the previous trees and correcting the internal model to account for these outliers. Furthermore, XGBoost introduces regularization terms, which protect against overfitting to the training data by making the model more conservative and simpler.
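The two ideas in this paragraph, additive correction of earlier trees and a penalty on model complexity, can be written compactly. The notation below follows the standard published XGBoost formulation and is not taken from this study: $l$ is a differentiable loss, $f_t$ is the tree added at round $t$, $\eta$ is the learning rate, $T$ is the number of leaves in a tree, $w_j$ are its leaf weights, and $\gamma$ and $\lambda$ are regularization hyperparameters.

```latex
\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + \eta\, f_t(x_i), \qquad
\mathcal{L}^{(t)} = \sum_{i=1}^{n} l\!\left(y_i,\ \hat{y}_i^{(t)}\right) + \Omega(f_t), \qquad
\Omega(f) = \gamma T + \tfrac{1}{2}\lambda \sum_{j=1}^{T} w_j^{2}
```

Each new tree is fit to reduce the loss left by the current ensemble, while the term $\Omega$ penalizes trees with many leaves or large leaf weights, which is what makes the model "more conservative and simpler."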
We tested 6 models by training an XGBoost classifier on UCSD patients (i.e., training set) and validating the model on UCSF patients (i.e., testing set). We chose not to perform cross-validation because an external independent dataset was available for testing. XGBoost models were created with the following 6 sets of features:
1. Clinical variables (clinical model). We included the following clinical variables in our model: age, education, sex, handedness, MTS status, side of seizure onset, age at onset, number of antiepileptic drugs (AEDs), and seizure frequency (number per month).
2. HCV model. We included both the left and right HCV.
3. Atlas-based tractography (tract model). The following right and left hemisphere temporal lobe fiber tracts were selected due to evidence of their disruption in TLE,30 their likely involvement in memory processing,6,9,10 and the fact that they include the major hippocampal efferent and afferent fiber pathways: fornix,31 parahippocampal cingulum (PHC),32 and uncinate fasciculus,33 as well as inferior longitudinal fasciculus (ILF) and inferior frontal occipital fasciculus (figure 1A).
4. SC model. For the SC, we selected cortical-cortical temporal subnetwork connections and did not include connections between cortical regions and deep gray matter structures (e.g., hippocampus, amygdala). This approach was selected to reduce redundancy with the association tracts included in the tract-based model. We applied PCA to reduce the high-dimensional temporal subnetwork connectome data to 10 PCs.
5 and 6. Tract + HCV and SC + HCV. To test the added value of white matter integrity to the HCV model, we combined left and right HCV with features from the tract and SC models.
Neuroanatomic interpretation of PCA
Finally, we sought to connect the important PCs back to overall white matter architecture to determine which connections had preferentially greater influence on model performance. The importance of each PC to model performance for discriminating impaired from nonimpaired patients was calculated with sklearn.34 Importance values range from 0 (no importance) to 1 and sum to 1 across all features within each model. To tie importance to specific connections, we combined PC importance with the weight of each connection in the PCs; mathematically, PCs are computed as linear combinations of all the original features. To combine importance and weight, PC importance values were first demeaned (relatively more important PCs were >0, relatively less important PCs were <0). Then the weights of each connection for each PC were multiplied by the respective scaled (i.e., demeaned) feature importance values, and a sum was derived for each connection. Connections that contributed highly to only important PCs would have a value >0, while connections that contributed equally to all PCs or highly to only unimportant PCs would be ≤0.
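The connection-level score described in this paragraph can be sketched in a few lines. All numbers are hypothetical. The sketch assumes that the connection weights are taken as absolute PC loadings; the text does not state whether signed or absolute weights were used.

```python
def connection_scores(pc_importance, pc_weights):
    """Combine PC importance with per-connection PC weights.

    pc_importance: one importance value per PC (sums to 1).
    pc_weights[i][j]: weight of connection j in PC i
    (assumption: absolute loadings).
    """
    mean_importance = sum(pc_importance) / len(pc_importance)
    # Demean: relatively important PCs > 0, relatively unimportant PCs < 0.
    demeaned = [imp - mean_importance for imp in pc_importance]
    n_connections = len(pc_weights[0])
    # For each connection, sum (demeaned importance x weight) over all PCs.
    return [
        sum(demeaned[i] * pc_weights[i][j] for i in range(len(demeaned)))
        for j in range(n_connections)
    ]


# Hypothetical example with 3 PCs and 3 connections.
importance = [0.5, 0.3, 0.2]
weights = [
    [0.9, 0.5, 0.1],  # PC 1 (most important)
    [0.3, 0.5, 0.2],  # PC 2
    [0.1, 0.5, 0.9],  # PC 3 (least important)
]

scores = connection_scores(importance, weights)
print([round(s, 4) for s in scores])
```

In this toy example the first connection loads mainly on the most important PC and scores above 0, the second loads equally on all PCs and scores approximately 0, and the third loads mainly on the least important PC and scores below 0, matching the interpretation given in the text.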
Statistical Analysis
Independent t tests and Fisher exact tests were used to test for differences in demographic and clinical variables between memory-impaired and nonimpaired patients. The performance of each model was evaluated with receiver operating characteristic (ROC) curves, area under the ROC curve (AUC), accuracy, positive predictive value, negative predictive value, sensitivity, and specificity. The thresholds for the predictions were chosen on the basis of the point on the ROC curve that yielded maximum accuracy. Significant differences in model performance were assessed by creating 95% confidence intervals with 1,000 bootstrapped samples. A model was considered superior in a metric if its performance was above the 95% confidence interval of the model to which it was being compared.
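The evaluation steps above (threshold chosen for maximum accuracy, standard classification metrics, and bootstrap confidence intervals) can be sketched as follows. The labels and scores are hypothetical, the percentile method for the interval is an assumption, and the function names are illustrative rather than the authors' implementation.

```python
import random


def _ratio(num, den):
    """Safe division; returns NaN when the denominator is zero."""
    return num / den if den else float("nan")


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, PPV, and NPV from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


def best_threshold(y_true, scores):
    """Pick the score cutoff that maximizes accuracy (predict 1 if score >= cutoff)."""
    return max(
        sorted(set(scores)),
        key=lambda thr: accuracy(y_true, [int(s >= thr) for s in scores]),
    )


def bootstrap_accuracy_ci(y_true, y_pred, n_boot=1000, seed=0):
    """Percentile 95% CI for accuracy from resampling patients with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    accs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        accs.append(accuracy([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    accs.sort()
    return accs[int(0.025 * n_boot)], accs[int(0.975 * n_boot) - 1]


def is_superior(value, other_ci):
    """Rule from the text: superior if above the other model's 95% CI."""
    return value > other_ci[1]


# Hypothetical test-set labels (1 = impaired) and model scores.
y_true = [1, 1, 1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.65, 0.2, 0.1]

thr = best_threshold(y_true, scores)
y_pred = [int(s >= thr) for s in scores]
m = metrics(y_true, y_pred)
lo, hi = bootstrap_accuracy_ci(y_true, y_pred)
print(thr, m, (lo, hi))
```

Because the threshold is selected on the same data used to report accuracy, the resulting accuracy is the best achievable along the ROC curve for that dataset, which is worth keeping in mind when comparing it with the AUC.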
Data availability
The authors have full access to all study data and participant consent forms and take full responsibility for the data, the conduct of the research, the analysis and interpretation of the data, and the right to publish all data.
Results
Patient demographics and clinical variables
There were differences in education [t (79) = −2.06, p = 0.042] and age [t (79) = −3.134, p = 0.002] between the training set and the testing set. UCSD patients were older and achieved a higher level of education. There were no differences in sex distribution (Fisher exact = 0.177, p = 0.821) between the 2 groups (table 1). In the training set, there were differences in sex distribution between the memory-impaired and unimpaired groups (Fisher exact = 4.481, p = 0.045), with the memory-impaired group having a greater proportion of women. However, when the overall impaired and unimpaired groups were compared, there were no significant differences in age (p = 0.19) or sex (p = 0.12). In the testing set, there were no differences in any of the clinical or demographic variables between the memory impaired and unimpaired groups (all p > 0.05).
Model performance
The full results of model performance with each set of features (clinical, HCV, tract, SC, tract + HCV, and SC + HCV) are shown in table 2. We consider the clinical model the baseline model to which all other models were compared.
Table 2.
XGBoost model performance with each set of features
The clinical model yielded the poorest accuracy at 61%. The HCV model performed at 66% accuracy, which outperformed the clinical model (p < 0.001). The main gain in performance was realized when tracts or SCs were included in the model. The tract model and SC model performed significantly better than the HCV model (p < 0.001) and clinical model (p < 0.001). However, the SC model (76% accuracy) and the tract model (73% accuracy) did not significantly differ from one another in accuracy. The 2 white matter models (tract model and SC model) showed similar sensitivity (0.81) to one another. However, the SC model showed a nonsignificant trend toward higher specificity compared to the tract model (0.67 vs 0.58). While the HCV model yielded similar sensitivity to the SC model, randomization testing revealed the SC model to have higher specificity than the HCV model and clinical model (both p < 0.001).
ROC curves were constructed to analyze the relationship between the sensitivity and specificity of each model (figure 2). As figure 2 shows, the ROC curve for the clinical model dips under the diagonal dashed line, indicating that it performs no better than chance (AUC = 0.43). In contrast, the ROC curves for the SC model and tract model, which have the highest AUCs, indicate that these models perform best, with the SC model (AUC = 0.71) similar to the tract model (AUC = 0.68).
Figure 2. ROC curves for each model.

(A) Receiver operating characteristic (ROC) curves for models using clinical, hippocampal volume (HCV), tract, and structural connectome (SC) features. (B) ROC curves for models using tract + HCV and SC + HCV features. Gray lines indicate performance of a model performing at chance.
Multimodal feature performance
To test the added value of our 2 white matter models to HCV, we paired the tract model with HCV (tract + HCV model) and the SC model with HCV (SC + HCV model). The SC + HCV model numerically outperformed the tract + HCV model, with 81% accuracy and 0.82 positive predictive value. Both the tract + HCV model and the SC + HCV model yielded 0.80 negative predictive value; however, the tract + HCV model had the highest sensitivity of 0.95. These models performed significantly better than the HCV model alone in terms of accuracy (p < 0.001), suggesting that white matter integrity adds to the classification of memory impairment above and beyond information provided by HCV.
Important features from each model
To understand the variables from each model contributing the most to memory classification performance, feature importance values were extracted from each model. We used the average and variance feature importance values from the bootstrap analysis. The 5 most important features from each model are plotted in figure 3. The most important features for classifying memory impairment in the clinical model included age, age at onset, seizure frequency, education, and number of AEDs. Neither MTS status nor side of seizure onset emerged as 1 of the 5 most important features. Not surprisingly, left HCV emerged as more important than right HCV for verbal memory impairment prediction in the HCV model.
Figure 3. Feature importance plots of top 5 most important features in each model.
Mean and SD of the feature importances extracted from XGBoost are plotted for each model. Mean and SD of the feature importances were derived from the 1,000-sample bootstrap analysis. AED = antiepileptic drug; AO = age at onset; Fx = fornix; HCV = hippocampal volume; ILF = inferior longitudinal fasciculus; PC = principal component; PHC = parahippocampal cingulum; SC = structural connectome.
In the tract model, the white matter tracts contributing the most to memory impairment prediction were the bilateral fornix, bilateral PHC, and left ILF. In the tract + HCV model, again the bilateral fornix, left ILF, bilateral HCV, and right PHC were most important. It is of further interest that left HCV emerged only as the third most important feature in the tract + HCV model and that otherwise the important tracts were the same as in the tract model.
Figure 4 displays the top 15 connections by weight in PCs that were most important (PC-6), moderately important (PC-2), and least important (PC-9) for classifying memory impaired vs unimpaired patients in the SC model. The connections with the most weight in PC-6 appeared to be localized mainly within the anterior temporal lobe, with connectivity to extratemporal regions (figure 4A). Conversely, the highest-weighted connections of PC-9 were focused in posterior temporal to extratemporal lobe connections (figure 4C). In an examination of connections that consistently contributed to the most important PCs, several left anterior temporal connections were highlighted, implicating regions like the entorhinal cortex, inferior temporal gyrus, temporal pole, and fusiform (figure 5).
Figure 4. Connections with highest weights towards PCs.
Illustration of the general layout of high-weight connections in principal components (PCs) with differing importances to model performance. Top 15 connections by weight in each PC are displayed, with greater weights indicated by thicker bars. (A) Connections with highest weight toward an important PC (PC-6). (B) Connections with highest weight toward a moderately important PC (PC-2). (C) Connections with highest weight toward an unimportant PC (PC-9). The spheres in the figure represent larger regions of interest as illustrated in figure 1.
Figure 5. Top 15 most important connections for memory impairment classification.
(A) Glass brain visualization of top 15 connections. (B) Table of top 15 most important connections ordered by most important (top) to least important (bottom). These connections contributed preferentially to the most important PCs. The spheres in the figure represent larger regions of interest as illustrated in figure 1. ITG = inferior temporal gyrus; lh = left hemisphere; MTG = middle temporal gyrus; rh = right hemisphere.
Discussion
In this study, we evaluated the ability of comprehensive white matter neuronal network mapping (the SC) to predict verbal memory impairment in patients with drug-resistant TLE. We compared the predictive ability of the cortical-cortical SC to MTL tract-based analysis, HCV, clinical variables, and multimodal models that included each white matter approach combined with HCV. We observed that white matter integrity was more critical to verbal memory performance than HCV or clinical variables, with both white matter models achieving the highest classification accuracy. Although the SC model performed on par with the tract-based model for most performance metrics, the SC model provided unique information, highlighting the importance of short-range temporal-temporal connections that are not included in our large MTL fiber bundles. A multimodal model incorporating both white matter microstructure and HCV appears to be most informative, providing the highest sensitivity and specificity for determining memory impairment in TLE (table 2).
We found both white matter–based models to outperform the HCV and clinical models in terms of classification accuracy (table 2). These results support previous findings from our laboratory and others demonstrating that white matter integrity predicts verbal memory scores better than HCV and other morphometric measures.19,35 Furthermore, we found the clinical model to perform the worst among all the models tested. Epilepsy-related variables, including age at onset, seizure frequency, and number of AEDs, were among the most important variables contributing to the clinical model performance (figure 3A), but the clinical model failed to perform better than chance. It was quite surprising to see that MTS status was not an important predictor in the clinical model, given the existing literature indicating that left MTS is a strong predictor of verbal memory performance both preoperatively and postoperatively.1,2,36–38 Furthermore, the side of seizure focus was not among the most important predictors of verbal memory performance in the clinical model. Although this would appear to contradict many studies suggesting that patients with left TLE have poorer verbal memory compared to those with right TLE,1,2,39,40 emerging literature on cognitive phenotypes reveals that patients with chronic TLE have bilateral MTL microstructural loss, which can lead to significant verbal memory impairments in patients with both left and right TLE.7 Our results align with previous findings demonstrating that epilepsy-related variables explain little variance in verbal memory performance19 or exhibit only weak associations.41 Patients with similar clinical and seizure characteristics may have vastly different underlying white matter microstructure, resulting in heterogeneous memory performance that cannot be accurately described with clinical variables alone.7,42 Thus, evaluating white matter integrity directly through tracts or SCs may allow the best prediction of memory performance.
The SC model appeared to be numerically superior to the traditional tract-based analysis in terms of model specificity (table 2); however, we did not find significant improvement in classification accuracy for the SC over tract-based features. Rather, using a simple feature selection approach with PCA, we were able to unveil interesting patterns of network abnormalities present in memory-impaired patients that provide greater insight into local cortical-cortical connections contributing preferentially to memory performance in TLE. Among these connections were those involving regions often implicated in episodic memory and the larger memory network, including connections from the left entorhinal to left precuneus and cingulate43 and connections involving the left inferior temporal gyrus44 and left and right parahippocampal regions45 (figure 5B). Some of these connections would overlap with those included in our MTL fiber tracts, including the left ILF and left PHC.46 This overlap may have contributed to the similar prediction accuracies of the 2 methods. However, the SC also detected connections that would not be included in the MTL tracts such as those from the left temporal pole to left anterior fusiform and left inferior temporal gyrus, as well as ipsilateral connections from the bilateral parahippocampal regions to the fusiform (figure 5B). Thus, the cortical-cortical SC may provide a better understanding of how abnormal short-range MTL white matter connections affect verbal memory performance in TLE.
It should be emphasized that MTL tracts do not provide confirmatory information about which gray matter regions are connected. Because diffusion-based white matter fibers are the biophysical representations of multiple axonal bundles, the reconstructed fibers may represent several axonal projections that travel in parallel but connect different cortical or subcortical regions. For this reason, SC-based analyses provide important additional insight into white matter integrity by demonstrating which pair-wise cortical connections are related to behavior. This model-free approach is independent from predefined atlas-based white matter tracts. We believe this study is the first step toward achieving personalized memory connectomes that could provide important information for predicting risk for postoperative memory decline after epilepsy surgery.
This study uses the SC for predicting verbal memory impairment in patients with TLE. We compare the performance of the SC to performance based on more established clinical and imaging measures, a critical task seldom accomplished in previous studies. Another major strength is the use of an independent external dataset to validate our results. In our case, we trained the XGBoost model on patients from the UCSD Epilepsy Center and tested the model on patients from the UCSF Epilepsy Center. This paradigm reduces the risk of overfitting the model to the data by testing the model on patients whom the model has never seen before.47 We took additional steps throughout the study methodology to ensure that XGBoost had no information about the testing dataset during the training phase. Finally, although the overall performance of the SC and tract-based models was comparable, we demonstrate that connectome-based modeling has the potential to provide more granular information about abnormal cortico-cortical connectivity underlying verbal memory impairment than long-range tract-based methods in TLE. We find the convergence of the 2 white matter methods encouraging because it demonstrates the importance of deep MTL and cortical-cortical white matter networks for predicting memory performance and suggests that their importance is robust to the tractography method used (i.e., connectome-based vs tract-based method).
Despite these strengths, our study has several limitations that should be noted. First, to derive our memory-impaired and unimpaired TLE groups, we collapsed across 3 somewhat different measures of verbal memory. The 3 memory indices we used in this study involved delayed recall of prose, verbal associative recall, and word list recall. It is well known that prose recall and associative memory recruit lateral and MTL structures, respectively,1 while list learning may rely more heavily on the hippocampus.48 By pooling results across tests, the model may have lost some specificity as to which temporal lobe structures and connections are associated with the different types of verbal memory. However, by requiring patients to be impaired on at least 2 of the 3 measures, a method used in previous studies,7,49 we felt more confident about our classification of memory impairment. Future analysis may also apply a regression-based model to determine whether this approach can predict the actual magnitude of impairment across tests or within each test. Second, our sample size was relatively small in the context of performing machine learning. In fact, although a similar pattern of results emerged, prediction accuracy dropped by 10% to 20% across models when we trained the models on the smaller dataset (UCSF) and tested them on the larger one (UCSD). In this analysis, the SC model maintained the highest performance (65% accuracy). It is challenging to achieve strong and stable machine learning performance with sample sizes common in neuroimaging studies. Machine learning algorithms need many examples in the training data to be able to detect patterns and to effectively perform predictions on unseen data. It is possible that with a larger sample size, our connectome-based model would have outperformed the tract-based model across most performance measures because inspection of table 2 would suggest a pattern in this direction.
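The 2-of-3 grouping rule mentioned above can be written as a short function. This is a sketch under stated assumptions: the function name, the use of z scores, and the cutoff of -1.5 are hypothetical placeholders, not values taken from this section.

```python
def is_memory_impaired(z_scores, cutoff=-1.5, min_impaired=2):
    """Return True if at least `min_impaired` scores fall at or below `cutoff`.

    z_scores: standardized scores for the 3 verbal memory measures
              (prose recall, associative recall, word list recall).
    cutoff:   hypothetical impairment threshold; an assumption for illustration.
    """
    impaired_count = sum(1 for z in z_scores if z <= cutoff)
    return impaired_count >= min_impaired


# Impaired on 2 of the 3 measures: classified as memory impaired.
print(is_memory_impaired([-2.0, -1.6, 0.3]))
# Impaired on only 1 measure: classified as unimpaired.
print(is_memory_impaired([-2.0, 0.0, 0.3]))
```

Requiring agreement across measures trades some test-specific sensitivity for a grouping that is less dependent on any single score.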
Future research with larger sample sizes of patients with presurgical and postsurgical neuropsychological data is required to examine whether connectome-based or tract-based models provide better estimates of memory impairment or risk for postoperative memory decline. Further studies are also needed to determine the best dimensionality reduction approach to SC data. While PCA was used in this study, graph theory measures have proved effective in predicting seizure outcome14 and in characterizing cognitive phenotypes in TLE.7 It would be interesting to test how such an approach might perform in discriminating between memory-impaired and unimpaired patients in a machine learning context. Finally, although we tested 4 models with relatively unique information, it would be worthwhile to determine the best overall combinations of measures. For example, our SC model focused on cortical-cortical connections; however, perhaps a combination of cortical-cortical and subcortical connectivity would yield better performance. As network methodologies increase in popularity, efforts are needed to easily extract and interpret meaningful information.
Our data demonstrate the importance of white matter networks for predicting verbal memory impairment in TLE. Both connectome-based and tract-based models showed higher accuracy than important epilepsy-related clinical characteristics or HCV for classifying patients with TLE as memory impaired or not impaired. Although the connectome model and tract-based model performed similarly, the SC model provided more granular information regarding the importance of specific MTL connections to memory and highlights important local connections that are not captured in large white matter association tracts. These data could provide critical information for estimating risk for memory decline after different surgical interventions.
Glossary
- AED: antiepileptic drug
- AUC: area under the ROC curve
- DK: Desikan-Killiany
- FDT: FMRIB Diffusion Toolbox
- HCV: hippocampal volume
- ILF: inferior longitudinal fasciculus
- MTL: medial temporal lobe
- MTS: mesial temporal sclerosis
- PC: principal component
- PCA: PC analysis
- PHC: parahippocampal cingulum
- ROC: receiver operating characteristic
- ROI: region of interest
- SC: structural connectome
- TLE: temporal lobe epilepsy
- UCSD: University of California, San Diego
- UCSF: University of California, San Francisco
Appendix. Authors


Study funding
Supported by NIH/National Institute of Neurological Disorders and Stroke R01 NS065838 (C.R.M.), R21 NS107739 (C.R.M., L.B.), and 1F31NS111883-01 (A.R.).
Disclosure
All authors report no disclosures relevant to the manuscript. Go to Neurology.org/N for full disclosures.
References
- 1. Saling MM. Verbal memory in mesial temporal lobe epilepsy: beyond material specificity. Brain 2009;132:570–582.
- 2. Bell B, Lin JJ, Seidenberg M, Hermann B. The neurobiology of cognitive disorders in temporal lobe epilepsy. Nat Rev Neurol 2011;7:154–164.
- 3. Herman B, Seidenberg M. Behavioral Aspects of Epilepsy: Principles and Practice. Holmes GL, Schachter SC, Trenite DGKN, editors. New York: Demos Medical Publishing; 2007.
- 4. Sherman EM, Wiebe S, Fay-McClymont TB, et al. Neuropsychological outcomes after epilepsy surgery: systematic review and pooled estimates. Epilepsia 2011;52:857–869.
- 5. Gargaro AC, Sakamoto AC, Bianchin MM, et al. Atypical neuropsychological profiles and cognitive outcome in mesial temporal lobe epilepsy. Epilepsy Behav 2013;27:461–469.
- 6. McDonald CR, Ahmadi ME, Hagler DJ, et al. Diffusion tensor imaging correlates of memory and language impairments in temporal lobe epilepsy. Neurology 2008;71:1869–1876.
- 7. Reyes A, Kaestner E, Bahrami N, et al. Cognitive phenotypes in temporal lobe epilepsy are associated with distinct patterns of white matter network abnormalities. Neurology 2019;92:e1957–e1968.
- 8. Hermann B, Seidenberg M, Bell B, et al. Extratemporal quantitative MR volumetrics and neuropsychological status in temporal lobe epilepsy. J Int Neuropsychol Soc 2003;9:353–362.
- 9. Riley JD, Franklin DL, Choi V, et al. Altered white matter integrity in temporal lobe epilepsy: association with cognitive and clinical profiles. Epilepsia 2010;51:536–545.
- 10. Diehl B, Busch RM, Duncan JS, Piao Z, Tkach J, Lüders HO. Abnormalities in diffusion tensor imaging of the uncinate fasciculus relate to reduced memory in temporal lobe epilepsy. Epilepsia 2008;49:1409–1418.
- 11. Sporns O, Tononi G, Kötter R. The human connectome: a structural description of the human brain. PLoS Comput Biol 2005;1:e42.
- 12. Kim J, Parker D, Whyte J, et al. Disrupted structural connectome is associated with both psychometric and real-world neuropsychological impairment in diffuse traumatic brain injury. J Int Neuropsychol Soc 2014;20:887–896.
- 13. Bonilha L, Jensen JH, Baker N, et al. The brain connectome as a personalized biomarker of seizure outcomes after temporal lobectomy. Neurology 2015;84:1846–1853.
- 14. Taylor PN, Sinha N, Wang Y, et al. The impact of epilepsy surgery on the structural connectome and its relation to outcome. Neuroimage Clin 2018;18:202–214.
- 15. Gleichgerrcht E, Munsell B, Bhatia S, et al. Deep learning applied to whole-brain connectome to determine seizure control after epilepsy surgery. Epilepsia 2018;59:1643–1654.
- 16. Delis DC, Kramer JH, Kaplan E, Ober BA. California Verbal Learning Test, Adult Version (CVLT-II). San Antonio: Psychological Corp; 2000.
- 17. Wechsler D. WMS-III: Wechsler Memory Scale Administration and Scoring Manual. San Antonio: Psychological Corp; 1997.
- 18. Holland D, Kuperman JM, Dale AM. Efficient correction of inhomogeneous static magnetic field-induced distortion in echo planar imaging. Neuroimage 2010;50:175–183.
- 19. McDonald CR, Leyden KM, Hagler DJ, et al. White matter microstructure complements morphometry for predicting verbal memory in epilepsy. Cortex 2014;58:139–150.
- 20. Hagler DJ, Ahmadi ME, Kuperman J, et al. Automated white-matter tractography using a probabilistic diffusion tensor atlas: application to temporal lobe epilepsy. Hum Brain Mapp 2009;30:1535–1547.
- 21. Behrens TE, Woolrich MW, Jenkinson M, et al. Characterization and propagation of uncertainty in diffusion-weighted MR imaging. Magn Reson Med 2003;50:1077–1088.
- 22. Behrens TE, Berg HJ, Jbabdi S, Rushworth MF, Woolrich MW. Probabilistic diffusion tractography with multiple fibre orientations: what can we gain? Neuroimage 2007;34:144–155.
- 23. Hagmann P, Kurant M, Gigandet X, et al. Mapping human whole-brain structural networks with diffusion MRI. PLoS One 2007;2:e597.
- 24. Dale AM, Fischl B, Sereno MI. Cortical surface-based analysis, I: segmentation and surface reconstruction. Neuroimage 1999;9:195–207.
- 25. Desikan RS, Ségonne F, Fischl B, et al. An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. Neuroimage 2006;31:968–980.
- 26. Besson P, Dinkelacker V, Valabregue R, et al. Structural connectivity differences in left and right temporal lobe epilepsy. Neuroimage 2014;100:135–144.
- 27. Voets NL, Zamboni G, Stokes MG, Carpenter K, Stacey R, Adcock JE. Aberrant functional connectivity in dissociable hippocampal networks is associated with deficits in memory. J Neurosci 2014;34:4920–4928.
- 28. Bonilha L, Alessio A, Rorden C, et al. Extrahippocampal gray matter atrophy and memory impairment in patients with medial temporal lobe epilepsy. Hum Brain Mapp 2007;28:1376–1390.
- 29. Chen T, Guestrin C. XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016:785–794.
- 30. Leyden KM, Kucukboyaci NE, Puckett OK, Lee D, Loi RQ. What does diffusion tensor imaging (DTI) tell us about cognitive networks in temporal lobe epilepsy? Quant Imaging Med Surg 2015;5:17.
- 31. Poletti CE, Creswell G. Fornix system efferent projections in the squirrel monkey: an experimental degeneration study. J Comp Neurol 1977;175:101–127.
- 32. Mufson EJ, Pandya DN. Some observations on the course and composition of the cingulum bundle in the rhesus monkey. J Comp Neurol 1984;225:31–43.
- 33. Schmahmann J, Pandya D. Fiber Pathways of the Brain. New York: Oxford University Press USA; 2009.
- 34. Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: machine learning in Python. J Mach Learn Res 2011;12:2825–2830.
- 35. Yogarajah M, Powell HR, Parker GJ, et al. Tractography of the parahippocampal gyrus and material specific memory impairment in unilateral temporal lobe epilepsy. Neuroimage 2008;40:1755–1764.
- 36. Helmstaedter C, Grunwald T, Lehnertz K, Gleissner U, Elger CE. Differential involvement of left temporolateral and temporomesial structures in verbal declarative learning and memory: evidence from temporal lobe epilepsy. Brain Cogn 1997;35:110–131.
- 37. Hermann BP, Wyler AR, Somes G, Berry AD III, Dohan FC Jr. Pathological status of the mesial temporal lobe predicts memory outcome from left anterior temporal lobectomy. Neurosurgery 1992;31:652–657.
- 38. Alessio A, Damasceno BP, Camargo CHP, Kobayashi E, Guerreiro CAM, Cendes F. Differences in memory performance and other clinical characteristics in patients with mesial temporal lobe epilepsy with and without hippocampal atrophy. Epilepsy Behav 2004;5:22–27.
- 39. Hermann BP, Seidenberg M, Schoenfeld J, Davies K. Neuropsychological characteristics of the syndrome of mesial temporal lobe epilepsy. Arch Neurol 1997;54:369–376.
- 40. Saling MM, Berkovic SF, O'Shea MF, Kalnins RM, Darby DG, Bladin PF. Lateralization of verbal memory and unilateral hippocampal sclerosis: evidence of task-specific effects. J Clin Exp Neuropsychol 1993;15:608–618.
- 41. Hermann BP, Seidenberg M, Dow C, et al. Cognitive prognosis in chronic temporal lobe epilepsy. Ann Neurol 2006;60:80–87.
- 42. Rodríguez-Cruces R, Velázquez-Pérez L, Rodríguez-Leyva I, et al. Association of white matter diffusion characteristics and cognitive deficits in temporal lobe epilepsy. Epilepsy Behav 2018;79:138–145.
- 43. Shallice T, Fletcher P, Frith CD, Grasby P, Frackowiak RSJ, Dolan RJ. Brain regions associated with acquisition and retrieval of verbal episodic memory. Nature 1994;368:633.
- 44. Chen J, Duan X, Shu H, et al. Differential contributions of subregions of medial temporal lobe to memory system in amnestic mild cognitive impairment: insights from fMRI study. Sci Rep 2016;6:26148.
- 45. Wagner AD, Schacter DL, Rotte M, et al. Building memories: remembering and forgetting of verbal experiences as predicted by brain activity. Science 1998;281:1188–1191.
- 46. Wakana S, Caprihan A, Panzenboeck MM, et al. Reproducibility of quantitative tractography methods applied to cerebral white matter. Neuroimage 2007;36:630–644.
- 47. Pereira F, Mitchell T, Botvinick M. Machine learning classifiers and fMRI: a tutorial overview. Neuroimage 2009;45:S199–S209.
- 48. Hirni DI, Kivisaari SL, Monsch AU, Taylor KI. Distinct neuroanatomical bases of episodic and semantic memory performance in Alzheimer's disease. Neuropsychologia 2013;51:930–937.
- 49. Edmonds EC, Eppig J, Bondi MW, et al. Heterogeneous cortical atrophy patterns in MCI not captured by conventional diagnostic criteria. Neurology 2016;87:2108–2116.
Data Availability Statement
The authors have full access to all study data and participant consent forms and take full responsibility for the data, the conduct of the research, the analysis and interpretation of the data, and the right to publish all data.






