Neurology. 2021 Jul 20;97(3):e263–e274. doi: 10.1212/WNL.0000000000012221

Nomograms to Predict Verbal Memory Decline After Temporal Lobe Resection in Adults With Epilepsy

Robyn M Busch 1, Olivia Hogue 1, Margaret Miller 1, Lisa Ferguson 1, Mary Pat McAndrews 1, Marla Hamberger 1, Michelle Kim 1, Carrie R McDonald 1, Anny Reyes 1, Daniel L Drane 1, Bruce P Hermann 1, William Bingaman 1, Imad M Najm 1, Michael W Kattan 1, Lara Jehi 1
PMCID: PMC8302146  PMID: 34011574

Abstract

Objective

To develop and externally validate models to predict the probability of postoperative verbal memory decline in adults after temporal lobe resection (TLR) for epilepsy using easily accessible preoperative clinical predictors.

Methods

Multivariable models were developed to predict delayed verbal memory outcome on 3 commonly used measures: Rey Auditory Verbal Learning Test (RAVLT) and Logical Memory (LM) and Verbal Paired Associates (VPA) subtests from Wechsler Memory Scale–Third Edition. With the use of the Harrell step-down procedure for variable selection, models were developed in 359 adults who underwent TLR at the Cleveland Clinic and validated in 290 adults at 1 of 5 epilepsy surgery centers in the United States or Canada.

Results

Twenty-nine percent of the development cohort and 26% of the validation cohort demonstrated significant decline on at least 1 verbal memory measure. Initial models had good to excellent predictive accuracy (concordance [c] statistic range 0.77–0.80) in identifying patients with memory decline; however, models slightly underestimated decline in the validation cohort. Model coefficients were updated with data from both cohorts to improve stability. The model for RAVLT included surgery side, baseline memory score, and hippocampal resection. The models for LM and VPA included surgery side, baseline score, and education. Updated model performance was good to excellent (RAVLT c = 0.81, LM c = 0.76, VPA c = 0.78). Model calibration was very good, indicating no systematic overestimation or underestimation of risk.

Conclusions

Nomograms are provided in 2 easy-to-use formats to assist clinicians in estimating the probability of verbal memory decline in adults considering TLR for treatment of epilepsy.

Classification of Evidence

This study provides Class II evidence that multivariable prediction models accurately predict verbal memory decline after TLR for epilepsy in adults.


Temporal lobe resection (TLR) is an effective treatment option for pharmacoresistant temporal lobe epilepsy (TLE) but can result in episodic memory declines in a large subset of patients.1-4 Several risk factors for verbal memory decline have been identified, including dominant-sided surgery, strong preoperative memory, older age at seizure onset, absence of hippocampal sclerosis, and older age at surgery.5 Given the elective nature of epilepsy surgery and the high prevalence of postoperative memory decline, it is imperative to develop risk models to predict memory outcome and to inform patient counseling.

While a number of multivariable models have been developed to predict verbal memory decline after TLR,5 many are not suitable for widespread clinical use because of the inclusion of site-specific predictors (e.g., Wada or fMRI protocols), use of memory measures not commonly used, and lack of external validation. Hence, there is a need for externally validated models using easily accessible preoperative variables to predict memory outcome for individual patients.

Nomograms are statistical tools that allow simultaneous consideration of numerous, possibly contradictory, clinical factors to assess risk for a clinical outcome in a particular patient given his/her unique demographic and disease characteristics. These models are widely used for clinical decision-making in many areas of medicine but have only recently been applied to epilepsy surgery outcomes.6,7

The objective of this study was to develop and externally validate models to predict the probability of postoperative decline in verbal delayed recall in adults after TLR for epilepsy.

Methods

Standard Protocol Approvals, Registrations, and Patient Consents

This was a retrospective prediction model development and validation study. All data for the study were obtained from existing Institutional Review Board–approved studies or data registries at Cleveland Clinic, Columbia University, University of California, San Diego, University of Washington School of Medicine, Emory University School of Medicine, and University of Toronto. The need for informed consent was waived because of the use of deidentified data.

Participants

For model development, cases were selected from an Institutional Review Board–approved neuropsychology registry for older adolescents and adults who underwent epilepsy surgery at Cleveland Clinic between 1988 and 2018. Individuals were included if they met the following criteria: (1) were age ≥16 years, (2) underwent any type of TLR (e.g., standard anterior temporal lobectomy, amygdalohippocampectomy, lesionectomy), (3) completed preoperative and postoperative neuropsychological evaluations that included at least 1 of the 3 verbal memory measures of interest: Rey Auditory Verbal Learning Test (RAVLT) or Logical Memory (LM) or Verbal Paired Associates (VPA) subtests from the Wechsler Memory Scale–Third Edition (WMS-III), (4) had no history of neurosurgery, and (5) had complete data for all predictors of interest. Patients who underwent laser ablations rather than resective surgery were excluded. A total of 359 patients from Cleveland Clinic met all inclusion criteria for the study and constituted the development cohort.

For model validation, data were obtained from 5 independent epilepsy surgery centers in different regions of the United States and Canada: Columbia University Medical Center, New York, NY (n = 56); University of California, San Diego, CA (n = 31); University of Washington, Seattle, WA (n = 44); Emory University School of Medicine, Atlanta, GA (n = 19); and University Health Network, Toronto, Ontario, Canada (n = 140). The same inclusion/exclusion criteria were applied to the validation cohort, although patients needed complete data only for predictors included in the models built from the development sample. Variables used for the validation cohort were coded with the same criteria used for the development cohort. Data for the validation cohort were obtained from clinical data registries or from existing research studies. Therefore, this cohort represents a subset of all patients who underwent epilepsy surgery during this time period.

Memory Outcomes

Verbal memory outcome was assessed with 3 different episodic memory measures. First, the RAVLT is a word-list learning task recommended as part of the NIH Epilepsy Neuropsychology Common Data Elements to assess verbal memory in patients with epilepsy. This measure includes a series of 15 words presented over the course of 5 trials. After each trial, the examinee is asked to provide as many of the 15 words as they can freely recall. An interference trial is then presented (i.e., a second, unrelated word list). After the examinee provides all items that they can recall from the interference list, they are asked to produce words that they can recall from the initial list that was presented 5 times. One additional recall trial for the initial list is conducted after a 30-minute delay. Change in performance on the delayed recall trial (decline/no decline) was used as the dependent measure for the RAVLT model (RAVLT Delay). Second, the LM subtest of the WMS-III requires examinees to listen to 2 short stories and to recall as many details as possible from the stories immediately and after a 25- to 35-minute delay. Change in performance on the delayed recall trial (decline/no decline) was used as the dependent measure for the LM model (LM Delay). Third, the VPA subtest of the WMS-III involves presentation of a series of 8 word pairs (e.g., lion-orange) over the course of 4 trials. After each trial, the first item in each word pair is presented, and the examinee must provide the word that was paired with that stimulus. Feedback is provided on whether the examinee's response is correct or incorrect, and the examinee is told the correct response whenever an incorrect response is provided. After a 25- to 35-minute delay, the examinee is again asked to indicate what word was paired with each stimulus during the initial learning trials. Change in performance on the delayed recall trial (decline/no decline) was used as the dependent measure for the VPA model (VPA Delay).

Patients in this study completed at least 1 of these measures before and ≈6 to 12 months after TLR as part of clinical neuropsychological evaluations. Change scores (postoperative score minus preoperative score) were calculated and classified into 1 of 2 categories (decline or no decline) using published, epilepsy-specific reliable change indices (RCIs) developed on individuals with epilepsy tested twice without any intervening surgery.8,9 RCIs identify the amount of test-retest change necessary to conclude that clinically meaningful change has occurred, independent of measurement error and practice effects. RCIs with an 80% confidence interval were used for this study. Specifically, published RCIs for the RAVLT Delay require a raw score decline of ≥4 points to be considered a clinically significant decline.9 Published RCIs for LM Delay and VPA Delay require a scaled score decline of ≥3 points to be considered a clinically significant decline.8
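The classification rule described above can be sketched in a few lines of Python. This is only an illustration of the thresholds stated in the text (a drop of ≥4 raw points for RAVLT Delay, ≥3 scaled points for LM Delay and VPA Delay); the function name, dictionary, and example scores are hypothetical and are not taken from the study's analysis code.

```python
# Decline thresholds as stated in the text: RAVLT Delay uses raw scores,
# LM Delay and VPA Delay use scaled scores.
DECLINE_THRESHOLDS = {"RAVLT": 4, "LM": 3, "VPA": 3}


def classify_change(measure: str, pre_score: float, post_score: float) -> str:
    """Return 'decline' or 'no decline' for one patient on one measure."""
    change = post_score - pre_score  # postoperative minus preoperative
    threshold = DECLINE_THRESHOLDS[measure]
    # A decline is a drop at least as large as the reliable change threshold.
    return "decline" if change <= -threshold else "no decline"


# Hypothetical example scores.
print(classify_change("RAVLT", pre_score=12, post_score=8))  # drop of 4 raw points
print(classify_change("LM", pre_score=10, post_score=8))     # drop of 2 scaled points
```

Improvement and stability are deliberately collapsed into "no decline" here, mirroring the 2-category outcome used for the models.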

Candidate Predictors

Ten candidate predictors were considered for inclusion in the models: sex, education, age at epilepsy onset, duration of epilepsy, age at surgery, side of surgery (dominant or nondominant), preoperative MRI findings (lesional or nonlesional; presence or absence of hippocampal sclerosis), hippocampal resection (resected or spared), and baseline verbal memory score (for each respective memory measure; raw score for RAVLT Delay and scaled scores for LM Delay and VPA Delay). These variables were selected for their known association with verbal memory outcome after TLR5,10 and the ease with which they could be obtained in the course of routine presurgical assessment in most epilepsy centers. Several interaction terms were also assessed (defined a priori): baseline memory score by side of surgery, duration of epilepsy, hippocampal sclerosis on MRI, nonlesional MRI, and hippocampal sclerosis on MRI by hippocampal resection.

Side of TLR (dominant or nondominant) was coded using side of surgery (left or right) and results of available language lateralization procedures (fMRI and/or Wada testing). In right-handed individuals without a lateralization procedure, left-sided TLRs were coded as dominant and right-sided TLRs as nondominant. Left-handed and ambidextrous individuals without a language lateralization procedure and patients with symmetric bilateral language representation were excluded. To ensure that the assumptions we made about lateralization in patients without a formal language lateralization procedure did not overly influence the final models, model-building methods were repeated using only patients with confirmed left language dominance via fMRI or Wada testing (n = 214).
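The coding rule for surgery side can be written out as a small decision function. The sketch below follows the rules stated in the paragraph above; the function name, argument names, and string labels are assumptions made for illustration.

```python
from typing import Optional


def code_surgery_side(surgery_side: str,
                      handedness: str,
                      language_dominance: Optional[str] = None) -> Optional[str]:
    """Code a resection as 'dominant' or 'nondominant', or None if excluded.

    surgery_side: 'left' or 'right'
    handedness: 'right', 'left', or 'ambidextrous'
    language_dominance: 'left', 'right', or 'bilateral' from fMRI and/or Wada,
        or None when no lateralization procedure is available
    """
    if language_dominance == "bilateral":
        return None  # symmetric bilateral language representation: excluded
    if language_dominance in ("left", "right"):
        return "dominant" if surgery_side == language_dominance else "nondominant"
    # No lateralization procedure: only right-handed patients are coded,
    # with left-hemisphere dominance assumed.
    if handedness == "right":
        return "dominant" if surgery_side == "left" else "nondominant"
    return None  # left-handed or ambidextrous without a procedure: excluded


print(code_surgery_side("left", "right"))          # dominant (assumed left dominance)
print(code_surgery_side("right", "left", "right")) # dominant (confirmed by procedure)
print(code_surgery_side("left", "ambidextrous"))   # None (excluded)
```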

Statistical Analyses

Descriptive statistics were stratified first by test and then by sample (development and validation cohorts) and are presented using mean with SD, median with interquartile range, or count with percent as appropriate according to distribution. For each memory test, baseline demographic and clinical characteristics were compared between the development and validation cohorts using independent-samples t tests for normally distributed continuous variables (with Satterthwaite correction for inequality of variances, if warranted), Wilcoxon rank-sum tests for skewed continuous variables, and Pearson χ2 or Fisher exact tests for categorical variables. Normality was assessed visually with Q-Q plots. Tests comparing groups at baseline were 2 tailed with α = 0.05. Correction for multiple comparisons was not applied because inference from baseline group comparisons was not the goal of the study.

Model-building procedures were the same for each memory outcome. The Harrell step-down procedure was used to select the most parsimonious, best-fitting model. This procedure prioritizes the concordance (c) statistic for discriminatory ability and removes predictor variables sequentially in the order of their influence on the c statistic, from least to most important. The c statistic is equivalent to the area under the receiver operating characteristic curve. It ranges from 0.5 to 1.0. A value of 0.5 indicates that the model performs no better than chance, and a value of 1.0 means the model can correctly classify the memory outcome for 100% of patients. Final models were powered to allow a minimum of 15 events per variable. Continuous predictors were assessed for linearity by plotting against log odds. If warranted, piecewise linear terms or restricted cubic splines were included.
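To make the c statistic concrete, the following sketch computes it directly from its pairwise definition: among all pairs formed from one patient who declined and one who did not, it is the proportion in which the patient who declined received the higher predicted risk, with ties counted as half. The predicted risks and outcomes are hypothetical values chosen for illustration, not study data.

```python
from itertools import product


def c_statistic(predicted_risks, outcomes):
    """Concordance statistic for a binary outcome (1 = decline, 0 = no decline)."""
    events = [p for p, y in zip(predicted_risks, outcomes) if y == 1]
    non_events = [p for p, y in zip(predicted_risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one patient in each outcome group")
    concordant = 0.0
    for e, n in product(events, non_events):
        if e > n:
            concordant += 1.0   # decliner ranked above non-decliner
        elif e == n:
            concordant += 0.5   # tie counts as half
    return concordant / (len(events) * len(non_events))


# Hypothetical predicted risks and observed outcomes for 6 patients.
risks = [0.10, 0.20, 0.35, 0.40, 0.70, 0.80]
declined = [0, 0, 1, 0, 1, 1]
print(c_statistic(risks, declined))  # 8 of the 9 pairs are concordant
```

Computed this way, the value is identical to the area under the receiver operating characteristic curve for the same predictions.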

The models built in the development cohort were applied to the validation cohort. Performance in the external sample was assessed with the c-statistic and calibration curves, which plot predicted against observed outcomes. When warranted, model coefficients were updated with information from the external sample to reduce bias and subjected to internal validation using an internal bootstrapping technique; the dataset was resampled with replacement repeatedly to create 500 bootstrapped datasets. The 500 bootstrapped datasets are meant to simulate samples from the population; each will vary in how similar it is to the original training dataset, and the c statistic will be lower in the bootstrapped data than the training data. Each prediction model is then applied to each bootstrapped dataset; the c statistic is calculated for each; and the average c statistic from the bootstrapped data is taken to represent the optimism-corrected c statistic.11,12 Full final model coefficients are presented. Because the purpose of a prediction model is not to test a hypothesis, p values are not included. Nomograms were created from final logistic regression models, providing a visual representation for each model. Online risk calculators were also developed with the use of established methods.13
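The resampling step can be illustrated with a simplified sketch that follows the description above literally: patients are resampled with replacement, a fixed set of model predictions is evaluated in each resample, and the c statistics are averaged. This is an assumption-laden simplification; a full optimism correction would typically also refit the model within each resample, which is omitted here. All names and data values are hypothetical.

```python
import random


def c_statistic(risks, outcomes):
    """Share of (decline, no-decline) pairs ranked correctly; ties count half."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    score = sum(1.0 if e > n else 0.5 if e == n else 0.0
                for e in events for n in non_events)
    return score / (len(events) * len(non_events))


def bootstrap_mean_c(risks, outcomes, n_boot=500, seed=0):
    """Average c statistic over n_boot resamples drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(risks)
    values = []
    while len(values) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [outcomes[i] for i in idx]
        if sum(ys) in (0, n):
            continue  # c is undefined without both outcome groups; redraw
        values.append(c_statistic([risks[i] for i in idx], ys))
    return sum(values) / n_boot


# Hypothetical predicted risks and observed outcomes (1 = decline).
risks = [0.10, 0.20, 0.35, 0.40, 0.70, 0.80, 0.15, 0.55]
declined = [0, 0, 1, 0, 1, 1, 0, 1]
print(round(bootstrap_mean_c(risks, declined), 3))
```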

While seizure outcome was not included as a predictor in our models given that it would not be known when predictions are being made, we conducted Pearson χ2 tests among the combined sample to examine the relationship between verbal memory and seizure outcomes (completely seizure free vs not seizure free). We also tested the relationship between testing interval and seizure outcome and memory outcomes using Wilcoxon rank-sum tests.

Statistical analyses were conducted on complete cases with SAS Studio version 3.5 (SAS Institute, Cary, NC) and R Studio v.3.3.0 (rms and GivitiR packages; R Foundation for Statistical Computing, Vienna, Austria). Visual nomograms were created with SAS Studio version 3.7.

Data Availability

The datasets analyzed in the current study are not publicly available because of restricted access, but further information about the datasets is available from the corresponding author on reasonable request.

Results

The development sample included 190 patients contributing RAVLT Delay scores, 359 with LM Delay scores, and 354 with VPA Delay scores. The validation sample included 175 with RAVLT Delay scores, 112 with LM Delay scores, and 87 contributing VPA Delay scores. Compared to the development cohort, the validation cohort was younger with a lower proportion of White patients; LM and VPA validation cohorts also had a lower proportion of patients with hippocampal sclerosis. The validation sample had longer follow-up times and lower rates of seizure freedom after surgery than development sample patients. Table 1 provides further information.

Table 1.

Demographic and Baseline Clinical Information Stratified by Delayed Verbal Memory Measure and Sample

Twenty-nine percent (n = 127; 88% dominant) of development cohort patients and 26% (n = 74; 93% dominant) of validation cohort patients experienced clinically relevant postoperative memory decline on at least 1 verbal memory measure. The proportion of patients who declined on each measure is shown in table 1.

The model for RAVLT Delay included the following predictor variables: baseline score, surgery side, and hippocampal resection. Higher baseline RAVLT Delay scores, dominant surgeries, and hippocampal resection all led to higher odds of decline. The models for both LM Delay and VPA Delay included the associated baseline score, surgery side, and education. For both, higher preoperative score and dominant-sided surgeries yielded higher odds of decline, while higher education imparted lower odds of decline.

The models built on the development cohort had similar receiver operating characteristic curves when applied to the validation cohort. However, the models from the development sample somewhat underestimated the probability of decline for each test in the validation sample, as indicated in figure 1, A–C, where the confidence bands lie above the ideal line for much of the lower predicted probabilities.14 Thus, we combined the development and validation samples to update model coefficients and to better reflect both samples. Performance in the updated models was good to excellent: after internal bootstrap validation and correcting for optimism, RAVLT Delay achieved a c statistic of 0.81, LM Delay of 0.76, and VPA Delay of 0.78 (table 2). Model calibration was also very good. The confidence bands presented in figure 2, A–C contain the 45° line, indicating no systematic overestimation or underestimation of risk. The nomograms are presented in figures 3 through 5 and are available as online risk calculators.15
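The idea behind the calibration check can be illustrated with a cruder, binned comparison of predicted and observed risk. This is not the Giviti calibration belt used in the study, only a sketch of the same question: within groups of patients with similar predicted risk, does the observed proportion with decline match the mean prediction? The function name and data are hypothetical.

```python
def binned_calibration(predicted_risks, outcomes, n_bins=4):
    """Return (mean predicted risk, observed decline rate) per risk-ordered bin."""
    pairs = sorted(zip(predicted_risks, outcomes))
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # more bins than patients
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        rows.append((mean_pred, observed))
    return rows


# Hypothetical data in which the model underestimates risk:
# observed rates sit above the mean predictions in both bins.
risks = [0.1, 0.1, 0.2, 0.2, 0.6, 0.6, 0.8, 0.8]
declined = [0, 1, 0, 1, 1, 1, 1, 1]
for mean_pred, observed in binned_calibration(risks, declined, n_bins=2):
    print(f"predicted {mean_pred:.2f}  observed {observed:.2f}")
```

Observed rates above the predictions, as in this toy example, correspond to points above the 45° line on a calibration plot, which is the pattern of underestimation described for the initial models.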

Figure 1. Model Calibration (Development/Validation Samples) for Delayed Memory Measures.

(A) Rey Auditory Verbal Learning Test Delayed Recall, (B) Wechsler Memory Scale–Third Edition (WMS-III) Logical Memory Delayed Recall, and (C) WMS-III Verbal Paired Associates Delayed Free Recall. Calibration curves for models built on the development sample and applied to the validation sample. A well-calibrated model includes a 45° bisector (red line) with Giviti confidence bands (gray-shaded areas) that lie narrowly and evenly on either side. The above calibration bands indicate that the first iteration of the model underestimates risk for all 3 memory outcomes.

Table 2.

Model Coefficients and ORs

Figure 2. Model Calibration (Combined Sample) for Delayed Memory Measures.

(A) Rey Auditory Verbal Learning Test Delayed Recall, (B) Wechsler Memory Scale–Third Edition (WMS-III) Logical Memory Delayed Recall, and (C) WMS-III Verbal Paired Associates Delayed Free Recall. A well-calibrated model includes a 45° bisector (red line) with Giviti confidence bands (gray-shaded areas) that lie narrowly and evenly on either side. The above indicates that the updated models are well calibrated and that model predictions are a reliable estimate of risk. Narrower confidence bands at lower risks reflect the larger proportion of patients in the sample who are classified as low risk.

Figure 3. Prognostic Nomogram for Predicting Decline on Rey Auditory Verbal Learning Test Delayed Recall.

To use the nomogram, locate the patient's position on the scale associated with each predictor. Top axis displays prognostic points. Connect the position on each variable axis with the points axis to determine the number of points corresponding to the appropriate variable position. Total the points for all variables; then find the appropriate position on the total points axis and connect it with the associated position on the risk of decline axis to determine the patient's individual risk. The Rey Auditory Verbal Learning Test (RAVLT) Delay score is reported using the raw score for the Delayed Free Recall Trial. For example, a patient whose baseline RAVLT raw score was 15 (100 points), who had a dominant-sided resection (70 points), and who had the hippocampus resected (59 points) would have a total of 229 points and a corresponding approximate 79% risk of clinically meaningful postoperative decline in RAVLT Delayed Free Recall.

Figure 4. Prognostic Nomogram for Predicting Decline on WMS-III LMII.

To use the nomogram, locate the patient's position on the scale associated with each predictor. Top axis displays prognostic points. Connect the position on each variable axis with the points axis to determine the number of points corresponding to the appropriate variable position. Total the points for all variables; then find the appropriate position on the total points axis and connect it with the associated position on the risk of decline axis to determine the patient's individual risk. Logical Memory Delayed Recall (LMII) is reported with the age-corrected scaled score. For example, a patient whose baseline LMII scaled score was 5 (25 points) with an education of 8 years (−10 points) and a nondominant temporal lobe resection (0 points) would have a total of 15 points and a corresponding approximate 8% risk of clinically meaningful postoperative decline in LMII. There were no participants in this study with an LMII nomogram score >81; therefore, we cannot generate a specific probability for scores >81. Thus, for clinical purposes, any nomogram score >81 points should be interpreted as an ≥80% probability of decline. WMS-III = Wechsler Memory Scale–Third Edition.

Figure 5. Prognostic Nomogram for Predicting Decline on WMS-III VPAII.

To use the nomogram, locate the patient's position on the scale associated with each predictor. Top axis displays prognostic points. Connect the position on each variable axis with the points axis to determine the number of points corresponding to the appropriate variable position. Total the points for all variables; then find the appropriate position on the total points axis and connect it with the associated position on the risk of decline axis to determine the patient's individual risk. The Verbal Paired Associates Delayed Recall (VPAII) score is reported using the age-corrected scaled score. For example, a patient whose baseline VPAII scaled score was 12 (57 points) with an education of 15 years (−22 points) and a dominant temporal lobe resection (29 points) would have a total of 64 points and a corresponding approximate 45% risk of clinically meaningful postoperative decline in VPAII. There were no participants in this study with a VPAII nomogram score >81; therefore, we cannot generate a specific probability for scores >81. Thus, for clinical purposes, any nomogram score >81 points should be interpreted as a ≥72% probability of decline. WMS-III = Wechsler Memory Scale–Third Edition.
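The point-totaling step in the three worked examples from the nomogram captions can be checked with a short sketch. Only the per-predictor points quoted in the captions are used; the conversion from total points to risk is read off the nomogram axis and is not reproduced here. The helper names are hypothetical.

```python
def total_points(points_by_predictor):
    """Sum the nomogram points assigned to each predictor."""
    return sum(points_by_predictor.values())


def beyond_observed_range(total, cap=81):
    """True when the total exceeds the highest score observed (LMII and VPAII)."""
    return total > cap


# Points quoted in the figure captions.
ravlt = {"baseline raw score 15": 100, "dominant resection": 70,
         "hippocampus resected": 59}
lmii = {"baseline scaled score 5": 25, "education 8 years": -10,
        "nondominant resection": 0}
vpaii = {"baseline scaled score 12": 57, "education 15 years": -22,
         "dominant resection": 29}

print(total_points(ravlt))   # 229
print(total_points(lmii))    # 15
print(total_points(vpaii))   # 64
print(beyond_observed_range(total_points(vpaii)))  # False
```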

When analyses were rerun using only patients with confirmed left-hemisphere language dominance via fMRI or Wada, included variables and coefficients remained largely unchanged for all models.

Patients who were seizure-free after surgery had shorter intervals between preoperative and postoperative testing (median 11 vs 16 months, p < 0.001) and shorter intervals between surgery and postoperative testing (median 6 vs 10 months, p < 0.001) than patients with recurrent seizures. Patients who declined on LM Delay had longer testing intervals than patients who did not decline (median 14 vs 10 months between testing sessions, p < 0.001; median 7 vs 6 months between surgery and postoperative testing, p < 0.001). Seizure outcome was not associated with memory outcome.

Discussion

In this study, we developed multivariable models to estimate the probability of verbal memory decline after resective TLE surgery on 3 commonly used delayed verbal memory tasks: word-list learning (RAVLT Delay), story recall (LM Delay from WMS-III), and word-pair learning (VPA Delay from WMS-III). We provide visual nomograms and online calculators that can be used to aid clinicians in estimating a patient's individual risk for verbal memory decline after TLR using clinical information that is readily available in most surgical settings. These models are designed to assist clinicians in preoperative patient counseling by consolidating a number of risk factors, some of which may be contradictory, into a single model designed to estimate the probability of postoperative memory decline in a particular patient given his or her unique individual demographic and disease characteristics.

While the initial models built on the development cohort had similar receiver operating characteristic curves when applied to the validation cohort, we were not satisfied with the calibration of the models because the models built on the development sample somewhat underestimated the probability of decline for each test in the validation sample. A number of factors may have contributed to our less-than-optimal calibration. The sample sizes for the validation cohort were relatively modest for the WMS-III subtests compared to those of the development cohort; there was greater heterogeneity among patients in the validation cohort; and there were a number of differences in demographic and disease characteristics between the 2 cohorts (table 1). The time frame between preoperative and postoperative assessment and between surgery and postoperative assessment was also much shorter in the development cohort than in the validation cohort (table 1). These group differences are not unexpected given that the validation sample comprised patients from many different geographic regions, from San Diego, CA, to Toronto, Canada, with more diverse racial and ethnic backgrounds than the development cohort.

To overcome these challenges, we combined the development and validation samples to update model coefficients to better reflect both samples. There are several accepted methods for updating a prediction model after poor external validation; our choice to combine the samples and update all coefficients was driven largely by the relatively small sample sizes. After internal validation through bootstrap resampling, updated models had good to excellent discrimination and very good calibration (table 2 and figure 2, A–C). Specifically, RAVLT Delay achieved a c statistic of 0.81. This indicates that when 2 patients are presented, 1 with postoperative memory decline and another without postoperative memory decline, the model would assign the higher predicted risk to the patient with decline 81% of the time. The predictions generated by this model were very good representations of actual risk, as demonstrated by the calibration belt (figure 2A). The models for LM Delay and VPA Delay performed similarly with c statistics of 0.76 and 0.78, respectively (figure 2, B and C). Performance of these 3 models indicates that they are suitable for routine clinical use to predict verbal memory outcome after TLR for treatment of epilepsy.

Consistent with the existing literature, side of surgery and baseline verbal memory ability were important predictors in all 3 memory models.16-23 It is well known that there is some material specificity to episodic memory; the dominant temporal lobe plays a greater role in verbal memory functions, and the nondominant temporal lobe may play a greater role in visual memory functions.24,25 Thus, it is not surprising that resections in the language dominant temporal lobe are associated with greater risk for postoperative verbal memory decline, regardless of the type of verbal memory task used. The relationship between preoperative verbal memory ability and memory outcome after TLR has also been rather consistent across studies16-18,20,21; patients with intact memory functions before surgery are at greater risk for postoperative memory decline than those with preoperative memory deficits.9-11,13,14

Resection of the hippocampus was a top contender for inclusion in models predicting all 3 memory outcomes, and it was retained as a predictor in the final model for RAVLT Delay. The hippocampus plays a crucial role in the initial encoding of long-term memories, with rapid changes in neuronal activation and structural synaptic plasticity setting the stage for subsequent memory consolidation.26 The most common types of TLR for treatment of epilepsy, including standard anterior temporal lobectomy and amygdalohippocampectomy, involve resection of the hippocampus, which is highly epileptogenic.27,28 Research over the years has rather consistently demonstrated declines in episodic memory after TLR, particularly after resection in the language-dominant hemisphere.1

Education had more discriminative ability than hippocampal resection in the models for LM Delay and VPA Delay. Specifically, patients with higher educational attainment had better memory outcomes than those with fewer years of education. The inclusion of education vs hippocampal resection may be due in part to the difference in case mix between the sample that completed the RAVLT and the sample that completed LM and/or VPA. Nevertheless, this finding suggests that cognitive reserve is an important factor for memory outcome after TLR. The cognitive reserve hypothesis was proposed in the early 1990s to explain the variability in clinical expression observed despite similar brain injury/pathology.29,30 In the literature that followed, which spans a wide range of neurologic and neurodegenerative conditions, education has been one of the most commonly used proxies for cognitive reserve.31-35 Higher educational attainment is generally associated with higher cognitive test scores and better cognitive outcomes after a brain insult. Indeed, normative data for most episodic memory measures, including those used in this study, are adjusted for education level, and education has been shown to account for some of the variability in memory performance in patients with epilepsy.8,9,36,37 While the concepts of hippocampal adequacy and functional reserve have long been discussed in relation to memory outcome after temporal lobe surgery,38 we are unaware of any prediction models for postoperative memory outcome after TLR that have examined education as a potential predictor variable.

It is important to note that although other predictors that have been associated with verbal memory outcome in prior studies did not improve discriminative ability beyond the other variables in the models, this does not mean they are irrelevant. Rather, the variables included in our final models (surgery side, baseline memory, hippocampal resection, education) reflect the combinations of predictors that achieve the best discrimination within the constraints of our sample size. It is also quite possible that inclusion of other more sophisticated variables and more detailed variable characterization (e.g., type and extent of resection, imaging results, fMRI language asymmetry) may have further improved performance of our models. However, as noted previously, we intentionally limited our predictors to demographic and clinical variables readily available in most surgical centers with a simple coding scheme that would ensure consistency across sites. The variables we have selected should be easily ascertainable in the course of routine clinical care, thereby permitting more widespread use of these nomograms across surgical centers.

Given that these models were developed on a large, well-phenotyped group of patients who underwent TLR at 1 of 6 epilepsy centers in disparate regions of the United States (West, East, Midwest, South) and Canada (East), we expect they will have good generalizability to other large epilepsy centers in North America that use these memory measures (RAVLT, LM, VPA) to assess verbal memory in patients being evaluated for TLE surgery. Given that the development and validation samples were combined for our final models, additional studies to externally validate the models are warranted. When considering these models, it is important to note that not all patients in this study completed all 3 memory measures. Thus, each of the 3 models was built on a different subset of patients with different demographic characteristics. Therefore, differences in the models may be partially attributable to the different case mix, and the models should not be directly compared to each other.

While neuropsychological assessment is considered standard of care for epilepsy surgery at most centers, there may be variability in referral patterns and completion rates, particularly for postoperative assessments, across centers. For example, out-of-state patients may complete their postoperative follow-up visits locally; there may be limited insurance coverage for postoperative neuropsychological assessments; and some patients may fail to return for a scheduled postoperative assessment, particularly if they feel that they are doing well from a cognitive standpoint. As a result of these factors, or others, the base rate of observed cognitive decline may vary from center to center. Nevertheless, the base rates of memory decline observed in this study are well within the range reported in the literature across centers and diverse populations of patients undergoing TLE surgery.1 Furthermore, prediction models such as those reported in this study tend to be robust to variability in base rates because the distribution of predictors often varies along with the prevalence rate. Therefore, these models should have rather broad applicability, even in centers where the base rate of cognitive decline is somewhat higher or lower than that observed in this sample.
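The point that predictions can track base rates through case mix can be illustrated with a small simulation. This is a sketch under stated assumptions: the coefficients are hypothetical (they are not the published model coefficients), the two "centers" are invented, and only the proportion of left-sided resections differs between them. The logistic model itself is held fixed.

```python
import math
import random


def predicted_risk(left_side, baseline_z):
    """Logistic model with HYPOTHETICAL coefficients (illustration only)."""
    intercept, b_left, b_baseline = -2.0, 1.5, 0.6
    linear = intercept + b_left * left_side + b_baseline * baseline_z
    return 1.0 / (1.0 + math.exp(-linear))


def simulate_center(prop_left, n, rng):
    """Return (observed decline rate, mean predicted risk) for one center."""
    observed = 0
    total_pred = 0.0
    for _ in range(n):
        left = 1 if rng.random() < prop_left else 0
        baseline_z = rng.gauss(0.0, 1.0)  # standardized baseline score
        risk = predicted_risk(left, baseline_z)
        total_pred += risk
        # Outcome is drawn from the same model, so only case mix differs.
        observed += 1 if rng.random() < risk else 0
    return observed / n, total_pred / n


rng = random.Random(0)
center_a = simulate_center(prop_left=0.3, n=20000, rng=rng)
center_b = simulate_center(prop_left=0.7, n=20000, rng=rng)
print("center A: observed %.3f, predicted %.3f" % center_a)
print("center B: observed %.3f, predicted %.3f" % center_b)
```

In this sketch the center with more left-sided resections has the higher observed decline rate, and the unchanged model's mean predicted risk rises with it. The agreement holds only because the base-rate difference arises from a predictor in the model; differences driven by unmeasured factors would not be captured this way.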

It is important to note that while these models can aid clinical decision-making by estimating a particular individual's postoperative risk for memory decline, they do not provide information regarding the functional impact (e.g., social, vocational) that memory decline will have for a given patient or whether cognitive changes will affect quality of life. It is also unclear to what extent an individual would experience memory decline over time due to chronic epilepsy in the absence of surgical resection because we did not have data on a nonsurgical control group. Prospectively measured memory declines in individuals with unoperated TLE compared to control participants have been reported,39 along with the aforementioned literature replete with documentation of pretreatment to posttreatment memory decline after left TLR. Determination of the comparative risk and magnitude of memory decline in surgically vs medically treated patients with TLE requires a randomized prospective trial. Indirect cognitive comparisons of surgically vs medically treated cohorts, while of interest, can be confounded by reasons for nonoperation in medically treated patients (e.g., bilateral disease), variations in test-retest intervals, administered measures, and definitions of change.

Several study limitations deserve mention. First, we included only patients who underwent a surgical resection in the temporal lobe; therefore, results may not generalize to patients undergoing other types of surgical procedures (e.g., laser ablation, gamma knife, multiple transections) for treatment of their epilepsy. While we included patients with a wide range of surgical resection types from lesionectomy to standard anterior temporal resection, we tested only a single surgical predictor in our models (i.e., hippocampus resected vs spared) for ease and consistency of variable coding. Some evidence suggests that selective mesial resections are associated with better memory outcomes than anterior temporal lobectomies; however, this finding has not been consistent, and research suggests that the functional adequacy of the tissue to be resected is more strongly related to postoperative memory outcome than extent of the resection itself.1,40-42 Nevertheless, additional research on resection type/extent may be useful to identify variables that may enhance model performance. Second, in each model, baseline score had the highest relative importance compared to other predictors. We recognize that some component of the role of baseline score in predicting memory outcome can be attributed to the phenomenon of regression to the mean. In other words, patients with extreme scores at baseline may be likely to show scores closer to the mean at retest. Furthermore, patients with very low baseline scores would be less likely to show a decline at follow-up, simply because of the lower bounds of the test. Thus, a patient's baseline score should be considered in the interpretation of output from the model. 
That said, baseline verbal memory test scores and the risk of preoperative to postoperative decline have been shown to be associated with the integrity of the left mesial temporal lobe/hippocampus through quantitative and qualitative assessments of neuronal loss and morphometrics.43-45 Third, the story recall and word-pair tasks used in this study are from the WMS-III and not from the most current version of the WMS (fourth edition). While the stimuli remain similar between the 2 test versions, there are some potentially important differences; therefore, additional research will be needed to determine whether these models have utility for the WMS-IV subtests. Fourth, although calibration of the model improved after data from all sites were combined, a still larger sample would likely improve calibration further. Notably, estimates were very precise at the lower ranges of predicted probability but slightly less precise at the higher ranges. Fifth, the RCIs used in this study were developed in mixed cohorts of patients with focal epilepsy, albeit mostly TLE, tested an average of 7 to 8 months apart. All the patients in the current study had TLE, and most completed postoperative testing 6 to 12 months after surgery. While the majority of patients are rather similar in demographic, disease, and testing characteristics, it is possible that there may be a slight overestimation or underestimation of decline in our cohort. Furthermore, the longer-term outcomes and natural history of TLR-related cognitive morbidity remain to be fully clarified. While there have been some reports of cognitive improvement or stability after longer follow-up periods after left TLR, a large subset of patients appears to demonstrate additional declines over time,46-49 and further information is critically needed. It should be noted that a small subset of patients in this study had long test-retest intervals. 
Given that the RCIs used in this study were developed on patients with shorter retest intervals, decline may be overestimated in these individuals. Finally, the methods we used to characterize language lateralization are relatively simplistic and may not accurately reflect actual language representation for some patients. While this methodology permits application of these models to a wider range of patients (e.g., those without a language lateralization procedure), it does not fully consider the variability in language representation that can be observed in patients with epilepsy.50,51 These models should not be applied to left-handed or ambidextrous patients who have not had a language lateralization procedure or to those with symmetric bilateral language lateralization because such patients were excluded from this study.
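The regression-to-the-mean and floor effects noted among the limitations can be made concrete with a simulation in which no patient truly changes. This is a sketch only: the score range (0 to 15), the error and ability spreads, and the decline cutoff are arbitrary assumptions, and the cutoff is a simplified stand-in for the published RCIs rather than a reproduction of them.

```python
import random

MAX_SCORE = 15          # assumed test ceiling; the floor is 0
TRUE_SD = 3.0           # assumed spread of true ability
ERROR_SD = 2.0          # assumed measurement error at each testing
DECLINE_CUTOFF = 4      # assumed drop (in points) counted as "decline"


def observe(true_score, rng):
    """One noisy test administration, rounded and clipped to the test range."""
    raw = round(true_score + rng.gauss(0.0, ERROR_SD))
    return max(0, min(MAX_SCORE, raw))


def decline_rate_by_baseline(n, rng):
    """Apparent decline rate per baseline band when true scores never change."""
    groups = {"low": [0, 0], "middle": [0, 0], "high": [0, 0]}  # [declines, count]
    for _ in range(n):
        true_score = rng.gauss(7.5, TRUE_SD)
        baseline = observe(true_score, rng)
        retest = observe(true_score, rng)   # same true score: no real change
        if baseline <= 5:
            key = "low"
        elif baseline <= 9:
            key = "middle"
        else:
            key = "high"
        groups[key][0] += 1 if baseline - retest >= DECLINE_CUTOFF else 0
        groups[key][1] += 1
    return {k: d / c for k, (d, c) in groups.items()}


rates = decline_rate_by_baseline(50000, random.Random(1))
for band in ("low", "middle", "high"):
    print(band, round(rates[band], 3))
```

Under these assumptions, apparent decline is most frequent in the high-baseline band and least frequent in the low-baseline band, even though true ability is constant. Patients with baseline scores below the cutoff cannot meet the decline criterion at all, which is the floor effect.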

In conclusion, we observed declines in delayed verbal recall in 20% to 30% of individuals who underwent a dominant TLR. This study provides easy-to-use nomograms to predict delayed verbal memory outcome on 3 commonly used measures in adults considering TLR for treatment of epilepsy. We hope that these tools will be useful in helping clinicians consolidate multiple risk factors, which are often contradictory, to counsel patients regarding their individual risk for postoperative memory decline. These models can be used in conjunction with the models we previously developed for predicting naming outcome after TLR.7 Online calculators for all models are also available.15 We hope that these tools will help to improve preoperative decision-making and patient counseling. Future research will seek to examine other aspects of memory (e.g., verbal learning, visual memory) and other postoperative outcomes (e.g., mood), as well as the impact of cognitive decline on day-to-day functioning and quality of life.

Acknowledgment

The authors thank Xinge Ji for programming the online risk calculator. Some of the study data were collected and managed with the use of REDCap (Research Electronic Data Capture) electronic data capture tools hosted at Cleveland Clinic. REDCap is a secure, web-based software platform designed to support data capture for research studies, providing (1) an intuitive interface for validated data capture, (2) audit trails for tracking data manipulation and export procedures, (3) automated export procedures for seamless data downloads to common statistical packages, and (4) procedures for data integration and interoperability with external sources.

Glossary

LM

Logical Memory

RAVLT

Rey Auditory Verbal Learning Test

RCI

reliable change index

TLE

temporal lobe epilepsy

TLR

temporal lobe resection

VPA

Verbal Paired Associates

WMS-III

Wechsler Memory Scale–Third Edition

Appendix. Authors

Footnotes

Class of Evidence: NPub.org/coe

Study Funding

Primary support for this study was provided by the Cleveland Clinic Epilepsy Center and the NIH/National Institute of Neurological Disorders and Stroke (NINDS) (R01 NS097719 to R.M.B., O.H., M.M., L.F., M.W.K., L.J.).

Disclosure

Dr. Busch receives support by grants from the NIH/NINDS (R01 NS097719, R01 NS035140). Ms. Hogue, Dr. Miller, and Ms. Ferguson receive support by a grant from the NIH/NINDS (R01 NS097719). Dr. McAndrews reports no disclosures. Dr. Hamberger receives support by grants from the NIH/NINDS (R01 NS35140, R01 NS083976, R03 NS111180). Dr. Kim receives support by grants from the NIH/NINDS (2U01NS038455-16A1 and R01NS088748) and research support from Medtronic, Inc to serve as the local neuropsychologist on a multisite trial of stereotactic laser amygdalohippocampotomy (A118529). Dr. McDonald receives support by grants from the NIH/NINDS (NIH R01NS065838, NIH R21NS107739). Ms. Reyes receives support by a fellowship from the NIH/NINDS (F31 NS111883-01). Dr. Drane's laboratory receives funding from the NIH/NINDS (R01NS088748) and the National Institute of Mental Health (R01MH118514), as well as funds from Medtronic, Inc to serve as the core center for managing neuroimaging and cognitive analysis of data used in their multisite trial of stereotactic laser amygdalohippocampotomy (A1321808). Dr. Hermann receives support by grants from the NIH/NINDS (R01 NS111022, R01 NS44351), National Institute on Aging (R01 AG027161-11), and Citizens United for Research in Epilepsy. Dr. Bingaman and Dr. Najm report no disclosures. Go to Neurology.org/N for full disclosures.

References

  • 1. Sherman EMS, Wiebe S, Fay-McClymont TB, et al. Neuropsychological outcomes after epilepsy surgery: systematic review and pooled estimates. Epilepsia. 2011;52(5):857-869.
  • 2. Langfitt JT, Westerveld M, Hamberger MJ, et al. Worsening of quality of life after epilepsy surgery: effect of seizures and memory decline. Neurology. 2007;68(23):1988-1994.
  • 3. Giovagnoli AR, Parente A, Tarallo A, Casazza M, Franceschetti S, Avanzini G. Self-rated and assessed cognitive functions in epilepsy: impact on quality of life. Epilepsy Res. 2014;108(8):1461-1468.
  • 4. Rausch R, Kraemer S, Pietras CJ, Le M, Vickrey BG, Passaro EA. Early and late cognitive changes following temporal lobe surgery for epilepsy. Neurology. 2003;60(6):951-959.
  • 5. Busch RM, Naugle RI. Pre-surgical neuropsychological workup: risk factors for post-surgical deficits. In: Lüders H, editor. Textbook of Epilepsy Surgery. Informa HealthCare; 2008:817-825.
  • 6. Jehi L, Yardi R, Chagin K, et al. Development and validation of nomograms to provide individualised predictions of seizure outcomes after epilepsy surgery: a retrospective analysis. Lancet Neurol. 2015;14(3):283-290.
  • 7. Busch RM, Hogue O, Kattan MW, et al. Nomograms to predict naming decline after temporal lobe surgery in adults with epilepsy. Neurology. 2018;91(23):e2144-e2152.
  • 8. Martin R, Sawrie S, Gilliam F, et al. Determining reliable cognitive change after epilepsy surgery: development of reliable change indices and standardized regression-based change norms for the WMS-III and WAIS-III. Epilepsia. 2002;43(12):1551-1558.
  • 9. Sawrie SM, Chelune GJ, Naugle RI, Luders HO. Empirical methods for assessing meaningful neuropsychological change following epilepsy surgery. J Int Neuropsychol Soc. 1996;2(6):556-564.
  • 10. Dulay MF, Busch RM. Prediction of neuropsychological outcome after resection of temporal and extratemporal seizure foci. Neurosurg Focus. 2012;3(3):E4.
  • 11. RMS. Regression Modeling Strategies, R Package Version 5.1-2 [computer program]. Department of Biostatistics, Vanderbilt University; 2017.
  • 12. Harrell FE Jr., Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med. 1996;15(4):361-387.
  • 13. Ji X, Kattan MW. Tutorial: development of an online risk calculator platform. Ann Transl Med. 2018;6(3):46.
  • 14. Moons KG, Kengne AP, Grobbee DE, et al. Risk prediction models, II: external validation, model updating, and impact assessment. Heart. 2012;98(9):691-698.
  • 15. Predicting cognitive decline after temporal lobe epilepsy surgery. Accessed October 29, 2020. Available at: riskcalc.org/CognitiveAfterEpilepsySurgery/.
  • 16. Baxendale S, Thompson P, Harkness W, Duncan J. Predicting memory decline following epilepsy surgery: a multivariate approach. Epilepsia. 2006;47(11):1887-1894.
  • 17. Chelune GJ, Najm IM. Risk factors associated with postsurgical decrements in memory. In: Luders HO, Comair Y, editors. Epilepsy Surgery. Lippincott-Raven; 2000:497-504.
  • 18. Hermann BP, Seidenberg M, Haltiner A, Wyler AR. Relationship of age at onset, chronologic age, and adequacy of preoperative performance to verbal memory change after anterior temporal lobectomy. Epilepsia. 1995;36(2):137-145.
  • 19. Chelune GJ, Naugle RI, Luders H, Awad IA. Prediction of cognitive change as a function of preoperative ability status among temporal lobectomy patients seen at 6-month follow-up. Neurology. 1991;41(3):399-404.
  • 20. Davies KG, Bell BD, Bush AJ, Wyler AR. Prediction of verbal memory loss in individuals after anterior temporal lobectomy. Epilepsia. 1998;39(8):820-828.
  • 21. Helmstaedter C, Elger CE. Cognitive consequences of two-thirds anterior temporal lobectomy on verbal memory in 144 patients: a three-month follow-up study. Epilepsia. 1996;37(2):171-180.
  • 22. Jokeit H, Ebner A, Holthausen H, et al. Individual prediction of change in delayed recall of prose passages after left-sided anterior temporal lobectomy. Neurology. 1997;49(2):481-487.
  • 23. Bell BD, Davies KG, Haltiner AM, Walters GL. Intracarotid amobarbital procedure and prediction of postoperative memory in patients with left temporal lobe epilepsy and hippocampal sclerosis. Epilepsia. 2000;41(8):992-997.
  • 24. Milner B. Disorders of learning and memory after temporal lobe lesions in man. Clin Neurosurg. 1972;19:421-446.
  • 25. Saling MM. Verbal memory in mesial temporal lobe epilepsy: beyond material specificity. Brain. 2009(pt 3):570-582.
  • 26. Frankland PW, Bontempi B. The organization of recent and remote memories. Nat Rev Neurosci. 2005;6(2):119-130.
  • 27. Schaller K, Cabrilo I. Anterior temporal lobectomy. Acta Neurochirurg. 2016;158(1):161-166.
  • 28. de Almeida AN, Teixeira MJ, Feindel WH. From lateral to mesial: the quest for a surgical cure for temporal lobe epilepsy. Epilepsia. 2008;49(1):98-107.
  • 29. Satz P. Brain reserve capacity on symptom onset after brain injury: a formulation and review of evidence for threshold theory. Neuropsychology. 1993;7:273-295.
  • 30. Stern Y. What is cognitive reserve? Theory and research applications of the reserve concept. J Int Neuropsychol Soc. 2002;8(3):448-460.
  • 31. Stern Y. Cognitive reserve. Neuropsychologia. 2009;47(10):2015-2028.
  • 32. Xu W, Yu JT, Tan MS, Tan L. Cognitive reserve and Alzheimer's disease. Mol Neurobiol. 2015;51(11):187-208.
  • 33. Hindle JV, Martyr A, Clare L. Cognitive reserve and Parkinson's disease: a systematic review and meta-analysis. Parkinsonism Relat Disord. 2014;20(1):1-7.
  • 34. Staff RT. Reserve, brain changes, decline. Neuroimaging Clin N Am. 2012;22(1):99-105.
  • 35. Tucker AM, Stern Y. Cognitive reserve in aging. Curr Alzheimer. 2011;8(4):354-360.
  • 36. Chang Y-HA, Marshall A, Bahrami N, Mathur K, et al. Differential sensitivity of structural, diffusion, and resting-state functional MRI for detecting brain alterations and verbal memory impairment in temporal lobe epilepsy. Epilepsia. 2019;60(5):935-947.
  • 37. Hermann B, Seidenberg M, Schoenfeld J, Peterson J, Leveroni C, Wyler AR. Empirical techniques for determining reliability, magnitude, and pattern of neuropsychological change after epilepsy surgery. Epilepsia. 1996;37(10):942-950.
  • 38. Helmstaedter C. Neuropsychological aspects of epilepsy surgery. Epilepsy Behav. 2004;5(suppl 1):S45-S55.
  • 39. Hermann BP, Seidenberg M, Dow C, et al. Cognitive prognosis in chronic temporal lobe epilepsy. Ann Neurol. 2006;60(1):80-87.
  • 40. Schramm J. Temporal lobe epilepsy surgery and the quest for optimal extent of resection: a review. Epilepsia. 2008;49(8):1296-1307.
  • 41. Chelune GJ. Hippocampal adequacy versus functional reserve: predicting memory functions following temporal lobectomy. Arch Clin Neuropsychol. 1995;10(5):413-432.
  • 42. Wolf RL, Ivnik RJ, Hirschorn KA, Sharbrough FW, Cascino GD, Marsh WR. Neurocognitive efficiency following left temporal lobectomy: standard versus limited resection. J Neurosurg. 1993;79(1):76-83.
  • 43. Hermann BP, Wyler AR, Somes G, Berry AD, Dohan JC. Pathological status of the mesial temporal lobe predicts memory outcome from left anterior temporal lobectomy. Neurosurgery. 1992;31(4):652-657.
  • 44. Sass KJ, Westerveld M, Buchanan CP, Spencer SS, Kim JH, Spencer DD. Degree of hippocampal neuron loss determines severity of verbal memory decrease after left anteriomesiotemporal lobectomy. Epilepsia. 1994;35(6):1179-1186.
  • 45. Trenerry MR, Jack CR Jr., Ivnik RJ, et al. MRI hippocampal volumes and memory function before and after temporal lobectomy. Neurology. 1993;43(9):1800-1805.
  • 46. Grammaldo LG, Di Gennaro G, Giampà T, et al. Memory outcome 2 years after anterior temporal lobectomy in patients with drug-resistant epilepsy. Seizure. 2009;18(2):139-144.
  • 47. Alpherts WC, Vermeulen J, van Rijen PC, da Silva FH, van Veelen CW. Verbal memory decline after temporal epilepsy surgery? A 6-year multiple assessments follow-up study. Neurology. 2006;67(4):626-631.
  • 48. Baxendale SA, Thompson PJ. Postoperative hippocampal remnant shrinkage and memory decline: a dynamic process. Neurology. 2000;55(2):243-249.
  • 49. Engman E, Andersson-Roswall L, Samuelsson H, Malmgren K. Serial cognitive change patterns across time after temporal lobe resection for epilepsy. Epilepsy Behav. 2006;8(4):765-772.
  • 50. Dijkstra KK, Ferrier CH. Patterns and predictors of atypical language representation in epilepsy. J Neurol Neurosurg Psychiatry. 2013;84(4):379-385.
  • 51. Goldmann RE, Golby AJ. Atypical language representation in epilepsy: implications for injury-induced reorganization of brain function. Epilepsy Behav. 2005;6(4):473-487.

Data Availability Statement

The datasets analyzed in the current study are not publicly available because of restricted access, but further information about the datasets is available from the corresponding author on reasonable request.


Articles from Neurology are provided here courtesy of American Academy of Neurology
