Author manuscript; available in PMC: 2026 Feb 1.
Published in final edited form as: Parkinsonism Relat Disord. 2024 Dec 16;131:107245. doi: 10.1016/j.parkreldis.2024.107245

Short-term cognitive practice effects in Parkinson’s disease: More than meets the eye

Kevin Duff a,*, Julia V Vehar b, Daniel Weintraub c
PMCID: PMC11769728  NIHMSID: NIHMS2044003  PMID: 39705907

Abstract

Introduction:

Although practice effects (PE) on repeated cognitive testing have received growing interest in Alzheimer’s disease, they have been understudied in Parkinson’s disease (PD). The current paper examined PE across one week in a sample of patients with PD via traditional methods and regression-based change scores, as well as whether these change scores relate to clinical variables in PD.

Methods:

Thirty-five patients with PD were administered a brief cognitive battery twice across approximately one week. Using both simple-difference and standardized regression-based change scores, a series of one-sample and independent t-tests were run to assess for PE across the test battery. Pearson correlations were run between both types of change scores and measures of mood and severity of motor symptoms.

Results:

Whereas traditional analyses (i.e., simple difference scores and dependent t-tests) did not reveal any changes on test scores over this interval, regression-based change scores did identify that these individuals showed significantly smaller-than-expected PE on three of the seven cognitive scores. Furthermore, when these regression-based change scores were trichotomized (decline/stable/improve), four of the seven tests showed significantly more decline than expected in this sample. Finally, these regression-based change scores significantly correlated with motor measures, with smaller PE being associated with worse motor functioning.

Conclusion:

Although these results are preliminary and need to be replicated in larger and more diverse samples, smaller-than-expected PE are seen in PD and they may signal more advanced disease.

Keywords: Parkinson’s disease, practice effects, cognition

Introduction

Practice effects (PE) are improvements in cognitive test scores due to repeated exposure to the same or similar test materials, which can obscure actual change in patients with neurological disease. In Alzheimer’s disease, smaller-than-expected PE have been linked to prognosis [1] and biomarker status [2]. Conversely, there is scant literature on PE in Parkinson’s disease (PD), where repeat evaluations are common (e.g., pre- and post-deep brain stimulation surgery, clinical trials of medications).

In longitudinal studies examining cognition in PD, the assessment of PE has been complicated by longer retest intervals and interim interventions. For example, in treatment-as-usual observational and interventional studies across 0.5 – 2 years, minimal PE were reported [3,4,5,6,7,8,9]. Even across briefer periods (5 days to 10 weeks) without interventions, PE have largely been absent in PD [10,11]. However, none of these briefer, non-intervention studies has examined more sophisticated calculations of cognitive change (e.g., standardized regression-based change scores) that might identify smaller-than-expected PE in PD. It was expected that these sophisticated calculations of PE would identify PE better than simple difference methods, and that they would be more strongly related to clinical characteristics of PD.

Method

Participants

Thirty-five patients with PD (mean age = 69.1 years [SD = 7.8], mean education = 16.7 years [SD = 2.1], 63% were male, 100% White, 9% Hispanic) were recruited from the Clinical Core at the University of Pennsylvania (U19 AG062418) with a Montreal Cognitive Assessment (MoCA) score ≥ 20 and reliable internet connection. On medication (levodopa equivalency daily dose Time 1: M = 906.5 mg, SD = 570.8), mean Unified Parkinson’s Disease Rating Scale Part III (UPDRS Part III) was 23.3 (SD = 11.7, range = 5 – 63), most were Hoehn & Yahr stage 2 (71%, stage 3 = 20%, stage 4 = 9%), and they had minimal depression (Geriatric Depression Scale 15-item (GDS): M = 2.6, SD = 2.3, range = 0 – 8) and varying cognition (MoCA: M = 26.6, SD = 2.7, range = 22 – 30, 69% with MoCA>25). Based on consensus, 66% were classified as cognitively intact and 34% as cognitively impaired (e.g., MCI or dementia).

Measures

A neuropsychological battery assessing multiple cognitive domains was individually administered by trained research assistants. It included:

  • Symbol Digit Modalities Test (SDMT) is a test of divided attention and processing speed, and the score is the number of correctly paired numbers and symbols in 90 seconds.

  • Trail Making Test Part A (TMTA) is a test of visual scanning and processing speed, and the score is time to completion.

  • Trail Making Test Part B (TMTB) is a test of set shifting and processing speed, and the score is time to completion.

  • Phonemic fluency is a test of verbal fluency, and the score is the total number of words produced across three 60-second trials (F, A, and S).

  • Semantic fluency is a test of verbal fluency, and the score is the total number of words produced across one 60-second trial (animals).

  • Hopkins Verbal Learning Test-Revised (HVLT-R) Total Recall is a test of learning of a list of 12 words, and the score is the total number of words recalled over three learning trials.

  • HVLT-R Delayed Recall is a test of memory, and the score is the total number of words recalled after a 20-minute delay.

Higher scores on these measures indicated better cognition, with the exception of TMTA and TMTB.

Other measures that characterized the sample included:

  • GDS is a self-report screener for depression in older adults.

  • UPDRS Part III is a clinician-administered motor exam of 14 items.

  • Hoehn & Yahr scale classifies PD symptom progression into five stages of motor and functional disability.

Higher scores on these measures indicate more depression, motor impairment, or disability.

Procedures

Study procedures were approved by an institutional review board (University of Pennsylvania Institutional Review Board, PROTOCOL #: 820710, last approved 1/4/24), and all participants provided informed consent. The neuropsychological tests were administered twice (Time 1 and Time 2) across a mean of 5.4 days (SD = 1.7, range = 3 – 9), with the administration type (in-person or virtual) being randomized and counterbalanced, so that each participant was tested with each administration type. Alternate forms were not used for the neuropsychological measures, except for the HVLT-R. Finally, the same assessor completed both visits for each participant to minimize inter-rater variability.

Data Analyses

Two sets of PE scores were calculated on seven cognitive test scores. First, a simple difference value was calculated as Time 2 – Time 1, which is consistent with traditional methods of examining PE. Second, the standardized regression-based (SRB) change formulae of Hammers et al. [12] quantified PE. These formulae were developed on 200 robustly cognitively intact older adults who were administered the same cognitive measures across one week; Time 1 scores, age, education, sex, and retest interval were used to predict Time 2 scores on each cognitive measure. Predicted Time 2 scores were subtracted from observed Time 2 scores and divided by the standard error of the estimate from the regression models. The resulting z-score for each test indicated how much the current sample deviated from the expected PE in the Hammers et al. sample, with positive values indicating more improvement than expected, and negative values indicating less improvement than expected (with z-scores of ≤−1.645 indicating “smaller-than-expected” PE [12]). These z-scores were calculated for: SDMT, TMTA, TMTB, phonemic fluency, semantic fluency, HVLT-R Total Recall, and HVLT-R Delayed Recall. The signs of the z-scores were reversed for TMTA and TMTB so that all positive values indicated more improvement than expected.
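The SRB calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the Hammers et al. [12] implementation: the regression weights and the standard error of the estimate shown here are hypothetical placeholders (the published coefficients are not reproduced in this paper), the class and function names are invented for the example, and the symmetric upper cutoff of +1.645 for "improve" is an assumption, since only the lower cutoff is stated in the text.

```python
from dataclasses import dataclass

CUTOFF = 1.645  # lower cutoff stated in the text; the upper cutoff is assumed symmetric


@dataclass
class SrbModel:
    """Regression weights predicting a Time 2 score (all values are hypothetical)."""
    intercept: float
    b_time1: float
    b_age: float
    b_education: float
    b_sex: float
    b_interval: float
    see: float                     # standard error of the estimate of the regression
    lower_is_better: bool = False  # True for timed tests such as TMTA and TMTB

    def predict(self, time1, age, education, sex, interval_days):
        """Predicted Time 2 score from Time 1 score and demographics."""
        return (self.intercept
                + self.b_time1 * time1
                + self.b_age * age
                + self.b_education * education
                + self.b_sex * sex
                + self.b_interval * interval_days)

    def z_score(self, observed_time2, time1, age, education, sex, interval_days):
        """(observed - predicted) / SEE, sign-reversed when lower scores are better."""
        predicted = self.predict(time1, age, education, sex, interval_days)
        z = (observed_time2 - predicted) / self.see
        return -z if self.lower_is_better else z


def classify(z, cutoff=CUTOFF):
    """Trichotomize an SRB z-score into decline / stable / improve."""
    if z <= -cutoff:
        return "decline"
    if z >= cutoff:
        return "improve"
    return "stable"


# Hypothetical model and participant, for illustration only.
example_model = SrbModel(intercept=5.0, b_time1=0.8, b_age=-0.05,
                         b_education=0.1, b_sex=0.5, b_interval=0.0, see=3.0)
z_example = example_model.z_score(observed_time2=20, time1=22, age=70,
                                  education=16, sex=1, interval_days=5)
print(round(z_example, 2), classify(z_example))
```

With these placeholder weights the predicted Time 2 score is 21.2, so an observed score of 20 falls 0.4 standard errors below expectation and is classified as stable.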

First, each set of PE scores was compared to 0 with a one-sample t-test to determine whether it showed change in this sample relative to the expected no-change point of 0. Second, the cognitively intact participants were compared to the cognitively impaired participants on each set of PE scores with independent t-tests. Finally, Pearson correlations examined the associations of each set of PE scores with depression (GDS-15) and motor functioning (UPDRS Part III, Hoehn & Yahr). A one- or two-tailed alpha level was set at .05 for all statistical analyses, depending on the expected directionality of the results (described below).

Results

Cognitive Change

For the simple difference scores, the means are all quite close to 0 (Table 1), suggesting “no change.” Statistically, none of the simple difference scores significantly differed from 0 (all p-values > .05; two-tailed tests were used as either lower or higher scores were of interest). Conversely, the SRB z-scores indicated more change in the sample (Table 1). On the one-sample t-tests, both HVLT-R scores and the SDMT were significantly below 0 (Total Recall: t(34) = −6.80, p < .001, d = −1.15; Delayed Recall: t(34) = −5.32, p < .001, d = −0.90; SDMT: t(34) = −3.73, p < .001, d = −0.63), with medium to large effect sizes. Although other neuropsychological test scores failed to reach statistical significance, small to medium effect sizes were observed (e.g., TMTA d = −0.31, TMTB d = −0.22, Verbal Fluency FAS d = 0.29). Two-tailed tests were also used for these analyses.

Table 1.

Descriptive and One-Sample t-test Results

                            Time 1          Time 2
Raw Scores                  M       SD      M       SD
 HVLT-R – Total Recall      21.91   4.57    21.83   5.64
 HVLT-R – Delayed Recall    7.34    3.24    6.46    3.75
 TMT A                      44.31   19.27   44.50   24.52
 TMT B                      77.21   34.19   78.84   44.51
 SDMT                       35.51   14.92   33.66   12.67
 Verbal Fluency – FAS       46.29   13.11   47.34   15.34
 Verbal Fluency – Animals   19.60   5.45    19.49   6.28

                            n     M       SD      t(df)       p       Cohen’s d
Simple Difference Scores
 HVLT-R – Total Recall      35    −0.09   5.10    −0.10(34)   .921    −0.02
 HVLT-R – Delayed Recall    35    −0.89   3.74    −1.40(34)   .170    −0.24
 TMT A                      32    −0.19   15.70   −0.07(31)   .947    −0.01
 TMT B                      19    −1.63   32.08   −0.22(18)   .827    −0.05
 SDMT                       35    −1.86   13.02   −0.84(34)   .405    −0.14
 Verbal Fluency – FAS       35    1.06    8.75    0.72(34)    .480    0.12
 Verbal Fluency – Animals   35    −0.11   4.30    −0.16(34)   .876    −0.03
SRB Change z-Scores
 HVLT-R – Total Recall      35    −1.66   1.44    −6.80(34)   <.001   −1.15
 HVLT-R – Delayed Recall    35    −2.60   2.89    −5.32(34)   <.001   −0.90
 TMT A                      32    −0.37   1.21    −1.73(31)   .094    −0.31
 TMT B                      19    −0.34   1.58    −0.95(18)   .355    −0.22
 SDMT                       35    −1.48   2.35    −3.73(34)   <.001   −0.63
 Verbal Fluency – FAS       35    0.38    1.28    1.74(34)    .091    0.29
 Verbal Fluency – Animals   35    −0.19   1.18    −0.93(34)   .360    −0.16

Note. N = 35. Results of one-sample t-tests are presented above, with change scores compared to a test value of 0. Z-score signs were reversed for TMT A and B, with positive signs reflecting more improvement than expected and negative scores reflecting more decline than expected. HVLT-R = Hopkins Verbal Learning Test-Revised; SDMT = Symbol Digit Modalities Test; SRB = standardized regression-based; TMT = Trail Making Test.
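As a check on how the one-sample statistics relate to the summary values, the t statistic and Cohen’s d can be recomputed from a reported mean, standard deviation, and n. The sketch below uses the SRB z-score row for HVLT-R Total Recall in Table 1 (M = −1.66, SD = 1.44, n = 35); the function name is illustrative, and because the inputs are rounded to two decimals the recomputed t is expected to differ slightly from the tabled value.

```python
import math


def one_sample_t_and_d(mean, sd, n, test_value=0.0):
    """One-sample t statistic, Cohen's d, and df from summary statistics."""
    standard_error = sd / math.sqrt(n)
    t_stat = (mean - test_value) / standard_error
    cohens_d = (mean - test_value) / sd
    return t_stat, cohens_d, n - 1


# HVLT-R Total Recall, SRB change z-scores (values taken from Table 1).
t_hvlt, d_hvlt, df_hvlt = one_sample_t_and_d(mean=-1.66, sd=1.44, n=35)
print(f"t({df_hvlt}) = {t_hvlt:.2f}, d = {d_hvlt:.2f}")
```

The recomputed values land close to the reported t(34) = −6.80 and d = −1.15, which illustrates that d here is simply the mean z-score divided by its standard deviation.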

Cognitive Change Between Subsamples

Simple difference scores suggested minimal change between subsamples (Table 2), with all independent t-tests being nonsignificant (all p-values > .05, one-tailed, as only worse performance in the impaired subsample was of interest). For the SRB z-scores, the impaired subsample exhibited more decline than the intact subsample (Table 2) on HVLT-R Total Recall (t(33) = 1.96, p = .029, d = 0.70), TMTA (t(30) = 1.89, p = .034, d = 0.72), TMTB (t(17) = 2.29, p = .018, d = 1.19), Verbal Fluency FAS (t(33) = 1.92, p = .032, d = 0.68), and Verbal Fluency Animals (t(33) = 1.82, p = .039, d = 0.65), with medium to large effect sizes.

Table 2.

Independent Samples t-test Results

                            Normal Cognition                  Impaired Cognition
                            Time 1          Time 2            Time 1          Time 2
Raw Scores                  M       SD      M       SD        M       SD      M       SD
 HVLT-R – Total Recall      23.35   3.94    23.87   4.76      19.17   4.59    17.92   5.25
 HVLT-R – Delayed Recall    8.65    2.25    7.65    3.31      4.83    3.43    4.17    3.59
 TMT A                      38.50   14.11   37.23   14.23     57.10   23.45   60.50   34.41
 TMT B                      77.21   34.19   78.84   44.51     113.60  31.07   125.00  57.64
 SDMT                       35.51   14.92   33.66   12.67     25.17   7.95    24.83   10.49
 Verbal Fluency – FAS       46.29   13.11   47.34   15.34     41.08   10.33   39.25   17.57
 Verbal Fluency – Animals   19.60   5.45    19.49   6.28      16.08   4.98    15.33   5.71

                            Normal Cognition        Impaired Cognition
                            n     M       SD        n     M       SD      t(df)       p       d
Simple Difference Scores
 HVLT-R – Total Recall      23    0.52    5.08      12    −1.25   5.15    0.98(33)    .168    0.35
 HVLT-R – Delayed Recall    23    −1.00   3.48      12    −0.67   4.36    −0.25(33)   .403    −0.09
 TMT A                      22    −1.27   8.45      10    3.40    25.80   0.78(30)    .222    0.30
 TMT B                      14    −1.86   16.44     5     11.40   59.93   0.79(17)    .222    0.41
 SDMT                       23    −2.65   15.70     12    −0.33   5.21    −0.50(33)   .312    −0.18
 Verbal Fluency – FAS       23    2.57    7.91      12    −1.83   9.87    1.43(33)    .081    0.51
 Verbal Fluency – Animals   23    0.22    4.12      12    −0.75   4.73    0.63(33)    .268    0.22
SRB Change z-scores
 HVLT-R – Total Recall      23    −1.33   1.38      12    −2.30   1.41    1.96(33)    .029    0.70
 HVLT-R – Delayed Recall    23    −2.13   2.74      12    −3.51   3.07    1.36(33)    .092    0.48
 TMT A                      22    −0.11   0.61      10    −0.95   1.91    1.89(30)    .034    0.72
 TMT B                      14    0.10    0.86      5     −1.59   2.48    2.29(17)    .018    1.19
 SDMT                       23    −1.57   2.80      12    −1.32   1.19    −0.29(33)   .385    −0.11
 Verbal Fluency – FAS       23    0.67    1.02      12    −0.18   1.57    1.92(33)    .032    0.68
 Verbal Fluency – Animals   23    0.07    1.10      12    −0.67   1.22    1.82(33)    .039    0.65

Note. Results of independent-sample t-tests comparing the cognitively intact (N = 23) and cognitively impaired (N = 12) samples are presented above, with change scores compared to a test value of 0. Z-score signs were reversed for TMT A and B, with positive signs reflecting more improvement than expected and negative scores reflecting more decline than expected. HVLT-R = Hopkins Verbal Learning Test-Revised; SDMT = Symbol Digit Modalities Test; SRB = standardized regression-based; TMT = Trail Making Test. p-values are one-tailed. d = Cohen’s d.
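The group comparison can likewise be recomputed from the summary values. The sketch below assumes the pooled-variance (Student) form of the independent-samples t-test, which is not stated explicitly in the text but is consistent with the reported degrees of freedom (n1 + n2 − 2 = 33), and it assumes Cohen’s d was computed with the pooled standard deviation. The inputs are the SRB z-score row for HVLT-R Total Recall in Table 2; the function name is illustrative.

```python
import math


def pooled_t_and_d(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance t statistic, Cohen's d, and df from two groups' summary stats."""
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    pooled_sd = math.sqrt(pooled_variance)
    standard_error = pooled_sd * math.sqrt(1 / n1 + 1 / n2)
    difference = mean1 - mean2
    return difference / standard_error, difference / pooled_sd, df


# HVLT-R Total Recall, SRB change z-scores: intact group first, impaired group second.
t_group, d_group, df_group = pooled_t_and_d(mean1=-1.33, sd1=1.38, n1=23,
                                            mean2=-2.30, sd2=1.41, n2=12)
print(f"t({df_group}) = {t_group:.2f}, d = {d_group:.2f}")
```

Under these assumptions the recomputed statistics agree with the tabled t(33) = 1.96 and d = 0.70 to two decimals.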

Association with Functioning and Mood

For the simple difference scores, UPDRS Part III was correlated with TMTA (r = .48, p = .006) and HVLT-R Total Recall (r = −.37, p = .028). Hoehn & Yahr was correlated with TMTB (r = .52, p = .024), Verbal Fluency FAS (r = −.47, p = .005), and HVLT-R Total Recall (r = −.39, p = .020). For the SRB z-scores of change, UPDRS Part III was correlated with TMTA (r = −.73, p < .001), Verbal Fluency Animals (r = −.45, p = .007), and HVLT-R Total Recall (r = −.41, p = .015). Hoehn & Yahr was correlated with TMTB (r = −.74, p < .001), Verbal Fluency FAS (r = −.50, p = .002), TMTA (r = −.44, p = .012), HVLT-R Total Recall (r = −.43, p = .010), HVLT-R Delayed Recall (r = −.36, p = .032), and Verbal Fluency Animals (r = −.34, p = .049). For these correlations, smaller-than-expected PE were associated with worse motor functioning/greater disability. There was no association between either set of PE scores and GDS.

Discussion

Consistent with the results of Gallagher et al. [11], no PE were observed on several neuropsychological tests using simple difference scores on one-sample t-tests. The absence of PE aligns with prior literature in PD using deviation-based change scores [3,9,11]. However, when SRB z-scores were used to quantify PE, statistically significant declines (or smaller-than-expected PE) were observed on multiple cognitive measures. Individuals with PD showed PE on list learning, list recall, and processing speed that were approximately 1.5 – 2.5 standard deviations below expectations. The effect sizes for these changes across five days were in the medium to large range. Also noteworthy are the large standard deviations of these SRB z-scores, which indicate significant variability within the group, as some individuals likely show much greater change than the overall group. The studies that have utilized these more sophisticated methods have identified lower-than-expected cognitive change/PE in PD subjects [4,5,6,7,8,10]. Since these prior studies were confounded by long retest intervals and/or interventions in the interim, the current results might be the most rigorous examination of PE in PD. As seen in Alzheimer’s disease, smaller-than-expected PE might provide important insight into prognosis [1] and biomarker status [2] in PD. Interestingly, individuals with PD and those with Alzheimer’s disease both show smaller-than-expected PE on tests of learning and memory, but those with PD showed worse PE on speeded cognitive tests (e.g., SDMT, TMTA) [13].

Additional analyses validated the SRB method over the simple difference method, with the former identifying significantly more decline (or smaller-than-expected PE) on most neuropsychological scores in those who were impaired compared to those who were intact. For each group difference, the impaired group showed more decline/smaller-than-expected PE than the intact group. Despite this, cognitively intact participants showed poorer-than-expected change on HVLT-R and SDMT. As such, PE across brief retest intervals may prove useful for tracking progression of PD, regardless of the cognitive status of the individual. Prospective studies are needed to examine the prognostic value of PE in PD.

The SRB z-scores were also more strongly related to motor functioning and disease stage in this sample than the simple difference scores. Smaller-than-expected PE were associated with worse motor functioning and more severe disease stage. Therefore, PE appear related to other common markers of disease in PD. Future studies might examine how PE are related to other outcomes of interest in PD (e.g., dopamine transporter scan results, cerebrospinal fluid biomarkers, dopamine responsiveness).

Several limitations should be noted. First, the relatively small sample size increases the potential for underpowered analyses and over-estimation of effects. Second, the generalizability of these findings to diverse populations is limited. Third, while Gallagher et al. [11] found similar test performances between virtual and in-person visits, it is unclear if the results would be similar with consistent administration formats. Fourth, Gallagher et al. [11] utilized an alternate form for the HVLT-R, which may have reduced PE in the current analysis. Finally, although most participants were tested while on medication, this was not strictly enforced, which could have impacted the cognitive test results. Nonetheless, the daily dose of levodopa equivalence was the same at both testing sessions. Despite these limitations, our findings highlight the potential value of using SRB z-scores in tracking cognitive change over time in PD, extending the limited literature on short-term PE in PD, which has implications for clinical and research repeated cognitive assessments.


Highlights.

  • Practice effects on cognitive tests have been understudied in Parkinson’s disease

  • Quantified as simple difference scores, practice effects appeared absent

  • Using regression-based scores, smaller-than-expected practice effects were observed

  • Smaller practice effects related to poorer motor functioning and disease severity

Funding

We have no conflicts of interest to disclose. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.


Data Availability Statement

The datasets used and/or analyzed during the current study are available from the final author on reasonable request.

References

  • [1] Duff K, Lyketsos CG, Beglinger LJ, Chelune G, Moser DJ, Arndt S, Schultz SK, Paulsen JS, Petersen RC, & McCaffrey RJ (2011). Practice effects predict cognitive outcome in amnestic mild cognitive impairment. The American Journal of Geriatric Psychiatry, 19(11), 932–939. doi: 10.1097/JGP.0b013e318209dd3a
  • [2] Duff K, Hammers DB, Koppelmans V, King JB, & Hoffman JM (2024). Short-term practice effects on cognitive tests across the late life cognitive spectrum and how they compare to biomarkers of Alzheimer’s disease. Journal of Alzheimer’s Disease, 99(1), 321–332. doi: 10.3233/jad-231392
  • [3] Azuma T, Cruz RF, Bayles KA, Tomoeda CK, & Montgomery EB Jr (2003). A longitudinal study of neuropsychological change in individuals with Parkinson’s disease. International Journal of Geriatric Psychiatry, 18(11), 1043–1049. doi: 10.1002/gps.1015
  • [4] Tröster AI, Woods SP, & Morgan EE (2007). Assessing cognitive change in Parkinson’s disease: Development of practice effect-corrected reliable change indices. Archives of Clinical Neuropsychology, 22(6), 711–718. doi: 10.1016/j.acn.2007.05.004
  • [5] Higginson CI, Wheelock VL, Levine D, King DS, Pappas CT, & Sigvardt KA (2009). The clinical significance of neuropsychological changes following bilateral subthalamic nucleus deep brain stimulation for Parkinson’s disease. Journal of Clinical and Experimental Neuropsychology, 31(1), 65–72. doi: 10.1080/13803390801982734
  • [6] Rinehardt E, Duff K, Schoenberg M, Mattingly M, Bharucha K, & Scott J (2010). Cognitive change on the repeatable battery of neuropsychological status (RBANS) in Parkinson’s disease with and without bilateral subthalamic nucleus deep brain stimulation surgery. The Clinical Neuropsychologist, 24(8), 1339–1354. doi: 10.1080/13854046.2010.521770
  • [7] Rothlind JC, York MK, Carlson K, Luo P, Marks WJ, Weaver FM, Stern M, Follett M, & Reda D (2015). Neuropsychological changes following deep brain stimulation surgery for Parkinson’s disease: Comparisons of treatment at pallidal and subthalamic targets versus best medical therapy. Journal of Neurology, Neurosurgery & Psychiatry, 86(6), 622–629. doi: 10.1136/jnnp-2014-308119
  • [8] Schoenberg MR, Rinehardt E, Duff K, Mattingly M, Bharucha KJ, & Scott JG (2012). Assessing reliable change using the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) for patients with Parkinson’s disease undergoing deep brain stimulation (DBS) surgery. The Clinical Neuropsychologist, 26(2), 255–270. doi: 10.1080/13854046.2011.653587
  • [9] Witt K, Daniels C, Reiff J, Krack P, Volkmann J, Pinsker MO, Krause M, Tronnier V, Kloss M, Schnitzler A, Wojtecki L, Bötzel K, Danek A, Hilker R, Sturm V, Kupsch A, Karner E, & Deuschl G (2008). Neuropsychological and psychiatric changes after deep brain stimulation for Parkinson’s disease: A randomised, multicentre study. The Lancet Neurology, 7(7), 605–614. doi: 10.1016/S1474-4422(08)70114-5
  • [10] Turner TH, Renfroe JB, Elm J, Duppstadt-Delambo A, & Hinson VK (2016). Robustness of reliable change indices to variability in Parkinson’s disease with mild cognitive impairment. Applied Neuropsychology: Adult, 23(6), 399–402. doi: 10.1080/23279095.2016.1160907
  • [11] Gallagher J, Mamikonyan E, Xie SX, Tran B, Shaw S, & Weintraub D (2023). Validating virtual administration of neuropsychological testing in Parkinson disease: A pilot study. Scientific Reports, 13(1), 16243. doi: 10.1038/s41598-023-42934-0
  • [12] Hammers DB, Suhrie KR, Dixon A, Porter S, & Duff K (2021). Validation of one-week reliable change methods in cognitively intact community-dwelling older adults. Aging, Neuropsychology, and Cognition, 28(3), 472–492. doi: 10.1080/13825585.2020.1787942
  • [13] Duff K, Hammers DB, Koppelmans V, King JB, & Hoffman JM (2024). Short-term practice effects on cognitive tests across the late life cognitive spectrum and how they compare to biomarkers of Alzheimer’s disease. Journal of Alzheimer’s Disease, 99(1), 321–332. doi: 10.3233/JAD-231392
