Abstract
INTRODUCTION
Practice effects are an improvement in task performance with repeated testing. Their absence may indicate compromised learning and may help discriminate healthy from pathological ageing.
METHODS
We recorded semantic verbal fluency three times in n = 58 healthy older adults or patients with amnestic mild cognitive impairment (MCI) (72.16 ± 4.83 years old, 33 women). We extracted speech features and trained a machine learning classifier on them at each cognitive assessment. We examined which variables were informative for classification and whether they correlated with episodic memory performance.
RESULTS
We found smaller practice effects in patients with amnestic MCI. There was a 13% improvement in classification performance with features from the third cognitive assessment as compared to the first assessment. Practice effects correlated with episodic memory performance in healthy adults.
DISCUSSION
Speech features became more informative for classification when repeatedly assessed. They may be a promising tool for identifying individuals at risk of cognitive decline.
Highlights
In MCI, practice effects in verbal fluency tasks were smaller than in healthy adults.
Smaller practice effects in MCI indicated compromised learning.
Including practice effects improved the classification of MCI vs. healthy ageing.
In MCI, practice effects were independent of episodic memory performance.
Keywords: automated speech analysis, compromised learning, machine learning, practice effects, semantic verbal fluency
1. INTRODUCTION
Practice effects are an improvement in task performance due to increased familiarity with a task 1 . They can be viewed as a confounding factor or as valuable information, depending on the research question that is being asked. In longitudinal studies that track cognitive change over time, practice effects may hinder an accurate assessment of this change. When predicting cognitive impairment, however, absent or reduced practice effects may indicate compromised learning and therefore be helpful in discriminating healthy from pathological ageing. 2 , 3 , 4 , 5 Previous studies in patients with mild cognitive impairment (MCI) examined practice effects in semantic verbal fluency tasks. 2 , 6 , 7 They found that patients with MCI showed either no practice effects at all or smaller practice effects than healthy older adults, when assessed a second time after 1 week. 2 , 6 , 7 These studies focused on the number of retrieved words but did not consider other aspects of semantic verbal fluency, such as word frequency or the time it took to produce words. Hence, little is known about which specific features of semantic verbal fluency tasks, besides the classic word count, show practice effects and which do not. This may be important since the assessment of practice effects in semantic verbal fluency tasks could be a reliable, cost‐effective, and easy‐to‐administer measure of cognitive decline. In addition, if practice effects emerge even after short periods between assessments, they could provide an efficient way to track variations in cognitive performance. This would be valuable for remote assessments, where verbal fluency can be tested over the phone, but probably also in clinical settings where assessments are often repeated every 6–12 months.
We therefore repeatedly tested semantic verbal fluency in a sample of healthy elderly volunteers and patients with amnestic MCI. We used a machine learning classifier, trained on speech features extracted from each task at each cognitive assessment. We further examined which speech features were particularly helpful to discriminate between both groups. Finally, we correlated those features with performance in an episodic memory task to find out whether there would be an association between practice effects in semantic verbal fluency tasks and learning in general. We hypothesized that information extracted from repeated assessments of semantic fluency would differentiate healthy from pathological ageing with significantly higher accuracy than information extracted from a single assessment. In addition, we hypothesized that change scores extracted from verbal fluency would be significantly associated with performance in an episodic memory task.
RESEARCH IN CONTEXT
Systematic review: Research on practice effects as a potential indicator of cognitive decline has been identified. Practice effects have mostly been viewed as a source of error, but only recently has their potential for identifying cognitive impairment been recognized.
Interpretation: Including practice effects from repeated testing improved the classification of healthy vs. pathological ageing. In patients with MCI, practice effects were independent of episodic memory performance.
Future directions: There is a clear need for larger studies investigating practice effects, particularly in early, non‐clinical stages of dementia.
2. MATERIAL AND METHODS
2.1. Participants
We included n = 58 participants in the study (n = 29 patients with amnestic MCI and n = 29 healthy elderly volunteers; Table 1). All participants were first screened over the phone. They had to be fluent in German, with normal or corrected‐to‐normal vision and no history of psychiatric or neurological disorders. Further exclusion criteria were current use of psychotropic medication, current or life‐time drug abuse or addiction, brain damage, or sleep disorders. We evaluated depressive symptoms with the Geriatric Depression Scale (GDS) 8 and included those with a score ≤ 5. We recruited patients with amnestic MCI from the Centre for Geriatric Medicine and Gerontology at the University Medical Centre Freiburg in Germany. During the diagnostic process, they received MR imaging, laboratory diagnostics, and a functional assessment. To be diagnosed with amnestic MCI, they had to show impairment (1.5 standard deviations below age‐, gender‐, and education‐adjusted norms) in the delayed recall of a previously learned list of words (i.e., single‐domain amnestic MCI, n = 9). In addition, they may have shown impairment in other cognitive domains (multi‐domain amnestic MCI, n = 20). Additionally, they needed to (a) report memory complaints, (b) show no impairment in activities of daily living, and (c) show no dementia according to established criteria. 9 They also had to fulfil criteria for a diagnosis of MCI due to Alzheimer's disease (AD) with intermediate certainty according to revised criteria. 10 That is, they needed to show signs of neuronal injury (i.e., hippocampal volume or medial temporal atrophy by volumetric measures or visual rating). Healthy controls were recruited via newspaper advertisements and flyers. They were included when no signs of cognitive impairment were found (i.e., the Montreal Cognitive Assessment [MoCA] score was ≥ 23, as recommended by Carson and colleagues). 11 , 12 All participants gave written informed consent.
The Ethics Committee of Freiburg University approved the study. The study conformed to the Declaration of Helsinki.
TABLE 1.
Sociodemographic characteristics of the sample (mean ± standard deviations).
| Parameter | Healthy volunteers | Patients with amnestic MCI | p‐Value |
|---|---|---|---|
| N | 29 | 29 | |
| Sex (female/male) | 19/10 | 14/15 | 0.29 (χ²) |
| Age (years) | 71.10 ± 4.74 | 73.21 ± 4.77 | 0.10 |
| Education (years) | 14.66 ± 3.36 | 13.34 ± 3.31 | 0.14 |
| MoCA | 26.83 ± 1.91 | 22.07 ± 3.28 | < 0.001 |
| Immediate recall | 50.83 ± 9.59 | 32.48 ± 8.55 | < 0.001 |
| Delayed free recall, T0 | 10.48 ± 3.62 | 3.21 ± 3.07 | < 0.001 |
| Delayed free recall, T1 | 9.21 ± 3.95 | 2.31 ± 2.98 | < 0.001 |
| Delayed free recall, T2 | 7.97 ± 4.50 | 1.86 ± 2.64 | < 0.001 |
Note: We used t‐tests for group comparisons if not stated otherwise. Immediate recall = sum of all retrieved words across five trials of immediate retrieval in an episodic memory task.
Abbreviations: N, number; MCI, mild cognitive impairment; MoCA, Montreal Cognitive Assessment.
2.2. Study procedure
We collected data three times on 2 consecutive days. On day 1, the participants completed the MoCA and the GDS and then we examined semantic verbal fluency and verbal episodic memory (Verbal Learning and Memory Test [VLMT]) 13 performance. After a pause of approximately 1 h, we tested semantic verbal fluency and verbal episodic memory performance a second time. On day 2, we tested semantic verbal fluency and memory performance a third time and applied other cognitive tests.
2.3. Semantic verbal fluency task
We asked participants to name as many different four‐legged animals as possible within 60 seconds and to avoid repetitions. We collected speech recordings of all participants with a microphone on a computer. Subsequently, we trained students from the field of computational linguistics to transcribe these recordings in PRAAT, 14 a package for speech analysis in phonetics that additionally allows users to manually annotate a recording and time‐align the annotation. As described in the Pitt corpus and the CHAT protocol, 15 every utterance of each participant was transcribed (including thinking‐aloud patterns and unintentional verbalizations of cognitive updating processes, e.g., “what else is there” or “cat, um, cat, cat, cat, what else, dog” or “did I say that already”). To avoid overestimating those repetitions, we discarded consecutive repetitions but not repetitions in general.
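The repetition rule described above (drop consecutive repetitions, keep later ones) can be illustrated with a minimal sketch. The function names are hypothetical, and the handling of fillers such as “um” between two identical words is not described in the text, so the example assumes a transcript that has already been reduced to animal words.

```python
from itertools import groupby


def collapse_consecutive_repetitions(words):
    """Keep one copy of each run of identical adjacent words."""
    normalized = [w.strip().lower() for w in words]
    return [word for word, _run in groupby(normalized)]


def count_repetitions(words):
    """Count repeats that remain after consecutive runs were collapsed."""
    seen = set()
    repeats = 0
    for word in collapse_consecutive_repetitions(words):
        if word in seen:
            repeats += 1
        seen.add(word)
    return repeats


# Illustrative transcript (made up): "cat" is said three times in a row,
# "dog" is repeated later with another word in between.
utterance = ["cat", "cat", "cat", "dog", "horse", "dog"]
collapsed = collapse_consecutive_repetitions(utterance)
n_repeats = count_repetitions(utterance)
```

In this toy transcript, the run of “cat” collapses to a single entry, whereas the second “dog” still counts as one repetition.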
2.4. Feature extraction
From the transcripts of semantic verbal fluency, we extracted features based on our previous studies. 16 , 17 , 18 , 19 These features included word count, mean transition length between words, word frequency (i.e., the frequency with which words were uttered), and number of repetitions. In addition, we included temporal features which are based on temporal clusters, that is, groups of words where the end of one word is close to the start of the next word. 17 , 20 These temporal features included number of switches between temporal clusters, number of clusters, mean length of clusters as well as mean length between clusters or switch length. Similar to Linz et al., 17 we split transcripts into six 10‐second slices. Hence, we used features for classification analyses that were either extracted from the entire transcript or from 10‐second slices.
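As an illustration of the temporal features, the following sketch groups time‐aligned words into temporal clusters and counts words per 10‐second slice. It is not the authors' extraction pipeline: the pause threshold `max_gap`, the function names, and the choice to measure cluster length in words are assumptions made for this example only.

```python
from statistics import mean


def temporal_clusters(words, max_gap=1.0):
    """Group (label, start, end) tuples into clusters. A new cluster starts
    when the pause since the previous word exceeds max_gap seconds
    (max_gap is a hypothetical threshold, not taken from the paper)."""
    clusters = []
    for word in words:
        if clusters and word[1] - clusters[-1][-1][2] <= max_gap:
            clusters[-1].append(word)
        else:
            clusters.append([word])
    return clusters


def cluster_features(words, max_gap=1.0):
    clusters = temporal_clusters(words, max_gap)
    # Pause between the last word of one cluster and the first word of the next.
    switch_gaps = [nxt[0][1] - prev[-1][2]
                   for prev, nxt in zip(clusters, clusters[1:])]
    return {
        "cluster_count": len(clusters),
        "switch_count": len(switch_gaps),
        "mean_cluster_size": mean(len(c) for c in clusters) if clusters else 0.0,
        "mean_switch_length": mean(switch_gaps) if switch_gaps else 0.0,
    }


def words_per_slice(words, slice_len=10.0, total=60.0):
    """Word count in each slice, assigning a word by its start time."""
    counts = [0] * int(total // slice_len)
    for _label, start, _end in words:
        index = int(start // slice_len)
        if 0 <= index < len(counts):
            counts[index] += 1
    return counts


# Made-up annotation: (word, start in s, end in s).
words = [("dog", 0.5, 1.0), ("cat", 1.3, 1.8), ("horse", 4.0, 4.6),
         ("cow", 5.0, 5.4), ("sheep", 12.0, 12.7)]
features = cluster_features(words)
slices = words_per_slice(words)
```

With a 1‐second threshold, the five words fall into three clusters separated by two switches, and four of the five words land in the first 10‐second slice.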
2.5. VLMT
The VLMT examines verbal immediate and long‐term memory. Participants were asked to listen to a list of 15 words and to verbally recall as many words as possible thereafter. There were five trials in which the words were presented and immediately retrieved (to assess a rate of learned words, a so‐called immediate recall). After a retention interval (∼ 20 min), a delayed free recall followed without rehearsal. For the delayed free recall at the second (i.e., T1) and third (i.e., T2) assessment, we again asked the participants to recall as many words as possible without rehearsal.
2.6. Statistical analysis
We were primarily interested in differentiating healthy elderly volunteers from patients with amnestic MCI based on speech features. We therefore used a neural network‐based method for a binary classification. Given the small data set, we applied a feedforward architecture with two layers, each with a dropout of 0.5 to avoid overfitting. 21 All analyses were implemented with scikit‐learn 22 and TensorFlow. 23 For the feedforward model, we used the Adam optimizer 24 with a default learning rate and a binary cross‐entropy loss function. To find out how important any individual feature was for classification, we used a permutation importance algorithm, where the importance of each feature was estimated by shuffling the feature vector and rerunning the classifier with this randomized vector. Permutation importance was run separately for each feature. We extracted feature vectors for each of the three assessments (T0, T1, T2) and calculated the classification performance with each vector. For each timepoint, this resulted in 58 vectors (i.e., one for each participant). We used nested cross‐validation to tune hyperparameters in an inner cross‐validation loop with three splits using a 67% training set and a 33% test set. Hyperparameters that performed best in the inner cross‐validation were then used in an outer cross‐validation loop with five splits and an 80% training set and a 20% test set. We report performance metrics that were computed using five repetitions of cross‐validation. Permutation importance was computed using 50 random permutations per cross‐validation loop (i.e., 250 permutations in total). We report the area under the receiver operating characteristic curve (AUC) as well as negative predictive value (NPV), positive predictive value (PPV), sensitivity, and specificity. We then compared the AUC at T2 with values obtained at T0 or T1 using a permutation test, since we assumed an increased level of learning (or stronger practice effects) at T2.
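The permutation importance idea can be sketched in a few lines. This is a generic illustration under stated assumptions, not the study's TensorFlow model: `toy_accuracy` is a hypothetical stand‐in for a trained classifier's scoring function, and the data are made up.

```python
import random


def permutation_importance(score_fn, rows, labels, n_repeats=50, seed=0):
    """Mean drop in score when one feature column is shuffled across rows."""
    rng = random.Random(seed)
    baseline = score_fn(rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in rows]
            rng.shuffle(shuffled)
            # Rebuild every row with the shuffled value in position `col`.
            permuted = [row[:col] + [value] + row[col + 1:]
                        for row, value in zip(rows, shuffled)]
            drops.append(baseline - score_fn(permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_accuracy(rows, labels):
    """Hypothetical fixed classifier: predicts class 1 when feature 0 > 15."""
    predictions = [1 if row[0] > 15 else 0 for row in rows]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Made-up data: feature 0 separates the classes, feature 1 is ignored.
rows = [[10, 3], [12, 7], [11, 5], [20, 4], [22, 6], [19, 5]]
labels = [0, 0, 0, 1, 1, 1]
importances = permutation_importance(toy_accuracy, rows, labels)
```

A feature the classifier relies on loses accuracy when shuffled and therefore receives a positive importance, whereas a feature the classifier ignores receives an importance of zero.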
We used a permutation test with 10,000 random permutations to assess the probability of obtaining the observed results by chance. 25 In each permutation, we randomly shuffled the class labels (0 or 1) in the prediction vectors for each classifier. The AUC was then computed for each permuted dataset. Permutation p‐values were calculated as the proportion of permuted AUC values higher than (or equal to) the observed AUC. Next, we selected features with positive permutation importance (i.e., permutation importance > 0), which were those most effective for distinguishing between the two groups, to assess potential differences in practice effects. To do this, we used analysis of variance (ANOVA) with “group” (healthy volunteers or patients with amnestic MCI) as between‐subject factor and “time‐point” (T0, T1, T2) as within‐subject factor. We were particularly interested in whether we would find an interaction between “group” and “time‐point” as this would indicate a difference in practice effects between groups. Next, we calculated Cohen's d effect sizes to examine how strongly the groups differed in those features. We tested how participants within a group changed over time and how any difference between the two groups changed over time. Finally, we correlated the change in relevant speech features with immediate or delayed recall performance in an episodic memory task to examine whether practice effects would be related to learning in general. We used the slope of a linear regression fitted to the values of each feature for each participant at timepoints T0, T1, and T2. To account for baseline differences in task performance, we divided the slope by the intercept. With this, we obtained a single variable representing a baseline‐corrected learning rate (i.e., change in speech features). We then calculated Pearson correlations between immediate or delayed recall of the episodic memory task and the change in speech features (slopes or corrected slopes, that is, slopes divided by the intercept).
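The label‐shuffling permutation test for the AUC can be sketched as follows. The helper names and the toy predictions are assumptions for illustration; the AUC is computed here directly as the share of correctly ranked patient/control pairs rather than with a library routine.

```python
import random


def auc(labels, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_p_value(labels, scores, n_permutations=10_000, seed=0):
    """Proportion of label shuffles whose AUC is >= the observed AUC."""
    rng = random.Random(seed)
    observed = auc(labels, scores)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if auc(shuffled, scores) >= observed:
            hits += 1
    return observed, hits / n_permutations


# Made-up predictions for eight participants (1 = patient, 0 = control).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.3, 0.2, 0.6, 0.5, 0.8, 0.7, 0.9]
observed, p_value = permutation_p_value(labels, scores, n_permutations=2000)
```

Because the labels are shuffled while the predicted scores stay fixed, the resulting distribution approximates the AUC values expected if predictions were unrelated to diagnosis.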
3. RESULTS
The two groups were similar regarding age, sex, and education (Table 1). Healthy elderly volunteers had significantly higher MoCA scores, and they learned and retrieved significantly more words in an episodic memory task than patients with amnestic MCI (all p < 0.001; Table 1).
3.1. Classification performance increased with data from repeated assessment
When using speech features from the first or second verbal fluency assessment, performance of the classifier in terms of AUC was 0.57 or 0.60, respectively. When using features from the last verbal fluency assessment (i.e., T2), performance increased markedly to an AUC of 0.73 (Figure 1). Likewise, sensitivity, specificity as well as positive and negative predictive values increased at T2 (Table S1). These results indicate that, with features from the third assessment of semantic verbal fluency, there was a 73% chance that the model would assign a higher probability of amnestic MCI to a randomly chosen patient than to a randomly chosen healthy volunteer (see Figure S1 for the permutation importance of each feature).
FIGURE 1.

Receiver operating characteristic (ROC) curve for all models, calculated with data from each assessment separately (i.e., T0, T1, T2). The curves were aggregated over five runs from a five‐fold cross‐validation.
3.2. Patients with amnestic MCI show smaller practice effects than healthy older adults in four different speech features
We next compared speech features that were particularly helpful for distinguishing healthy elderly volunteers from patients with amnestic MCI (Table 2, Figure 2). We found a significant interaction between group and time‐point for word count (F (2, 112) = 4.14, p = 0.018), cluster count (F (2, 112) = 3.51, p = 0.033), number of temporal switches (F (2, 112) = 3.59, p = 0.031), and mean transition length (F (2, 112) = 3.61, p = 0.030). This indicates that with repeated assessment, healthy volunteers and patients with amnestic MCI changed differently in those four features. Next, we calculated differences in Cohen's d values to examine the extent of those changes within and between the two groups. In healthy volunteers, effect sizes increased from small to large with repeated assessment, while they remained small in patients with amnestic MCI (Table 3). This indicates compromised learning or reduced practice effects specifically among patients with amnestic MCI. This finding is further supported by the increasing effect sizes in the differences between groups, indicating that with repeated assessment, the difference between both groups became even more pronounced (Table 3).
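For readers who want to reproduce an effect size of this kind, the sketch below computes Cohen's d for two independent groups with the pooled standard deviation. The text does not state which variant of Cohen's d was used (for example, for the within‐group comparisons), so the pooled formula and the word counts are assumptions for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Made-up word counts at one assessment.
healthy = [22, 25, 19, 24, 20]
mci = [15, 18, 14, 17, 16]
d = cohens_d(healthy, mci)
```

By the common convention, values around 0.2 are read as small, 0.5 as medium, and 0.8 or more as large effects.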
TABLE 2.
Features that were important for the classification of healthy ageing or amnestic MCI.
| Feature | PI | Group: F‐value | Group: p‐value | Time‐point: F‐value | Time‐point: p‐value | Interaction: F‐value | Interaction: p‐value |
|---|---|---|---|---|---|---|---|
| **Cluster count** | 0.013 | 5.316 | 0.025 | 4.902 | 0.009 | 3.51 | 0.033 |
| Word frequency | 0.019 | 21.493 | < 0.001 | 8.403 | < 0.001 | 1.745 | 0.179 |
| **Word count** | 0.070 | 23.802 | < 0.001 | 5.101 | 0.008 | 4.137 | 0.018 |
| Repetition count | 0.011 | 2.288 | 0.136 | 1.988 | 0.142 | 0.578 | 0.661 |
| **Transition length** | 0.012 | 15.623 | < 0.001 | 1.405 | 0.250 | 3.61 | 0.030 |
| Frequency transition length | 0.010 | 5.013 | 0.029 | 0.005 | 0.712 | 0.341 | 0.712 |
| Word frequency range | 0.009 | 14.148 | < 0.001 | 5.025 | 0.008 | 1.748 | 0.179 |
| **No. of temporal switches** | 0.002 | 5.784 | 0.020 | 6.827 | 0.002 | 3.598 | 0.031 |
Note: We compared those features using an analysis of variance with a between‐subject factor “group” and a within‐subject factor “time‐point” or their interaction. Features with a significant interaction between “group” and “time‐point” are shown in bold.
Abbreviation: PI, permutation importance.
FIGURE 2.

Speech features that changed significantly differently in healthy elderly volunteers (blue) and patients with amnestic mild cognitive impairment (MCI, red) when semantic verbal fluency was tested three times (T0, T1, T2).
TABLE 3.
Cohen's d effect sizes for the change in semantic verbal fluency features within healthy elderly volunteers or patients with MCI (left, middle) or between healthy elderly volunteers and patients with MCI (right) when tested three times (T0, T1, T2).
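| Feature | Healthy volunteers: T1‐T0 | Healthy volunteers: T2‐T1 | Healthy volunteers: T2‐T0 | Patients with amnestic MCI: T1‐T0 | Patients with amnestic MCI: T2‐T1 | Patients with amnestic MCI: T2‐T0 | Between both groups: T0 | Between both groups: T1 | Between both groups: T2 |
|---|---|---|---|---|---|---|---|---|---|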
| Word count | 0.245 | 0.455 | 0.618 | 0.067 | 0.0 | 0.060 | 0.684 | 1.121 | 1.334 |
| Cluster count | 0.614 | 0.257 | 0.780 | 0.059 | 0.028 | 0.081 | −0.027 | 0.582 | 0.694 |
| Transition length | −0.297 | −0.432 | −0.627 | −0.164 | 0.311 | 0.161 | −0.27 | −0.7 | −1.09 |
| No. of temporal switches | 0.659 | 0.271 | 0.874 | 0.059 | 0.141 | 0.189 | −0.027 | 0.630 | 0.698 |
Abbreviation: MCI, mild cognitive impairment.
3.3. Practice effects in a semantic verbal fluency task correlated with episodic memory performance, but only in healthy volunteers
Finally, we correlated the change in those four features (from T0 to T2) with immediate or delayed recall performance (at T0, T1, or T2) in an episodic memory task. We found significant positive correlations between cluster count slope and immediate (r = 0.27, p = 0.042) or delayed recall (T0, r = 0.32, p = 0.042; T1, r = 0.35, p = 0.008; T2, r = 0.35, p = 0.009) as well as number of switches slope (immediate recall: r = 0.28, p = 0.033; delayed recall T0, r = 0.33, p = 0.013; T1, r = 0.34, p = 0.009; T2, r = 0.34, p = 0.008). We found similar results when using the corrected slope of cluster count (delayed recall: T0, r = 0.27, p = 0.042; T1, r = 0.31, p = 0.018; T2, r = 0.31, p = 0.017) and number of switches (delayed recall: T2, r = 0.26, p = 0.049). This indicates that the better the participants were at retrieving a list of words, the more they improved in semantic verbal fluency over time. When we analysed the two groups separately, however, the correlations did not reach significance in either group. Nevertheless, there was a noticeable trend in healthy volunteers for all three delayed recalls (see Figure 3 for an example).
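The slope, the corrected slope (slope divided by intercept), and their Pearson correlation with recall can be sketched as follows. Coding the three assessments as 0, 1, and 2 (so that the intercept is the fitted value at T0) is an assumption, as are the function names and the data; a participant with a fitted intercept of zero would need separate handling.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def corrected_slope(values):
    """Slope across T0, T1, T2 divided by the fitted intercept."""
    slope, intercept = fit_line([0, 1, 2], values)
    return slope / intercept


def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up cluster counts at T0, T1, T2 for four participants,
# and their made-up delayed recall scores.
cluster_counts = [[8, 10, 12], [9, 9, 10], [7, 8, 8], [10, 13, 15]]
delayed_recall = [11, 6, 4, 13]
slopes = [corrected_slope(v) for v in cluster_counts]
r = pearson_r(slopes, delayed_recall)
```

For the first participant, the fitted slope is 2 clusters per assessment and the intercept is 8, giving a corrected slope of 0.25, that is, a gain of 25% of the baseline value per assessment.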
FIGURE 3.

Correlation between cluster count slope and delayed recall in an episodic memory task at T2 in healthy elderly volunteers (blue) and patients with amnestic mild cognitive impairment (MCI, red). Note that points with similar values may overlap.
4. DISCUSSION
The aim of this study was to determine whether practice effects, observed through repeated cognitive assessments, differed between healthy elderly volunteers and patients with amnestic MCI. In addition, we examined whether repeatedly calculating models based on task performance at three consecutive cognitive assessments can be used to classify healthy individuals and patients with amnestic MCI. We assessed and recorded semantic verbal fluency three times, extracted speech features, and used them for classification purposes. We found an AUC of up to 0.60 for the first and second assessments. This AUC value increased to 0.73 when we used speech features from the third and final assessment. The performance of our classifier is in line with previous studies. So far, data from neuropsychological tests or structural/functional MRI data (sometimes in conjunction with positron emission tomography [PET] data) were used to differentiate healthy elderly volunteers from patients with MCI. 26 , 27 , 28 Only a few studies have used semantic verbal fluency data to accomplish that, and none of these studies has tried to use practice effects. 16 , 29 Some studies tested semantic verbal fluency once and used these data for classification purposes. 16 , 30 , 31 They achieved AUCs of 0.71–0.76, similar to our study. These studies have used different methodological approaches using a single cognitive assessment. In contrast, we used nested cross‐validation for three repeated assessments. Hence, their results may not be directly comparable to our study. In our study, the model including repeated assessments better classified healthy individuals and patients with amnestic MCI. In comparison with MRI and/or PET data, the performance of our classifier was worse, since those studies achieved classification accuracies between 77% and 95%. 26 , 32
However, semantic verbal fluency tasks are easy to administer, cost‐effective, and suitable for settings where no MRI or PET scanner is available or where remote assessments are necessary. When combined with automated speech analysis, there may even be no requirement for qualified personnel, as data can be recorded, analyzed, and interpreted automatically. Using an MRI leads to slightly higher accuracy but demands more personnel, is considerably more expensive, and scanning (as well as data processing) takes longer. Therefore, one needs to consider whether cheap and user‐friendly tests that take only 3 minutes are preferred to an MRI or PET examination. A disadvantage may be that individuals have to come back a second day, which is not the case for an MRI or PET examination. Examining semantic verbal fluency over the phone may obviate the need for an on‐site visit altogether and would hence save both time and money.
We found that eight speech features were informative for classification, but only four of them showed a significant interaction between group and time‐point. Among those, we identified a classic quantitative feature (word count) and three additional quantitative features (temporal switches, transition length, and cluster count). Consistent with previous findings, 6 , 7 only healthy elderly volunteers demonstrated improvements in their quantitative performance, characterized by an increase in the number of animals produced over time. Our study builds on these previous findings as we observed similar patterns in speech features other than the classical word count. Here, again, only healthy elderly volunteers improved with repeated assessment as they increased their cluster size, reduced the transition length between words, and switched more between clusters over time. This means that practice effects are observable in several different speech features, at least in healthy volunteers. In contrast, patients with amnestic MCI demonstrated compromised learning abilities, pointing to smaller practice effects. They did not produce more animals, and they did not learn to increase cluster size or to reduce the transition length between clusters. In addition, effect sizes for patients with amnestic MCI remained small when comparing the three assessments to each other. Hence, there were only minimal differences in performance among the three assessments. In contrast, in healthy elderly volunteers, the effect sizes increased from small to large, indicating substantial differences in performance between the three assessments. We also found that the differences between both groups became increasingly prominent over time, as indicated by increased effect sizes between groups. These results indicate that healthy elderly volunteers improved in semantic verbal fluency with practice, both in terms of quality and quantity, while patients with MCI did not.
Finally, we found positive correlations between different speech features and task performance in a verbal episodic memory test. These correlations indicate that more pronounced practice effects in a semantic verbal fluency task were associated with better immediate and long‐term memory. The correlations were, however, small, and only evident in healthy volunteers. This was somewhat unexpected since we assumed that in patients with amnestic MCI, the level of episodic memory impairment should be associated with an overall impaired level of learning. This was not the case, indicating that an impairment in episodic memory does not predict the absence of practice effects. In addition, it suggests that separate mechanisms govern the two types of learning. However, the low variance in the data of patients with amnestic MCI made it more difficult to find factors explaining that variance, hence, hindering the detection of associations within the data.
At the neuroanatomical level, practice effects have been investigated rarely and mostly in younger adults. The results of these studies indicate that practice turned controlled processing to automated processing, irrespective of the type of cognitive task. With practice, there was also less deactivation in the default‐mode‐network, 33 , 34 that is, a set of brain regions whose activity is suppressed during goal‐directed cognition in younger adults. 35 In contrast, episodic memory is closely linked to the medial temporal lobe. Future studies may explore whether practice effects in episodic memory differ from practice effects in semantic verbal fluency in older adults. In addition, it would be important to test the neuroanatomical correlates of absent practice effects in patients with amnestic MCI.
There may be several reasons why four speech features were informative for classification, but we did not find a significant interaction between group and time‐point in them. First, our neural network models were not directly based on cognitive change over time but instead classified groups based on speech features at different time points. This constitutes an indirect measure of practice effects indicating that speech features became more informative when repeatedly assessed. Next, neural networks are complex and do not rely on assumptions about the relationship between variables. Hence, they consider associations between variables that extend beyond linear relationships (e.g., quadratic, cubic) as well as interactions. On the other hand, neural networks do not provide insights into the relationship between individual variables or whether practice effects occurred over time. This finer‐grained analysis was instead revealed by using analyses of variance (ANOVAs). There may as well be aspects of speech in the task that distinguish between diagnostic groups—aspects that are captured by the classifier—but that are not necessarily practice effects.
Taken together, our study demonstrates that practice effects in verbal fluency differed between healthy and pathological ageing. That difference helped discriminate between both groups. Semantic verbal fluency tasks, when administered repeatedly and analysed with automated speech analysis, may therefore be a promising tool for identifying individuals at risk of cognitive decline or those who should undergo further cognitive assessment.
5. LIMITATIONS
Our study may have several limitations. First, we assessed semantic verbal fluency in a single language, German. We thus do not know whether our models perform similarly in verbal fluency data using other languages. For instance, semantic relatedness may differ between languages depending on the word category used. This needs to be the subject of future studies. Second, our sample was rather small. Hence, our findings may not be transferable to the general population. Third, the diagnosis of amnestic MCI did not include biomarkers (e.g., CSF or PET data) but signs of neuronal injury. Hence, there was intermediate certainty that MCI was due to Alzheimer's disease pathology, according to established criteria. 10 Finally, we used a repeated‐measures design with a short time‐interval between assessments. It would be interesting to follow participants over a longer period to assess their conversion rate in relation to practice effects in the task.
6. CONCLUSION
The results of our study indicate that a repeated assessment of semantic verbal fluency is beneficial for differentiating healthy elderly volunteers from patients with amnestic MCI. Given that only healthy elderly volunteers improved their semantic verbal fluency performance, practice effects appear to be a valuable measure derived from cognitive assessments. Semantic verbal fluency tasks are straightforward to administer, making them a practical tool for identifying individuals in preclinical or early stages of dementia who may warrant further cognitive evaluation or dementia diagnostics.
CONFLICT OF INTEREST STATEMENT
The authors have no conflicts of interest. Author disclosures are available in the Supporting Information.
CONSENT STATEMENT
All participants provided written informed consent.
Supporting information
ACKNOWLEDGMENTS
The authors have nothing to report. J. P. received funding from the Swiss National Science Foundation (Grant number 218252).
Dörr F, Grandjean L, Tröger J, Peter J. The classification of mild cognitive impairment or healthy ageing improves when including practice effects derived from a semantic verbal fluency task. Alzheimer's Dement. 2025;17:e70127. 10.1002/dad2.70127
REFERENCES
- 1. Hausknecht JP, Halpert JA. Retesting in selection: a meta‐analysis of coaching and practice effects for tests of cognitive ability. J Appl Psychol. 2007;92:373‐385. doi: 10.1037/0021-9010.92.2.373 [DOI] [PubMed] [Google Scholar]
- 2. Duff K, Beglinger LJ, Van Der Heiden S, et al. Short‐term practice effects in amnestic mild cognitive impairment: implications for diagnosis and treatment. IPG. 2008;20:986‐99. doi: 10.1017/S1041610208007254 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3. Hassenstab J, Ruvolo D, Jasielec M, Xiong C, Grant E, Morris JC. Absence of practice effects in preclinical Alzheimer's disease. Neuropsychology. 2015;29:940‐948. doi: 10.1037/neu0000208
- 4. Jutten RJ, Grandoit E, Foldi NS, et al. Lower practice effects as a marker of cognitive performance and dementia risk: a literature review. Alzheimers Dement. 2020;12:e12055. doi: 10.1002/dad2.12055
- 5. Sánchez‐Benavides G, Gispert JD, Fauria K, Molinuevo JL, Gramunt N. Modeling practice effects in healthy middle‐aged participants of the Alzheimer and families parent cohort. Alzheimers Dement. 2016;4:149‐158. doi: 10.1016/j.dadm.2016.07.001
- 6. Cooper DB, Epker M, Lacritz L, et al. Effects of Practice on Category Fluency in Alzheimer's Disease*. Clin Neuropsychol. 2001;15:125‐128. doi: 10.1076/clin.15.1.125.1914
- 7. Cooper DB, Lacritz LH, Weiner MF, Rosenberg RN, Cullum CM. Category fluency in mild cognitive impairment: reduced effect of practice in test‐retest conditions. Alzheimer Dis Assoc Disord. 2004;18:120‐122. doi: 10.1097/01.wad.0000127442.15689.92
- 8. Yesavage JA, Sheikh JI. Geriatric Depression Scale (GDS): recent evidence and development of a shorter version. Clin Gerontol. 1986;5:165‐173. doi: 10.1300/J018v05n01_09
- 9. Petersen RC. Mild cognitive impairment as a diagnostic entity. J Intern Med. 2004;256:183‐194. doi: 10.1111/j.1365-2796.2004.01388.x
- 10. Petersen RC, Caracciolo B, Brayne C, Gauthier S, Jelic V, Fratiglioni L. Mild cognitive impairment: a concept in evolution. J Intern Med. 2014;275:214‐228. doi: 10.1111/joim.12190
- 11. Carson N, Leach L, Murphy KJ. A re‐examination of Montreal Cognitive Assessment (MoCA) cutoff scores. Int J Geriatr Psychiatry. 2018;33:379‐388. doi: 10.1002/gps.4756
- 12. Nasreddine ZS, Phillips NA, Bédirian V, et al. The Montreal Cognitive Assessment, MoCA: a brief screening tool for mild cognitive impairment. J Am Geriatr Soc. 2005;53:695‐699. doi: 10.1111/j.1532-5415.2005.53221.x
- 13. Helmstaedter C, Lendt M, Lux S. Verbaler Lern‐ und Merkfähigkeitstest. Göttingen: Hogrefe; 2001.
- 14. Boersma P, Weenink D. PRAAT, a system for doing phonetics by computer. Glot International. 2001;5:341‐345.
- 15. MacWhinney B. The CHILDES project: Tools for analyzing talk: Transcription format and programs. 3rd ed. Lawrence Erlbaum Associates Publishers; 2000.
- 16. König A, Linz N, Tröger J, Wolters M, Alexandersson J, Robert P. Fully automatic speech‐based analysis of the semantic verbal fluency task. Dement Geriatr Cogn Disord. 2018;45:198‐209. doi: 10.1159/000487852
- 17. Linz N, Lundholm Fors K, Lindsay H, Eckerström M, Alexandersson J, Kokkinakis D. Temporal analysis of the semantic verbal fluency task in persons with subjective and mild cognitive impairment. Proceedings of the Sixth Workshop on Computational Linguistics and Clinical Psychology, Minneapolis, Minnesota: Association for Computational Linguistics; 2019, p. 103‐113. doi: 10.18653/v1/W19-3012
- 18. Linz N, Tröger J, Alexandersson J, König A. Using neural word embeddings in the analysis of the clinical semantic verbal fluency task. IWCS 2017 ‐ 12th International Conference on Computational Semantics, Sep 2017, Montpellier, France. pp. 1‐7.
- 19. Tröger J, Linz N, König A, et al. Exploitation vs. exploration‐computational temporal and semantic analysis explains semantic verbal fluency impairment in Alzheimer's disease. Neuropsychologia. 2019;131:53‐61. doi: 10.1016/j.neuropsychologia.2019.05.007
- 20. Dörr F, Schäfer S, Öhman F, et al. Dissociating memory and executive function impairment through temporal features in a word list verbal learning task. Neuropsychologia. 2023;189:108679. doi: 10.1016/j.neuropsychologia.2023.108679
- 21. Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res. 2014;15:1929‐1958.
- 22. Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit‐learn: machine learning in Python. J Mach Learn Res. 2011;12:2825‐2830.
- 23. Abadi M, Agarwal A, Barham P, et al. TensorFlow: Large‐Scale Machine Learning on Heterogeneous Distributed Systems. 2016. doi: 10.48550/ARXIV.1603.04467
- 24. Kingma DP, Ba J. Adam: A Method for Stochastic Optimization. 3rd International Conference on Learning Representations (ICLR), San Diego, CA, USA; 2015.
- 25. Edgington ES. Randomization Tests. J Psychol. 1964;57:445‐449. doi: 10.1080/00223980.1964.9916711
- 26. Ortiz A, Munilla J, Álvarez‐Illán I, Górriz JM, Ramírez J. Exploratory graphical models of functional and structural connectivity patterns for Alzheimer's Disease diagnosis. Front Comput Neurosci. 2015;9:132. doi: 10.3389/fncom.2015.00132
- 27. Suk H‐I, Lee S‐W, Shen D. Hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis. Neuroimage. 2014;101:569‐582. doi: 10.1016/j.neuroimage.2014.06.077
- 28. Ibrahim B, Suppiah S, Ibrahim N, et al. Diagnostic power of resting‐state fMRI for detection of network connectivity in Alzheimer's disease and mild cognitive impairment: a systematic review. Hum Brain Mapp. 2021;42:2941‐2968. doi: 10.1002/hbm.25369
- 29. Paula F, Wilkens R, Idiart M, Villavicencio A. Similarity measures for the detection of clinical conditions with verbal fluency tasks. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, New Orleans, Louisiana, 2. Association for Computational Linguistics; 2018, p. 231‐235. doi: 10.18653/v1/N18-2037
- 30. Roark B, Mitchell M, Hosom J‐P, Hollingshead K, Kaye J. Spoken language derived measures for detecting mild cognitive impairment. IEEE Trans Audio Speech Lang Process. 2011;19:2081‐2090. doi: 10.1109/TASL.2011.2112351
- 31. Toth L, Hoffmann I, Gosztolya G, et al. A speech recognition‐based solution for the automatic detection of mild cognitive impairment from spontaneous speech. Curr Alzheimer Res. 2018;15:130‐138. doi: 10.2174/1567205014666171121114930
- 32. Fraser KC, Meltzer JA, Rudzicz F. Linguistic features identify Alzheimer's disease in narrative speech. J Alzheimers Dis. 2015;49:407‐422. doi: 10.3233/JAD-150520
- 33. Jolles DD, Grol MJ, Van Buchem MA, Rombouts SARB, Crone EA. Practice effects in the brain: changes in cerebral activation after working memory practice depend on task demands. Neuroimage. 2010;52:658‐668. doi: 10.1016/j.neuroimage.2010.04.028
- 34. Xia J, Zhang W, Jiang Y, Li Y, Chen Q. Neural practice effect during cross‐modal selective attention: supra‐modal and modality‐specific effects. Cortex. 2018;106:47‐64. doi: 10.1016/j.cortex.2018.05.003
- 35. Anticevic A, Cole MW, Murray JD, Corlett PR, Wang X‐J, Krystal JH. The role of default network deactivation in cognition and disease. Trends Cogn Sci. 2012;16:584‐592. doi: 10.1016/j.tics.2012.10.008
