Abstract
Testing, or retrieval practice, has become a central topic in memory research. One potentially important effect of retrieval practice has received little attention, however: Retrieval practice may enhance, or potentiate, subsequent learning. We introduce a paradigm that can measure the indirect, potentiating effect of free recall tests on subsequent learning, and then test a hypothesis for why tests have this potentiating effect. In two experiments, the benefit of a restudy trial was enhanced when prior free recall tests had been taken. Results from a third correlational study suggest that this effect may be mediated by the effect of testing on organization. Not only do encoding conditions impact later retrievability, but also retrieval attempts impact subsequent encoding effectiveness.
Testing is much more than a measure of memory; taking a test, or retrieval practice, modifies memory (for a recent review, see Roediger & Butler, 2011). A well-studied demonstration of this phenomenon is seen in the testing effect, the finding that retrieval enhances subsequent retention (e.g., Roediger & Karpicke, 2006). However, retrieval may have another enhancing effect that has been largely overlooked: retrieval may enhance subsequent encoding, an effect known as test-potentiated learning (Arnold & McDermott, in press; Izawa, 1971).
The dearth of research on test-potentiated learning is especially notable for free recall tests, possibly in part because of a conclusion Tulving (1967) made decades ago. Tulving claimed that in multitrial free recall paradigms tests and study trials have an equivalent effect on learning. He concluded that subsequent recall “depends primarily on the total amount of time spent on the task, and that it is relatively little affected, if at all, by the distribution of this time between studying and recalling the material” (p. 181). That is, practicing recalling items has the same effect on learning as does studying the material.
However, Tulving (1967) also observed that the mechanisms underlying the equivalent effects of study and test trials were not the same. In a condition with three successive tests between study trials, forgetting occurred after the first test, but this loss was counteracted by a large increase in recall after each study trial. That is, in conditions with consecutive test trials, more learning seemed to occur on subsequent study trials than in conditions without consecutive test trials. These results suggest the additional tests may have potentiated learning during study.
Later researchers demonstrated that the distribution of study and test trials does affect learning and that taking free recall tests between study trials enhances performance (Donaldson, 1971; Karpicke & Roediger, 2007; Lachman & Laughery, 1968; Roediger & Smith, 2012; Rosner, 1970). Although this research shows that free recall tests enhance performance, how these tests do so is still unknown. Is learning enhanced because tests directly improve retention of the retrieved items (i.e., the testing effect), as suggested by Donaldson (1971)? Or is performance enhanced because free recall tests potentiate subsequent learning, an indirect effect of testing, as suggested by Rosner (1970)?
By manipulating both the number of prior tests and whether or not the material is restudied, we can identify whether the benefit incurred from restudying is enhanced by a recall test preceding the restudy phase. That is, do prior recall tests boost, or potentiate, the enhancement seen from restudying the material? To answer this question, in Experiment 1 we varied the number of initial tests (0 or 3) and whether or not a restudy opportunity occurred following the initial tests. Taking initial tests boosted the amount of information acquired from the restudy phase (i.e., the difference in final recall between the groups who did and did not receive restudy was greater for participants who had taken initial tests). In Experiment 2, these findings were extended and replicated in an Internet sample to ensure replicability and generalizability of the initial findings. Finally, in Experiment 3 we explored the role enhanced organization may play in test-potentiated learning. Previous research has shown that testing enhances organization and that this enhancement partially underlies the testing effect (Zaromb & Roediger, 2010). We ask whether enhanced organization from testing may also underlie the test-potentiated learning effect.
Experiment 1
Do free recall tests potentiate learning during subsequent study? This question is addressed in an undergraduate population using a between-participants design.
Method
Participants
One hundred and seventy-three Washington University in St. Louis undergraduate students participated in exchange for class credit or $10. The policies of the University’s Human Studies Committee were followed for all experiments.
Design
A 2(initial tests: 0, 3) × 2(restudy, no restudy) between-participants factorial design was used (see Fig. 1). Each participant was randomly assigned to one of four between-participant conditions: initial tests and restudy (n=43), initial tests but no restudy (n=43), no initial tests but restudy (n=44), or no initial tests and no restudy (n=43).
Prior to this main task, all participants completed a separate recall task, which was included as a baseline measure of individual memory performance.
Materials
For the baseline memory task, three lists of 15 related words from Roediger and McDermott’s (1995) report served as stimuli. Each list contained words (e.g., nurse, sick, lawyer) related to one target item (e.g., doctor) that never appeared in the list. All words were unrelated to images used in the main experiment. The lists were studied in an order randomized across subjects. Within each list of related words, items were presented in a random order.
In the main task, the stimuli were forty line drawings taken from the Snodgrass and Vanderwart (1980) norms. Images were chosen for their high name agreement (86%-100%) and depicted simple concrete nouns (e.g., carrot). On study trials, order of images was randomized for each participant.
Procedure
Participants were tested in groups of 1–4. Instructions were presented on the computer and were the same for all participants.
In part one (the baseline measure), participants studied 45 words. Each word was presented individually on the screen for 3 s with a 500 ms interstimulus interval. Participants were then given 3 min to recall as many of the items as they could in any order via the computer keyboard.
During part two (the main experiment), participants first studied 40 pictures, which were presented individually on the screen for 3 s with a 250 ms interstimulus interval. Participants then worked on math problems for 30 s to eliminate primary memory effects. Next, participants in conditions with initial testing were given 3 min to recall as many of the pictures as they could in any order by typing the name of the object depicted in each picture. All responses remained on the screen for the duration of the test. This procedure was repeated twice for a total of three tests. During this time, participants in the no initial test conditions played three games of Tetris, each lasting 3 min. Next, half of the participants restudied the pictures and the other half worked on math problems. Restudy followed the same procedure as the initial study. All participants then worked on math problems for an additional 30 s before taking a final test. Participants were given 5 min to recall as many pictures as possible.
Results and Discussion
Baseline Measure
A one-way ANOVA between the four conditions revealed no significant differences, F<1, and therefore, the baseline measure was not included in the remaining analyses.
Main Experiment
The final test data¹ were analyzed using a 2(initial tests: 0, 3) × 2(restudy, no restudy) between-subjects ANOVA. Restudying enhanced later recall relative to not restudying (M=.61 vs .43; see Fig. 2), F(1,169)=50.87, p<.001, ηp²=.23. Similarly, taking initial tests enhanced later recall relative to not taking initial tests (M=.57 vs .47), F(1,169)=15.07, p<.001, ηp²=.08.
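The partial eta squared (ηp²) effect sizes reported above can be recovered from each F ratio and its degrees of freedom. The following is a minimal sketch of that standard conversion; the function name `partial_eta_squared` is introduced here for illustration and is not part of the original analysis.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F ratio and its degrees of freedom.

    Uses the identity eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    scaled_f = f_value * df_effect
    return scaled_f / (scaled_f + df_error)


# Main effects reported for Experiment 1, each tested as F(1, 169).
restudy_effect = partial_eta_squared(50.87, 1, 169)
testing_effect = partial_eta_squared(15.07, 1, 169)
print(round(restudy_effect, 2), round(testing_effect, 2))  # prints 0.23 0.08
```

The same conversion can be applied to the other F tests in this report as a quick consistency check on the reported effect sizes.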
Test-potentiated learning would be substantiated by an interaction between initial tests and restudy conditions. That is, the difference between the restudy and no restudy conditions should be larger in the condition with initial tests, indicating that the benefit of having a restudy trial was enhanced when initial tests had been taken. As can be seen in Fig. 2, this pattern emerged; the difference between the restudy and no restudy conditions was larger when initial tests had been taken (M=.24) than when they had not been taken (M=.11), F(1,169)=8.21, p=.005, ηp²=.05, indicating that the initial tests potentiated learning during the restudy trial.
Another way to interpret this interaction is that a testing effect occurred only when participants restudied. There was no difference between the 0- and 3-test conditions when participants had not restudied, t<1. In contrast, when participants had restudied, recall was greater in the 3-test condition (M=.69) than the 0-test condition (M=.53), t(85)=4.83, p<.001, d=1.03. One might wonder whether this lack of testing effect is problematic in that a robust literature has demonstrated testing effects in the absence of a restudy condition. These testing effects, however, often do not emerge until after a delay (Roediger & Karpicke, 2006), and the present experiment was performed within a single session.
Experiment 2
Experiment 1 demonstrated that taking free recall tests potentiated learning during a subsequent restudy trial. Experiment 2 was designed to replicate this finding and extend it to a paradigm in which restudy was manipulated within-participants. Further, a more diverse population was used. Rather than testing college undergraduates, participants were recruited through the Internet using Amazon Mechanical Turk.
Method
Participants
One hundred and eighty participants completed the experiment on the Internet through Amazon Mechanical Turk in exchange for $2. Thirty-three participants were excluded from the analyses because on a post-experimental question, they indicated they had written words and/or picture names during the study phases. Additionally, seven participants were excluded because they failed to follow instructions. After these exclusions, 140 participants remained in the final analyses.
Design and Materials
Prior to the main task a separate recall test, identical to that used in Experiment 1, was given as a baseline measure of memory performance.
The main experiment was a 2(initial tests: 0, 2) × 2(restudy, no restudy) mixed factorial design with initial tests manipulated between-participants and restudy manipulated within-participants. Participants were randomly assigned to the initial tests or no initial tests condition. Stimuli were the same as in Experiment 1.
Procedure
The procedure was similar to Experiment 1 with two major changes: 1) participants were tested online and 2) all participants restudied half of the pictures. Which pictures were restudied was determined randomly for each participant.
Additional changes were made to accommodate the online testing format. Participants answered demographic questions prior to the beginning of the experiment and post-experimental questions after completing the study. Further, participants studied each picture for 4 s (rather than 3 s) with a 500 ms (rather than 250 ms) interstimulus interval. Participants in the initially tested condition took two (rather than three) initial free recall tests. Similarly, participants who were not given initial tests played two (rather than three) games of Tetris. Finally, on the final test participants had 3 min (rather than 5 min) to recall the pictures.
Results and Discussion
Baseline Measure
A t-test revealed a marginally significant difference in the proportion of words recalled on the baseline measure for subjects who did (M=.38) and did not (M=.43) subsequently receive initial tests in the main experiment, t(138)=1.86, p=.07, d=.32. Because of this marginal effect, in analyses of the main experiment, baseline memory performance was used as a covariate to ensure that differences between conditions were not due to pre-experimental differences. However, including this baseline measure as a covariate did not change any conclusions (see supplemental material).
Main Experiment
The final test data were analyzed using a 2(initial tests: 0, 2) × 2(restudy, no restudy) mixed ANCOVA. The proportion of words recalled on the baseline measure was used as the covariate.
Restudied pictures were more likely to be recalled than those not restudied (M=.62 vs .26; see Fig. 3), F(1,137)=446.54, p<.001, ηp²=.77. Unlike Experiment 1, there was no main effect of testing; taking initial tests did not significantly enhance recall relative to not taking initial tests (M=.45 vs .43), F<1 (however, three tests were used in Experiment 1 rather than two).
Test-potentiated learning is indicated by a greater benefit of restudying the pictures in the initial tests condition relative to the no initial tests condition. As Fig. 3 illustrates, this pattern was found; there was a larger difference between the proportion of restudied and not restudied pictures recalled in the initial tests condition relative to the no initial tests condition (M=.42 vs .31), F(1,137)=9.32, p<.003, ηp²=.06.
As in Experiment 1, this interaction can also be interpreted as indicating that there was a significant difference between initial test conditions for items that were restudied, F(1,137)=6.09, p=.02, ηp²=.04, but not for items that were not restudied, F(1,137)=1.30, p=.26, ηp²=.009. These results again show that there was a significant testing effect only when participants were given the opportunity to restudy.
Experiment 3
Experiments 1 and 2 demonstrated that taking initial free recall tests can potentiate learning during a subsequent restudy trial. Experiment 3 explored why tests have this potentiating effect. Several previous researchers suggested that tests enhance learning by improving the organization of already learned material (e.g., Donaldson, 1971; Lachman & Laughery, 1968; Rosner, 1970). More recent work by Zaromb and Roediger (2010) provided evidence that free recall tests improve organization and that this improvement partially underlies the testing effect in free recall. Does this enhancement also underlie the test-potentiated learning effect in free recall? Improving the organization of already-learned material may increase the ability to encode new items by creating or improving a structure, or schema, which can be used to incorporate new items with already-learned items.
To test this hypothesis, Exp. 3 used categorized words. This change allowed organization to be measured through clustering, or recalling members of the same category together. Measurement of clustering was adjusted for the total number of items recalled using the adjusted ratio of clustering (ARC; Roenker, Thompson, & Brown, 1971). If enhanced organization underlies test-potentiated learning, organization on an initial test should be related to the amount of information learned during a subsequent restudy trial. That is, more organized recall prior to restudying should be related to more learning during restudy. The relationship between prior organization and subsequent learning was tested in a correlational study.
Method
Participants
Sixty-two participants completed the experiment through Amazon Mechanical Turk in exchange for $3. Seven participants were excluded from the final analysis because they reported writing down words and/or picture names. After these exclusions, 55 participants remained in the final analysis.
Design
This was a correlational study. A baseline task was given prior to the main task. In the main task, all participants took three initial tests and restudied all items.
Materials
For the baseline task, 30 line drawings of easily identifiable nouns (all unrelated to the categorized words) were chosen from the Snodgrass and Vanderwart (1980) norms.
For the main task, five medium frequency words from eight categories (total of 40 words) were chosen from the expanded and updated version of the Battig and Montague word norms (Van Overschelde, Rawson, & Dunlosky, 2004).
Procedure
As in Exp. 2, participants were tested online through Amazon Mechanical Turk and answered demographic questions before and post-experimental questions after the experiment.
The procedure was similar to Exp. 1, specifically to the initial tests with restudy condition, with two differences: on the baseline task participants learned 30 (rather than 45) items and on the final test participants had 3 min (rather than 5 min) to recall items.
Results and Discussion
The role organization may play in test-potentiated learning was examined by measuring the correlation between organization on the test prior to the restudy trial and learning during the restudy trial. Organization on the test prior to the restudy trial was measured using ARC scores, which can range from −1.0 to 1.0, with 1.0 indicating perfect organization, 0 indicating chance-level organization, and negative scores indicating below chance-level organization.
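To make the clustering measure concrete, the following is a minimal sketch of an ARC computation for a single recall sequence, using the standard formulation ARC = (R − E(R)) / (max R − E(R)), where R is the number of adjacent same-category pairs in the output. The function name `arc_score` and the category labels are illustrative assumptions; this is not the scoring code used in the experiment.

```python
from collections import Counter


def arc_score(recalled_categories):
    """Adjusted ratio of clustering for one recall sequence.

    `recalled_categories` lists the category label of each recalled item in
    output order. Returns None when the score is undefined, for example when
    every recalled item comes from a single category.
    """
    n_items = len(recalled_categories)
    if n_items == 0:
        return None
    counts = Counter(recalled_categories)
    # Observed repetitions: adjacent pairs drawn from the same category.
    observed = sum(
        first == second
        for first, second in zip(recalled_categories, recalled_categories[1:])
    )
    # Repetitions expected by chance, given the number recalled per category.
    expected = sum(n * n for n in counts.values()) / n_items - 1
    # Maximum possible repetitions if recall were perfectly clustered.
    maximum = n_items - len(counts)
    if maximum == expected:
        return None
    return (observed - expected) / (maximum - expected)


# Perfectly clustered output from two hypothetical categories.
print(arc_score(["fruit", "fruit", "tool", "tool"]))  # prints 1.0
```

Sequences for which the denominator is zero have no defined score, which is one way a participant can recall too few items for ARC to be calculated.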
Learning on the restudy trial was estimated using a conditional probability measure: the proportion of items recalled on the final test given that they had not been recalled on any previous test. If organization of already-learned material improves subsequent learning, greater organization prior to restudying should be related to more learning during the restudy trial. As can be seen in Fig. 4, this pattern was found.² Higher ARC scores on the test prior to the restudy trial were associated with a larger proportion of items learned during the restudy trial, r(49)=.31, p=.03.
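The conditional probability measure described above can be sketched as follows for one participant. The function name `restudy_learning` and the example items are hypothetical, and the sketch assumes responses have already been scored against the studied list.

```python
def restudy_learning(studied, initial_recalls, final_recall):
    """Proportion of items recalled on the final test, given that they were
    not recalled on any initial test.

    `initial_recalls` is a list of recall lists, one per initial test.
    Returns None if every studied item was recalled on some initial test.
    """
    recalled_before = set().union(*initial_recalls)
    eligible = set(studied) - recalled_before
    if not eligible:
        return None
    return len(eligible & set(final_recall)) / len(eligible)


# Hypothetical participant: three items were never recalled before restudy
# (pear, oak, elm), and two of them appear on the final test.
studied = ["apple", "pear", "dog", "cat", "oak", "elm"]
initial_recalls = [["apple", "dog"], ["apple", "cat"], ["dog", "cat"]]
final_recall = ["apple", "dog", "pear", "oak"]
learned = restudy_learning(studied, initial_recalls, final_recall)
print(round(learned, 2))  # prints 0.67
```

Conditioning on items never recalled before restudy keeps the measure from being inflated by items that were already retrievable.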
However, this relationship could be driven by a third variable. Specifically, individuals with better “memory ability” could tend to both have better organization and learn more during study trials. To test this possibility, the baseline memory measure was used as a covariate. When controlling for this estimate of memory ability, the correlation remained significant, r(48)=.34, p=.02. That is, higher ARC scores were still associated with greater learning, suggesting that differences in memory ability do not drive this relationship.
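One common way to control for a single covariate in a correlation is the first-order partial correlation, sketched below. Whether the reported analysis used this exact computation is an assumption, and the pairwise correlations in the example are hypothetical because the correlations with the baseline measure are not reported here.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


# Hypothetical values: x = ARC score, y = learning during restudy,
# z = baseline memory performance.
r_partial = partial_correlation(r_xy=0.5, r_xz=0.3, r_yz=0.4)
print(round(r_partial, 3))
```

With one covariate the degrees of freedom are n − 3, so the 51 participants with ARC scores would give 48, in line with the reported r(48).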
General Discussion
The primary finding in this report is that free recall tests potentiate learning during subsequent restudy trials. The benefit of restudying the material was enhanced when initial free recall tests had been taken. This pattern was obtained when restudy was manipulated between-participants (Exp. 1) and within-participants (Exp. 2) and in both an undergraduate population (Exp. 1) and a more diverse population recruited online (Exp. 2).
This potentiating effect may be at least in part due to enhanced organization. Previous research has shown that testing improves organization (Zaromb & Roediger, 2010). Exp. 3 indicated that better organization prior to restudying was associated with more learning during restudying. This relationship does not seem to be mediated by memory ability. Although this finding is only correlational, it suggests that tests may potentiate learning by enhancing organization prior to learning.
Other explanations of test-potentiated learning
Other hypotheses have been proposed to explain test-potentiated learning. These alternative hypotheses are not mutually exclusive with the enhanced organization hypothesis. Multiple factors may contribute to test-potentiated learning.
One such hypothesis is that test-potentiated learning may be driven by enhanced metacognitive knowledge. Tests may increase metacognitive accuracy (Roediger & Karpicke, 2006), which could be used to improve restudy strategies. For instance, testing may allow participants to better determine which items they cannot remember and therefore which items they should focus on during the next restudy opportunity (Lachman & Laughery, 1968).
Recent work using functional magnetic resonance imaging (fMRI) has suggested another possible underlying mechanism for the enhancing effect of tests. Nelson et al. (2012) observed greater activation in the left posterior inferior parietal lobule during the restudy of word pairs that had been tested on a cued recall test than during restudy of pairs not previously tested. The specific region has been previously associated with successful recognition memory (McDermott, Szpunar, & Christ, 2009; Nelson et al., 2010), an observation leading the authors to suggest that the initial tests may have increased the tendency for study-phase retrieval, or remindings (Hintzman, 2004), during subsequent restudy. Although this work involved a different set of procedures than those used here, the results provide an intriguing possibility.
Several hypotheses have been proposed to explain the finding of enhanced encoding following a failed generation attempt (Grimaldi & Karpicke, 2012; Hays, Kornell, & Bjork, 2012; Kornell, Hays, & Bjork, 2009; see also Slamecka & Fevreiski, 1983), or what could be called generate-potentiated learning. In this paradigm, there is no initial study, and participants are asked to guess the target of a cue word (e.g., tide – ?) before studying the complete pair (e.g., tide – beach) or to study the pair without the initial guess. Final recall is enhanced for pairs that have an initial guess. The hypothesis favored by Grimaldi and Karpicke (2012), which they called the search set theory, posits that attempting to guess the target initiates a search process that activates related items. The experimentally-defined target and items related to the target may become activated even if they are not given as the response, and this activation may enhance encoding when the target is subsequently presented.
However, this theory does not seem to generalize to free recall learning, especially in a paradigm in which the stimuli are unrelated to each other as was the case in the first two experiments presented here. Further, Grimaldi and Karpicke (2012) proposed that this activation process is very short-lived and that the enhancing effect only occurs when the target is presented immediately after the retrieval attempt. In the experiments presented here, the delay between the initial tests and restudy would suggest that any activation would have already dissipated.
Conclusion
These experiments provide strong evidence that free recall tests have a potentiating effect on subsequent study and suggest that enhanced organization may underlie this potentiating effect. They introduce a new paradigm for studying test-potentiated learning and provide the first steps to a better understanding of the role of retrieval in learning. Not only does retrieval directly benefit future recall, but it also prepares the learner for future learning. In short, subsequent memory is enhanced by retrieval practice and by repeated study, and the combination of the two is an especially potent memory enhancer.
Supplementary Material
Acknowledgments
These experiments were supported by a grant from the James S. McDonnell Foundation and the National Institutes of Health training grant 5T32GM081739. We thank Allison Cantor, Sophie Crumpacker, Laura D’Antonio, Andrew Fishell, and Fan Zou for help with data collection and analyses.
Footnotes
1. For analyses of initial test data from all experiments and demographic and post-experimental question data from Experiments 2 and 3, see supplemental material.
2. Four participants did not recall enough items to calculate an ARC score and were not included in the correlation.
References
- Arnold KM, McDermott KB. Test-potentiated learning: Distinguishing between direct and indirect effects of tests. Journal of Experimental Psychology: Learning, Memory, and Cognition. doi: 10.1037/a0029199. (in press)
- Donaldson W. Output effects in multitrial free recall. Journal of Verbal Learning and Verbal Behavior. 1971;10:577–585.
- Grimaldi PJ, Karpicke JD. When and why do retrieval attempts enhance subsequent encoding? Memory & Cognition. 2012. doi: 10.3758/s13421-011-0174-0.
- Hays MJ, Kornell N, Bjork RA. When and why a failed test potentiates the effectiveness of subsequent study. Journal of Experimental Psychology: Learning, Memory, and Cognition. 2012. doi: 10.1037/a0028468.
- Hintzman DL. Judgment of frequency versus recognition confidence: Repetition and recursive reminding. Memory & Cognition. 2004;32:336–350. doi: 10.3758/bf03196863.
- Izawa C. The test trial potentiating model. Journal of Mathematical Psychology. 1971;8:200–224.
- Karpicke JD, Roediger HL. Repeated retrieval during learning is the key to long-term retention. Journal of Experimental Psychology: General. 2007;138:469–486. doi: 10.1037/a0017341.
- Kornell N, Hays MJ, Bjork RA. Unsuccessful retrieval attempts enhance subsequent learning. Journal of Experimental Psychology: Learning, Memory, and Cognition. 2009;35:989–998. doi: 10.1037/a0015729.
- Lachman R, Laughery KR. Is a test trial a training trial in free recall learning? Journal of Experimental Psychology. 1968;76:40–50.
- McDermott KB, Szpunar KK, Christ SE. Laboratory-based and autobiographical retrieval tasks differ substantially in their neural substrates. Neuropsychologia. 2009;47:2290–2298. doi: 10.1016/j.neuropsychologia.2008.12.025.
- Nelson SM, Arnold KM, Gilmore AW, Najjar LM, Finn B, McDermott KB. A neural signature of test-potentiated learning in parietal cortex. (under revision)
- Nelson SM, Cohen AL, Power JD, Wig GS, Miezin FM, Wheeler ME, et al. A parcellation scheme for human left lateral parietal cortex. Neuron. 2010;67:156–170. doi: 10.1016/j.neuron.2010.05.025.
- Roediger HL, Butler AC. The critical role of retrieval practice in long-term retention. Trends in Cognitive Sciences. 2011;15:20–27. doi: 10.1016/j.tics.2010.09.003.
- Roediger HL, Karpicke JD. Test enhanced learning: Taking memory tests improves long-term retention. Psychological Science. 2006;17:249–255. doi: 10.1111/j.1467-9280.2006.01693.x.
- Roediger HL, McDermott KB. Creating false memories: Remembering words not presented in lists. Journal of Experimental Psychology: Learning, Memory, and Cognition. 1995;21:803–814.
- Roediger HL, Smith M. The “pure-study” learning curve: The learning curve without cumulative testing. Memory & Cognition. doi: 10.3758/s13421-012-0213-5. (in press)
- Roenker DL, Thompson CP, Brown SC. Comparison of measures for the estimation of clustering in free recall. Psychological Bulletin. 1971;76:45–48.
- Rosner SR. The effects of presentation and recall trials on organization in multitrial free recall. Journal of Verbal Learning and Verbal Behavior. 1970;9:69–74.
- Slamecka NJ, Fevreiski J. The generation effect when generation fails. Journal of Verbal Learning and Verbal Behavior. 1983;22:153–163.
- Snodgrass JG, Vanderwart M. A standardized set of 260 pictures: Norms for name agreement, image agreement, familiarity, and visual complexity. Journal of Experimental Psychology: Human Learning and Memory. 1980;6:174–215. doi: 10.1037//0278-7393.6.2.174.
- Tulving E. The effects of presentation and recall of material in free-recall learning. Journal of Verbal Learning and Verbal Behavior. 1967;6:175–184.
- Van Overschelde JP, Rawson KA, Dunlosky J. Category norms: An updated and expanded version of the Battig and Montague (1969) norms. Journal of Memory and Language. 2004;50:289–335.
- Zaromb FM, Roediger HL. The testing effect in free recall is associated with enhanced organizational processes. Memory & Cognition. 2010;38:995–1008. doi: 10.3758/MC.38.8.995.