Abstract
Extended practice on a particular cognitive task can boost performance on other tasks, even though those tasks have not themselves been practiced. This transfer of benefits appears to be specific, occurring mostly when tasks are very similar to those being trained. But what type of similarity is most important for predicting transfer? This question is addressed with a tightly controlled randomised design, a relatively large sample (N=175), and an adaptive control group. We created a hierarchical set of nested assessment tasks. Participants then trained on two of the tasks: one was relatively ‘low’ in the hierarchy, requiring just simultaneous judgments of shapes’ spikiness, whereas the other was relatively ‘high’, requiring delayed judgments of shapes’ spikiness or number of spikes in a switching paradigm. Using the full complement of nested tasks before and after training, we could then test whether and how these ‘low’ and ‘high’ training effects cascade through the hierarchy. For both training groups, relative to the control, whether or not an assessment task shared a single specific feature was the best predictor of transfer patterns. For the lower-level training group, the overall proportion of feature overlap also significantly predicted transfer, but the same was not true for the higher-level training group. Finally, pre-training between-task correlations were not predictive of the pattern of transfer for either group. Together, these findings provide an experimental exploration of the specificity of transfer and establish the nature of task overlap that is crucial for the transfer of performance improvements.
Extended practice – sometimes called training – on one or more cognitive tasks can improve performance on other, un-practiced, tasks. However, this transfer of improvements is largely constrained to tasks that are ‘highly similar’ to those being trained (Melby-Lervag et al., 2016; Sala & Gobet, 2019; Simons et al., 2016; Soveri et al., 2017). But what does similarity mean in the context of a cognitive task? Tasks can vary along several dimensions (e.g. stimulus type, spatial properties, timings, and goals) and there are multiple ways to calculate and conceptualise task similarity. To explain the limited scope of transfer and better understand its boundary conditions, researchers have called for a more systematic approach to cognitive training research (Katz et al., 2017; Redick, 2019; Sala & Gobet, 2019; Von Bastian & Oberauer, 2014; Gathercole et al., 2019; Holmes et al., 2019; Norris et al., 2019; Taatgen, 2013). The current study examines how feature overlap informs transfer within a set of hierarchically nested visual-discrimination tasks.
Theories of Between-Task Transfer
Transfer holds important theoretical implications for models of skill acquisition and performance (Singley & Anderson, 1989; Barnett & Ceci, 2002). Many accounts of transfer are either rooted in, or closely related to, production system models (Anderson, 1982; Cole et al., 2012; Gathercole et al., 2019; Singley & Anderson, 1985; Newell, 1990; Taatgen, 2013), in which task performance is achieved by stimulus information being inputted to, and propagated via, a series of processing components (production rules), to produce an output. These processing components are functions that take information from the senses and/or current memory state as input and pass it to a set of conditional statements, each of which specifies an output that either modifies the memory state or initiates a motor response.
Learning in the context of production system models concerns the acquisition, modification and composition of these processing components. This is thought to follow a declarative-to-procedural trajectory – the component processes used to perform a task start out very general and inefficient but with experience become increasingly specialised and efficient (Taatgen, 2013). Lower level processing components are combined sequentially into ‘modules’ to form sub-routines within a task-routine (sometimes referred to as a task-set; Rogers & Monsell, 1995), and when these sub-routines can be used effectively by other task-routines there is potential for transfer (Taatgen, 2013; Gathercole et al., 2019). From this perspective, transfer varies continuously according to the relative utility and interchangeability of modules at any given point in time. This provides a nuanced and dynamic interpretation of transfer, how and when it appears, and how it might vary across different stages of development and with varying amounts of practice. Moreover, it provides a way of deriving a taxonomy of tasks according to the interchangeability of their sub-routines. In turn this provides a concrete way of defining the overlap between tasks, a prerequisite for making quantitative predictions about transfer (Reder & Klatzky, 1994; Barnett & Ceci, 2002; Taatgen, 2013; Gathercole et al., 2019).
At the heart of this perspective is a division between task-specific and task-general processes, with the latter being essential for transfer (Singley & Anderson, 1985; Taatgen, 2013). The initial optimism of cognitive training research reflected the prospect that training might improve task-general processes that are shared across many tasks, within and even between cognitive domains (Klingberg, 2010). However, more recent investigations have demonstrated the feature-specificity of transfer (Gathercole et al., 2019; Holmes et al., 2019; Norris et al., 2019; Sala & Gobet, 2019; Soveri et al., 2017). For example, while transfer has been observed between n-back variants using different stimulus types (Holmes et al., 2019; Minear et al., 2016; Soveri et al., 2017; Waris et al., 2015), complex span training effects appear tied to the specific stimulus type (Holmes et al., 2019; Minear et al., 2016). Furthermore, digit span training does not readily transfer to other simple span tasks that differ by only a single stimulus feature, such as modality (visual vs auditory presentation of digits) or stimulus type (e.g. digits vs letters; Norris et al., 2019).
Gathercole et al (2019) propose a framework in which transfer varies not as a function of similarity with respect to all the shared processing components between tasks, but instead primarily as a function of the applicability of novel cognitive routines, acquired during training, to an untrained task. A useful cognitive routine acquired during training can be conceptualised as a higher order process that controls the flow of lower order processes in a novel manner, in order to facilitate performance (e.g. mnemonic, grouping, and proactive control strategies; Gathercole et al., 2019; Taatgen, 2013). Acquiring new cognitive routines is resource intensive, so it is likely that we will only develop new ones if they improve performance in a meaningful way. Gathercole et al (2019) suggest that the cognitive routines recruited to perform some tasks are relatively well established and functional, so the development of new ones is not necessary. Moreover, when people do develop new routines to enhance performance after extensive practice, these tend to be tied to the specifics of the stimuli and/or paradigm and thus do not readily transfer. Accordingly, two tasks may be relatively highly correlated but show no transfer to one another following training. On the other hand, when task demands are relatively novel, they require the acquisition of more rudimentary routines that are less tied to the specifics of the task and thus transfer more readily. This framework imposes important theoretical constraints on transfer by emphasising the role novel task demands play in necessitating the development of new higher order routines, and the shared utility of those routines across different tasks.
Cognitive routines are closely related to the concept of a ‘task-set’, introduced by Rogers & Monsell (1995). It too describes the set of processes used by an individual to link sensory input to motor output to accomplish a task. According to Rogers & Monsell (1995), task-sets can be adopted in a preparatory manner so as to form an ‘effective intention’ to perform a task, and can be brought about both endogenously (e.g. proactive conscious preparation) and exogenously (e.g. in reaction to an external stimulus). One possibility is that task-sets perform a shielding function, helping to prevent irrelevant stimulus features from interfering with response processes (Rogers & Monsell, 1995; Dreisbach & Wenke, 2011). Whilst task-sets may be adaptive in most contexts, they may also engender task specificity and even negative transfer in training contexts, as well as switch costs within task-switching contexts, which we discuss further in the next section.
Transfer Specificity in Discrimination and Switching tasks
Most of the cognitive training research to date has focused on relatively higher order tasks in the hope that these would have more generalisable benefits (Melby-Lervag et al., 2016; Simons et al., 2016), although their specificity is becoming apparent. In contrast, the specificity of lower order tasks is already well established (Fahle, 2005). Transfer in simple visual discrimination tasks is often confined to the specific stimulus features (e.g. orientation, contrast, motion) and contexts (range, spatial location, category etc.) being trained (Dosher & Lu, 2017; Fahle, 2005). However, this is still a graded phenomenon: both on-task learning and transfer to even subtly different tasks have been shown to depend upon stimulus complexity, judgement precision, and the specifics of the test/training procedures (Ahissar & Hochstein, 2004; Berry et al., 2010; Dosher & Lu, 2009; Dosher & Lu, 2017; Fahle, 2005; Jacobs, 2002; Jeter et al., 2009; Jeter et al., 2010; Parsons et al., 2016). Aside from the well demonstrated signal-to-noise improvements of representations in visual processing brain regions following training, enhanced perceptual discrimination ability is also thought to stem from modifications in top-down attentional and decision making processes, providing another potential avenue for transfer to manifest (Ahissar & Hochstein, 2004; Berry et al., 2010; Covey et al., 2018; Dosher & Lu, 2017; Lu & Dosher, 2009; Parsons et al., 2016).
A paradigm in which the specificity of training-induced transfer is less clear is that of simple task-switching. In line with other paradigms, task-switch training, wherein participants practice switching between two (or more) simple binary decision tasks, shows fairly consistent transfer to other similarly structured switching tasks involving different binary decisions (Dorrenbacher et al., 2014; Karbach & Kray, 2009; Minear et al., 2002; Minear & Shah, 2008; Zinke et al., 2012). However, Karbach & Kray (2009) found that task-switch training improvements also transferred to interference control tasks (color-stroop/number-stroop), verbal working memory tasks (reading-span/counting-span), spatial working memory tasks (symmetry-span/navigation-span), and most surprisingly fluid intelligence tasks (figural reasoning/letter series/ravens matrices). They suggest that improved executive control processes tapped by the switching paradigm, such as interference control, are common across these tasks and may thus be responsible for these findings. Similar studies (Dorrenbacher et al., 2014; Zinke et al., 2012) have failed to show any generalizable benefits to interference control/inhibition tasks (flanker, stroop) or updating/working memory tasks (n-back, keep track, backward-digit-span, counting span), but did find some evidence for transfer to a measure of processing speed (choice reaction time). Discrepancies between studies could be due, amongst other things, to differences in sample populations, training/assessment task specifics, training dosages, motivation, sample sizes, and analysis protocols (Dorrenbacher et al., 2014; Karbach et al., 2017; Simons et al., 2016; Zinke et al., 2012). Given these mixed findings it is difficult to ascertain the scope of transfer for task-switching training.
The aforementioned concept of a task-set may shed light on the specificity of transfer in task-switching contexts, and more generally. One popular account is that task-sets perform a shielding function by providing a preparatory attentional state that serves to bias the set of imminent processes recruited to perform a task, so as to prevent irrelevant stimulus features, or indeed stimulus-response mappings, from interfering with the correct response process (Rogers & Monsell, 1995; Dreisbach & Wenke, 2011). However, the adoption of specific task-sets in a switching context may become proximately maladaptive; switch-costs arise because the need to switch between task-sets slows people down and/or a failure to adequately switch brings about pro-active interference caused by an irrelevant task-set (Dreisbach & Wenke, 2011). Interestingly, task-switching training appears to relax task-set shielding. This is evidenced by the finding that irrelevant information for both tasks can interfere with performance in a task-switching context but not on their single task counterparts (Dreisbach & Wenke, 2011). This implies that prior exposure to a task through training (or simply by task-order) may initially cause some negative transfer effects on a different task. Moreover, the broader transfer observed to other tasks involving interference control (Karbach & Kray, 2009) may be due to the adoption of more relaxed task-sets relative to training on single task-counterparts. Further support for this comes from Sabah et al (2019) who found that increasing task-variability (in terms of content and structure) in a task-switching context resulted in greater transfer to novel switching tasks.
Overview
Recent theoretical and experimental work suggests that specificity is the rule rather than the exception for transfer effects (Melby-Lervag et al., 2016; Sala & Gobet, 2019; Simons et al., 2016; Gathercole et al., 2019). However, the specificity of transfer varies between task-paradigms and as a function of task complexity and novelty (Ahissar & Hochstein, 2004; Dosher & Lu, 2017; Gathercole et al., 2019; Jeter et al., 2009; Taatgen, 2013). To complement theoretical progress and better understand the precise nature of transfer and its boundary conditions, further experimental studies are required that identify and systematically manipulate the overlap in task features between training and assessment tasks (Gathercole et al., 2019; Holmes et al., 2019; Minear et al., 2016; Norris et al., 2019; Von Bastian & Oberauer, 2014). Visual discrimination and task-switching paradigms both show potential for transfer (Ahissar & Hochstein, 2004; Dorrenbacher et al., 2014; Dosher & Lu, 2017; Fahle, 2005; Karbach & Kray, 2009). Moreover, their simple feature structures (see Figure 2), for example with switching elements being added to impose executive demands, make them suitable for the systematic exploration of practice-induced transfer effects.
Figure 2. Depiction of the task feature hierarchy.
Assessment task abbreviations: Simultaneous Spikiness (SSP); Simultaneous Number (SN); Simultaneous Switching (SSW); Delayed Spikiness (DSP); Delayed Number (DN); Delayed Switching (DSW).
The Present Study
The present study explored the potential transfer of training two tasks within a set of six hierarchically nested perceptual discrimination tasks. To do so, we conducted a large online training study that allowed us to power for small-medium effect sizes. The tasks were hierarchically nested with respect to their combination of task features. That is, the higher level tasks contain all the features of their lower level counterparts. Importantly, we focused on task features and do not make strong claims about the specifics of the associated cognitive processes – it is very difficult to infer a cognitive hierarchy. However, we assumed that our tasks spanned a range of processes including attention, working memory, and executive control (Dosher & Lu, 2017; Miyake et al., 2000), and that as tasks contain more features, the required cognitive processes become more complex. Whilst the stimuli were identical, the task features varied systematically with respect to judgement type (number of spikes or ‘spikiness’), presentation type (simultaneous or delayed) and task-switching, allowing us to establish them as potential boundary conditions. All participants completed each of the six assessment tasks both before and after training; one group received training on a relatively low-level task, another group received training on a relatively high-level task, and a third group trained on a control task.
There were several motivations for taking this approach: 1) Task overlap can be quantified in an unambiguous and systematic manner at the level of the task features; 2) Given the prevalence of transfer specificity, we wanted the variability between tasks to be fairly minimal, systematic and precise, to allow for transfer; 3) Relatedly, transfer seems to depend upon complexity/novelty, so we chose tasks that were simple to interpret and easy to learn, whilst being complex and novel enough to allow for transfer; 4) The simplicity of the tasks and the brevity of their trials made for a relatively parsimonious and cost-effective study, allowing us to collect a large sample.
Despite potential avenues for transfer, we were hesitant to make specific a-priori predictions given the novelty of the specific task parameters, stimuli, and training protocols used. Instead, we posed the following open questions: 1) Do participants make substantial on-task training gains? 2) Do different training tasks generate different transfer patterns? 3) Are these transfer patterns predicted by the proportion of overlapping features? 4) Do some features contribute more to the transfer than others?
Materials and Methods
Ethical Approval
This study received ethical approval from the Cambridge Psychology ethics committee, University of Cambridge, application number: PRE.2019.046. All participants provided informed consent by checking a box to confirm they had fully understood the implications of participation and their right to withdraw.
Participants
The final sample (see ‘Data Exclusion’) consisted of 175 English-speaking adults with normal/corrected vision aged between 18 and 35 years (M=27.11, SD=4.85). Participants were recruited via ‘Prolific’, a platform for recruiting and paying people to participate in online experiments. Participants were paid at a rate of £6 per hour and received a £5 bonus upon completion of all sessions.
A total sample size of 175 in three groups yielded 0.84 power to detect a medium transfer effect size (d = 0.5). Participants were randomly assigned to three groups and their demographics are displayed in Table 1. Analyses revealed moderate evidence for no group differences with respect to age (F(2,172)=1.23, p=0.293, BF10=0.16) and gender (X2=2.31, df=2, p=0.314, BF10=0.14).
Table 1. Group demographics.
|  | SSPT | DSWT | Control |
|---|---|---|---|
| N | 59 | 60 | 56 |
| Age: M (SD) | 26.45 (5.28) | 27.05 (4.84) | 27.87 (4.34) |
| Female: N (%) | 30 (50.8%) | 29 (48.3%) | 21 (37.5%) |
| Male: N (%) | 29 (49.2%) | 31 (51.7%) | 35 (62.5%) |
Training group abbreviations: Simultaneous Spikiness Training (SSPT); Delayed Switching Training (DSWT).
Materials
All tasks were coded using JavaScript (jsPsych; De Leeuw, 2015), HTML, and CSS in house. We advertised the study via Prolific and used JATOS to set up and run the study on a local server.
A set of 220 (20x11) spikey shapes was generated using MATLAB as specified by Van Dam & Ernst (2015). The shapes varied in a graded fashion along two dimensions: ‘spikiness’ and ‘number of spikes’ (see Figure 1). They were always the same turquoise-grey on a black background. The range of both the Number of Spikes and Spikiness dimensions was determined from task pilot data. Seven difficulty levels were chosen to capture the range of performance and to allow room for improvement. Task difficulty corresponded to the deviation between stimuli along either dimension, where a difference of one was the most difficult judgment and a difference of seven was the easiest judgment.
Figure 1. The stimulus set comprised 220 spikey shapes that varied in a graded fashion along two dimensions: ‘spikiness’ and ‘number of spikes’.
Assessment Tasks
In each assessment phase (pre- and post-training) participants completed two blocks on each of the six tasks. Task order was semi-randomised: order was random with the constraint that two non-switching tasks had to occur first. All six assessment tasks required the participant to make a same-different judgement about two spikey shapes (see Figures 2 & 3). Participants were instructed to press the ‘J’ key when making a same-response or the ‘F’ key when making a different-response. They were instructed simply to be as accurate as possible and no mention of speed was made. This was to avoid introducing a large range of potentially viable speed-accuracy trade-offs and was aimed at making the results more interpretable.
Figure 3. Training and assessment task trial sequences.
All tasks were used during assessment except for the Speeded-Response-Mapping task, which was only used in training for the Control group. The additional (T) indicates that the task was both an assessment task and a training task.
Each block contained 56 trials; half required a ‘same’ response and the other half required a ‘different’ response, and these were evenly distributed across the seven difficulty levels (four trials of each response type at each difficulty level). All participants saw the same stimuli as one another within every task but in a randomised order. Participants received explicit step by step instructions with examples for each task, along with a small number of practice trials. Feedback was provided on each trial, with the shape turning green for 300ms to indicate ‘correct’ and red for ‘incorrect’. There was a 200ms inter-trial interval. Participants received feedback about their average accuracy after each block. Each assessment phase took approximately 45mins.
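For concreteness, the sketch below illustrates one way a 56-trial block of this kind could be assembled (plain JavaScript; the function and field names are ours and are not taken from the study's code). It assumes four trials of each response type at each of the seven difficulty levels, shuffled within the block.

```javascript
// Illustrative only: assemble one 56-trial block (7 difficulties x 2 responses x 4 repeats).
function makeBlock(judgement) {          // judgement: 'spikiness' or 'number'
  const trials = [];
  for (let difficulty = 1; difficulty <= 7; difficulty++) {
    for (const response of ['same', 'different']) {
      for (let rep = 0; rep < 4; rep++) {
        trials.push({
          judgement,
          difficulty,                                          // step difference along the judged dimension
          correctKey: response === 'same' ? 'j' : 'f',
          // 'different' trials separate the two shapes by `difficulty` steps on the
          // judged dimension; 'same' trials use the same value on that dimension.
          stepDifference: response === 'different' ? difficulty : 0
        });
      }
    }
  }
  // Fisher-Yates shuffle so trial order is random within the block
  for (let i = trials.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [trials[i], trials[j]] = [trials[j], trials[i]];
  }
  return trials;                                               // 56 trials
}
```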
The tasks were divided into those with the two shapes presented simultaneously and those with the two shapes presented sequentially. In tasks using simultaneous presentation, participants were shown a centred fixation cross for 350ms followed by two spikey shapes presented simultaneously alongside one another for 1600ms; they had to respond within the 1600ms or the trial was counted as incorrect. For tasks using sequential presentation, participants were shown a fixation cross for 350ms, followed by a ‘target-spikey-shape’ for 800ms, which was then masked for 1000ms, and followed by a second ‘response-spikey-shape’ for 1400ms. Participants had to respond within the 1400ms otherwise the trial was counted as incorrect. The simultaneous presentation tasks were set to have slightly longer response deadlines to account for the increased encoding demands during the response phase of the task (two stimuli vs one).
The tasks were further sub-divided into those that required participants to make judgements about the ‘Spikiness’ property of the shape, those that required participants to make judgements about the ‘Number of Spikes’ property of the shape, and those that required participants to switch between these two judgement types. When a Spikiness-judgement was required, the two shapes always shared the same number of spikes (varying randomly between 5 and 15 spikes) and participants were to make a judgment about whether the two shapes shared the same ‘Spikiness’ or not. When a Number-of-Spikes-judgement was required, the two shapes varied in both their ‘Spikiness’ and their number of spikes and participants were to make a judgment about whether the two shapes shared the same number of spikes or not. When switching between judgements, the colour of the border in the response phase cued the judgement dimension (‘Spikiness’ or ‘Number of Spikes’): a blue border cued the ‘Spikiness’ judgement and a red border cued the ‘Number of Spikes’ judgement. This gave us the following six assessment tasks (see below for individual task descriptions and Figure 3 for a graphical depiction): Simultaneous-Spikiness (SSP); Simultaneous-Number (SN); Simultaneous-Switching (SSW); Delayed-Spikiness (DSP); Delayed-Number (DN); Delayed-Switching (DSW). Crucially, these tasks are all hierarchically nested, with the more complex variants being formed of their constituent paradigms.
Simultaneous-Spikiness (SSP)
Participants are shown a fixation cross for 350ms followed by two spikey shapes presented simultaneously alongside one another for 1600ms. In this task, the two spikey shapes always share the same number of spikes and participants are required to make a judgment about whether the two shapes share the same ‘spikiness’ or not.
Simultaneous-Number (SN)
Participants are shown a fixation cross for 350ms followed by two spikey shapes presented simultaneously alongside one another for 1600ms. In this task, the two spikey shapes can vary in both their ‘spikiness’ and their number of spikes and participants are required to make a judgment about whether the two shapes share the same number of spikes or not.
Simultaneous-Switching (SSW)
Participants are shown a fixation cross for 350ms followed by two spikey shapes presented simultaneously alongside one another within a border for 1600ms. In this task, the colour of the border cues the participant as to which judgement dimension (‘spikiness’ or number of spikes) they ought to be responding along on a given trial. If the border is Blue, the two spikey shapes always share the same number of spikes and participants are required to make a judgment about whether the two shapes share the same ‘spikiness’ or not. If the border is Red, the two spikey shapes can vary in both their ‘spikiness’ and their number of spikes and participants are required to make a judgment about whether the two shapes share the same number of spikes or not.
Delayed Spikiness (DSP)
Participants are shown a fixation cross for 350ms followed by a target-spikey-shape for 800ms, then a masked delay of 1000ms, then a second response-spikey-shape for 1400ms. In this task, the two spikey shapes always share the same number of spikes and participants are required to make a judgment about whether the target and response stimuli share the same ‘spikiness’ or not.
Delayed Number (DN)
Participants are shown a fixation cross for 350ms followed by a target-spikey-shape for 800ms, then a masked delay of 1000ms, then a second response-spikey-shape for 1400ms. In this task, the two spikey shapes can vary in both their ‘spikiness’ and their number of spikes and participants are required to make a judgment about whether the target and response stimuli share the same number of spikes or not.
Delayed Switching (DSW)
Participants are shown a fixation cross for 350ms followed by a target-spikey-shape for 800ms, then a masked delay of 1000ms, then a second response-spikey-shape within a border for 1400ms. In this task, the colour of the border cues the participant as to which judgement dimension (‘spikiness’ or number of spikes) they ought to be responding along on a given trial. If the border is Blue, the two spikey shapes always share the same number of spikes and participants are required to make a judgment about whether the target and response stimuli share the same ‘spikiness’ or not. If the border is Red, the two spikey shapes can vary in both their ‘spikiness’ and their number of spikes and participants are required to make a judgment about whether the target and response stimuli share the same number of spikes or not.
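To make the trial structure concrete, the sketch below shows how one delayed-presentation trial could be expressed as a jsPsych 6-style timeline. This is an illustrative reconstruction, not the study's actual code: the plugin names assume jsPsych 6 conventions, and the image paths, border cue, and feedback logic are placeholders.

```javascript
// Illustrative jsPsych 6-style timeline for one delayed-presentation trial:
// fixation 350ms -> target shape 800ms -> mask 1000ms -> response shape 1400ms.
const delayedTrial = [
  { type: 'html-keyboard-response', stimulus: '+',
    choices: jsPsych.NO_KEYS, trial_duration: 350 },
  { type: 'image-keyboard-response', stimulus: 'stimuli/target_shape.png',   // placeholder path
    choices: jsPsych.NO_KEYS, trial_duration: 800 },
  { type: 'image-keyboard-response', stimulus: 'stimuli/mask.png',           // placeholder path
    choices: jsPsych.NO_KEYS, trial_duration: 1000 },
  { type: 'image-keyboard-response', stimulus: 'stimuli/response_shape.png', // placeholder path
    choices: ['f', 'j'],            // F = different, J = same
    trial_duration: 1400,           // no response within 1400ms is scored as incorrect
    response_ends_trial: true,
    post_trial_gap: 200 }           // 200ms inter-trial interval
];
```

A simultaneous-presentation trial would instead show both shapes together in a single 1600ms response screen.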
Training Tasks
Training was conducted on either the Simultaneous-Spikiness task (SSPT), the Delayed-Switching task (DSWT), or a Speeded-Response-Mapping task (Control; see Figure 3). These represented a task relatively low in the hierarchy, a task relatively high in the hierarchy, and a control, respectively. As in the assessments, all training tasks had seven difficulty levels. Participants started at the easiest difficulty level on the first session and the level reached by the end of each session carried over into the next training session. Level up/down performance requirements were based on preliminary pilot data and aimed at generating somewhat similar improvement trajectories over time across the training groups.
Simultaneous-Spikiness-Training (SSPT)
Identical in structure to the Simultaneous-Spikiness assessment task. Each training session lasted approximately 15mins. If participants achieved >75% accuracy they moved up a difficulty level; if they achieved <65% accuracy they moved down a difficulty level; otherwise they remained at the same difficulty level.
Delayed Switching-Training (DSWT)
This task is identical in structure to the Delayed-Switching assessment task. Each training session lasted approximately 20mins. If participants achieved >65% accuracy they moved up a difficulty level; if they achieved <55% accuracy they moved down a difficulty level; otherwise they remained at the same difficulty level.
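A minimal sketch of this block-wise adjustment rule is shown below (our own code, not the study's); the same function covers both perceptual training tasks by passing the relevant thresholds (0.75/0.65 for SSPT, 0.65/0.55 for DSWT). The control task used a different rule, described next.

```javascript
// Illustrative only: block-wise difficulty adjustment for the perceptual training tasks.
// `level` runs from 1 (easiest) to 7 (hardest) here; this indexing is ours.
function nextLevel(level, blockAccuracy, upThreshold, downThreshold) {
  if (blockAccuracy > upThreshold && level < 7) return level + 1;   // harder discrimination
  if (blockAccuracy < downThreshold && level > 1) return level - 1; // easier discrimination
  return level;                                                     // stay at the same level
}

// e.g. a Delayed Switching Training block completed at 70% accuracy, from level 3:
// nextLevel(3, 0.70, 0.65, 0.55)  ->  4
```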
Speeded-Response-Mapping-Training
In the Speeded-Response-Mapping-Training task participants are shown a fixation cross for 350ms followed by a spikey shape in one of two locations (left or right) and are required to press the key corresponding to the location of the stimulus (‘F’ for left and ‘J’ for right) as quickly as they can. Difficulty was adjusted by changing the limited amount of time participants had to make a response, with the following response deadlines: 550ms, 500ms, 450ms, 400ms, 350ms, 300ms, 250ms, and 200ms (these times were chosen based on data from the human benchmark project; https://www.humanbenchmark.com/tests/reactiontime). Participants received feedback about whether or not they made the correct choice: green for correct and red for incorrect (300ms). A failure to respond within the time limit was counted as incorrect. There was a 200ms inter-trial interval. Each training session lasted approximately 12mins. If participants achieved >90% accuracy and their reaction time was less than the current difficulty level's time limit they moved up a difficulty level; otherwise they moved down a difficulty level.
Training Procedure
After the Pre-Training assessment participants were randomly allocated to one of the three training conditions and received specific instructions about the training phase along with a personalised ‘homepage’. This homepage showed the number of training sessions completed and how long they had to wait before starting the next session. Participants were only allowed to start the next session after 10 hours had elapsed from the previous one. On the training homepage there was also a link to the post-training assessment session that they could access 10 hours after completing all the training sessions. Participants received three sessions of adaptive training; there were eight blocks per training session and 20 trials per block.
Overview of Procedure
All participants signed up and completed all sessions online via Prolific. All participants completed the same set of six assessment tasks, each of which required the participant to make same-different judgements about two spikey shapes, both before (pre) and after (post) training: Simultaneous-Spikiness (SSP); Simultaneous-Number (SN);
Simultaneous-Switching (SSW); Delayed-Spikiness (DSP); Delayed-Number (DN); Delayed-Switching (DSW). Upon completion of the first assessment session participants were then randomly allocated to one of three training groups: Simultaneous-Spikiness-Training (SSPT); Delayed-Switching-Training (DSWT); or Speeded-Response-Mapping-Training (Control). The first two groups (SSPT and DSWT) trained on their assessment task counterparts (SSP and DSW), whilst the third group acted as a control. Each training group then received three sessions of adaptive training spaced out across a few days before completing the second assessment session (see Figure 4).
Figure 4. Overview of Procedure.
Data Exclusion
All incoming data were screened for quality based on summary statistics saved using JavaScript/JATOS. Participants with particularly low accuracy and short reaction times across tasks at pre-training assessment (accuracy<56% and RT<600ms; based on pilot data) were assumed not to be engaging and were excluded from the study. Furthermore, participants who did not complete all sessions were excluded from analysis. Of the 199 participants who started, 183 completed all sessions.
After data collection, participants who scored more than 2 standard deviations below the mean (calculated task-wise at pre-training) on two or more tasks at pre- or post-training were excluded from all subsequent analyses. Again, this was intended to remove participants who were not engaging with the tasks. This resulted in 8 of the 183 participants being excluded (Simultaneous Spikiness Training=5, Delayed Switching Training=1, Control=2; Chi-Square: X2=3.25, p=0.196, BF10=0.867), leaving 175 in total (59 in the Simultaneous Spikiness Training group, 60 in the Delayed Switching Training group, and 56 in the Control group).
Further to this, univariate data points more than 1.5 times the interquartile range above the third quartile or below the first quartile were considered statistical outliers, and individuals were excluded from any analyses on the respective task. This resulted in 6, 5, 1, 3, 7, and 4 of the 175 participants being excluded from the Simultaneous Spikiness, Simultaneous Number, Simultaneous Switching, Delayed Spikiness, Delayed Number, and Delayed Switching tasks respectively.
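The sketch below illustrates the Tukey-fence rule described above (our code, not the study's; the exact quantile method used in the original analysis is not specified, so linear interpolation is assumed here).

```javascript
// Illustrative only: flag values more than 1.5 x IQR above Q3 or below Q1.
function tukeyOutliers(scores) {
  const sorted = [...scores].sort((a, b) => a - b);
  const quantile = p => {                       // linear interpolation between order statistics
    const idx = p * (sorted.length - 1);
    const lo = Math.floor(idx), hi = Math.ceil(idx);
    return sorted[lo] + (sorted[hi] - sorted[lo]) * (idx - lo);
  };
  const q1 = quantile(0.25), q3 = quantile(0.75), iqr = q3 - q1;
  const lower = q1 - 1.5 * iqr, upper = q3 + 1.5 * iqr;
  return scores.map(x => x < lower || x > upper); // true = exclude this participant on this task
}
```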
Training data were partially missing for 17 of the participants (Simultaneous Spikiness Training=4, Delayed Switching Training=8, Control=5); to our knowledge they completed the training sessions and the missing data were caused by an unknown technical issue when attempting to upload their data to the server. As such, these participants were removed from the training data analyses but still included in the rest of the analyses.
Analysis Plan
Data were analysed using both traditional null-hypothesis significance testing (NHST) and the more recently advocated Bayesian methods. Statistics are reported for both methods where they have been applied; however, Bayesian metrics are preferred as they allow us to quantify the strength of evidence in favour of the null and alternative hypotheses in an unbiased manner (Wagenmakers et al., 2018). All main analyses were conducted using JASP software (JASP Team, 2019). Inverse Bayes factors (BF10) expressing the odds of the alternative hypothesis relative to the null are used throughout (Jeffreys, 1961; van Doorn et al., 2019). For the NHST analyses, Holm-corrected p-values with a family-wise alpha of 0.05 are used throughout to adjust for multiple comparisons. Holm-correction is used as a slightly less conservative alternative to the Bonferroni method (see Chen et al., 2017 for more details). First, p-values are ranked from smallest to largest; then, starting with the smallest and continuing in a stepwise fashion, the original values are adjusted according to the total number of comparisons (the greater the number of comparisons the greater the adjustment) and their rank (the lower the rank the smaller the adjustment). Let p′ = adjusted p-value, p = unadjusted p-value, 𝛼′ = adjusted alpha, 𝛼 = family wise alpha, m = number of comparisons, and i = 1,…,m, then for the i-th smallest p-value: p′ = min(1, (m − i + 1) × p) and 𝛼′ = 𝛼 / (m − i + 1).
The procedure stops as soon as the first non-significant p-value is observed; that p-value and all remaining (larger) p-values are declared non-significant.
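For reference, the step-down adjustment described above can be implemented in a few lines (our illustrative code, not the JASP procedure used for the reported analyses):

```javascript
// Illustrative only: Holm step-down adjustment of a set of p-values.
function holmAdjust(pValues) {
  const m = pValues.length;
  const order = pValues.map((_, idx) => idx).sort((a, b) => pValues[a] - pValues[b]);
  const adjusted = new Array(m);
  let running = 0;
  order.forEach((idx, i) => {
    // i is the 0-based rank, so the multiplier (m - i) equals m - rank + 1
    const p = Math.min(1, (m - i) * pValues[idx]);
    running = Math.max(running, p);             // adjusted p-values never decrease with rank
    adjusted[idx] = running;
  });
  return adjusted;                              // compare each value against the family-wise alpha
}

// e.g. holmAdjust([0.04, 0.01, 0.03]) -> [0.06, 0.03, 0.06]
```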
To evaluate transfer effects, we report results from ANCOVA models for each task and each group contrast, wherein post-training performance is the dependent variable, group is the independent variable and pre-training performance is a covariate. We opted for ANCOVAs instead of repeated measures ANOVAs as they are considered more powerful and less biased in randomised studies such as this one (Senn, 2006; van Breukelen, 2006). Moreover, including pre-training performance as a covariate controls for potential aptitude by treatment effects (Karbach et al., 2017). However, it is important to note that this deviates from our pre-registration analysis plan, in which we proposed using repeated measures ANOVAs (osf.io/36ayf).
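In model form (our own notation, rather than JASP output), each pairwise ANCOVA can be written as Post_i = β0 + β1·Pre_i + β2·Group_i + ε_i, where Group_i is a dummy code for the two groups being contrasted and the transfer effect corresponds to the group coefficient β2.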
Results
Task performance was primarily operationalised as accuracy because we instructed participants to be as accurate as possible within the time constraints and made no mention of speed. However, reaction time results are presented in the Supplementary Materials (Tables S3 & S4).
Pre-training Performance
A series of one-way ANOVAs tested for pre-training differences in task performance (see Table 2). We found moderate evidence only for a difference between groups on the Simultaneous-Switching task at pre-training assessment. Post-hoc analyses provided strong evidence that the Control group had lower accuracy than the Delayed Switching Training group on the Simultaneous Switching task at pre-training assessment (t(113)=3.08, d=0.56, p=0.007, BF10=11.15). There was no strong evidence for group differences on the pre-training assessment when comparing the Control and Simultaneous Spikiness Training groups (t(112)=1.69, d=0.32, p=0.184, BF10=0.74) nor when comparing the Simultaneous Spikiness Training and Delayed Switching Training groups (t(117)=1.40, d=0.26, p=0.184, BF10=0.48).
Table 2. Assessment summary statistics for accuracy performance.
| Assessment | Training | Pre M (%) | Pre SD | Post M (%) | Post SD | Diff M (Post-Pre) | Diff SD | df | t | d | BF10 | p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SSP | SSPT | 75.69 | 8.01 | 83.94 | 7.33 | 8.25 | 8.18 | 55 | 7.55 | 1.00 | >100 | <0.001*** |
| SSP | DSWT | 75.77 | 7.10 | 80.46 | 8.58 | 4.69 | 9.64 | 59 | 3.76 | 0.48 | 63.55 | <0.001*** |
| SSP | Control | 74.70 | 7.88 | 75.29 | 8.30 | 0.59 | 8.84 | 52 | 0.48 | 0.06 | 0.16 | 1.000 |
| SN | SSPT | 79.60 | 4.53 | 79.23 | 7.08 | -0.37 | 7.04 | 57 | 0.39 | 0.05 | 0.15 | 0.690 |
| SN | DSWT | 80.07 | 5.12 | 78.12 | 6.71 | -1.95 | 6.86 | 58 | 2.18 | 0.28 | 1.28 | 0.033* |
| SN | Control | 78.18 | 5.45 | 76.80 | 6.62 | -1.38 | 6.21 | 52 | 1.61 | 0.22 | 0.50 | 0.444 |
| SSW | SSPT | 65.83 | 8.20 | 72.64 | 8.34 | 6.81 | 6.75 | 58 | 7.75 | 1.00 | >100 | <0.001*** |
| SSW | DSWT | 68.04 | 8.71 | 73.20 | 8.05 | 5.16 | 8.09 | 59 | 4.94 | 0.63 | >100 | <0.001*** |
| SSW | Control | 63.12 | 8.69 | 68.12 | 8.50 | 5.00 | 9.05 | 54 | 4.09 | 0.55 | >100 | <0.001*** |
| DSP | SSPT | 67.28 | 9.27 | 70.07 | 8.38 | 2.79 | 8.78 | 58 | 2.43 | 0.31 | 2.13 | 0.054 |
| DSP | DSWT | 65.90 | 7.38 | 69.73 | 8.72 | 3.83 | 8.09 | 58 | 3.63 | 0.47 | 43.10 | <0.001*** |
| DSP | Control | 64.27 | 7.01 | 64.77 | 7.91 | 0.50 | 8.01 | 53 | 0.45 | 0.06 | 0.16 | 1.000 |
| DN | SSPT | 73.32 | 7.42 | 75.51 | 6.24 | 2.19 | 7.46 | 57 | 2.23 | 0.29 | 1.41 | 0.058 |
| DN | DSWT | 72.62 | 7.75 | 75.81 | 6.76 | 3.19 | 8.81 | 55 | 2.70 | 0.36 | 3.93 | 0.018* |
| DN | Control | 72.40 | 7.62 | 73.94 | 5.95 | 1.54 | 7.54 | 53 | 1.49 | 0.20 | 0.42 | 0.444 |
| DSW | SSPT | 62.00 | 7.07 | 65.65 | 6.49 | 3.65 | 7.39 | 56 | 3.72 | 0.49 | 55.97 | <0.001*** |
| DSW | DSWT | 60.90 | 6.46 | 68.29 | 7.44 | 7.39 | 6.97 | 57 | 8.07 | 1.05 | >100 | <0.001*** |
| DSW | Control | 59.39 | 8.01 | 62.76 | 6.74 | 3.37 | 8.09 | 55 | 3.11 | 0.41 | 10.44 | 0.015* |
Assessment task abbreviations: Simultaneous Spikiness (SSP); Simultaneous Number (SN); Simultaneous Switching (SSW); Delayed Spikiness (DSP); Delayed Number (DN); Delayed Switching (DSW). Training group abbreviations: Simultaneous Spikiness Training (SSPT); Delayed Switching Training (DSWT).
*p < .05.
**p < .01.
***p < .001 (Holm-corrected).
Training Task Gains
Paired-samples t-tests (one-tailed) were performed for each of the three training groups to establish whether participants made improvements with respect to the average difficulty level achieved between the 4th and 8th block of the first session and the final training session (see Figure 5). All groups made substantial training gains: Simultaneous Spikiness Training (M=1.42, SD=0.97, t(54)=12.87, d=1.73, p<0.001, BF10>100); Delayed Switching Training (M=1.64, SD=1.23, t(51)=9.55, d=1.32, p<0.001, BF10>100); Control (M=0.49, SD=0.47, t(50)=7.41, d=1.03, p<0.001, BF10>100).
Transfer Effects
To investigate whether the groups showed differential transfer patterns, we conducted a series of ANCOVAs testing for group differences in post-training performance whilst covarying for pre-training performance. The full results are shown alongside the corresponding descriptive statistics for each task and each group contrast in Table 3 and Figure 5. The positive evidence for group differences is summarised below. A statistical comparison of these effect sizes is also provided in the Supplementary Materials.
Table 3. Pairwise group ANCOVAs of post-training accuracy adjusted for baseline performance.
| Group Contrast | Task | Post-training accuracy difference (%) | df | F | p | BF10 | ηp² |
|---|---|---|---|---|---|---|---|
| SSPT-Control | SSP | 8.25 | (1,106) | 36.264 | <0.001*** | >100 | 0.254 |
| SSPT-Control | SN | 1.64 | (1,108) | 1.837 | 0.534 | 0.480 | 0.016 |
| SSPT-Control | SSW | 3.01 | (1,111) | 5.122 | 0.075 | 2.027 | 0.044 |
| SSPT-Control | DSP | 3.89 | (1,110) | 7.940 | 0.012* | 7.309 | 0.067 |
| SSPT-Control | DN | 1.26 | (1,109) | 1.418 | 0.472 | 0.383 | 0.012 |
| SSPT-Control | DSW | 1.96 | (1,110) | 2.865 | 0.093 | 0.768 | 0.025 |
| DSWT-Control | SSP | 4.77 | (1,110) | 9.928 | 0.004** | 16.181 | 0.082 |
| DSWT-Control | SN | 0.32 | (1,109) | 0.076 | 0.782 | 0.213 | 0.000 |
| DSWT-Control | SSW | 2.78 | (1,112) | 3.941 | 0.098 | 1.317 | 0.034 |
| DSWT-Control | DSP | 4.07 | (1,110) | 8.459 | 0.012* | 8.835 | 0.071 |
| DSWT-Control | DN | 1.81 | (1,107) | 2.468 | 0.357 | 0.603 | 0.022 |
| DSWT-Control | DSW | 4.87 | (1,111) | 16.436 | <0.001*** | >100 | 0.129 |
| SSPT-DSWT | SSP | 3.51 | (1,113) | 6.250 | 0.013* | 3.205 | 0.052 |
| SSPT-DSWT | SN | 1.34 | (1,114) | 1.237 | 0.536 | 0.336 | 0.010 |
| SSPT-DSWT | SSW | 0.72 | (1,116) | 0.351 | 0.554 | 0.227 | 0.003 |
| SSPT-DSWT | DSP | -0.37 | (1,115) | 0.074 | 0.785 | 0.200 | 0.000 |
| SSPT-DSWT | DN | -0.50 | (1,111) | 0.193 | 0.661 | 0.215 | 0.001 |
| SSPT-DSWT | DSW | -3.15 | (1,112) | 7.286 | 0.016* | 4.682 | 0.061 |
Assessment task abbreviations: Simultaneous Spikiness (SSP); Simultaneous Number (SN); Simultaneous Switching (SSW); Delayed Spikiness (DSP); Delayed Number (DN); Delayed Switching (DSW). Training group abbreviations: Simultaneous Spikiness Training (SSPT); Delayed Switching Training (DSWT).
*p < .05.
**p < .01.
***p < .001 (group-wise Holm-corrected).
Figure 5.
Improvements on the training tasks across training sessions. Session 1 statistics exclude the first 4 blocks to mitigate the bias of starting at level 1. Sessions 2 and 3 statistics include all 8 blocks. ***p < .001
Simultaneous Spikiness Training vs Control
After training, the Simultaneous Spikiness Training group had greater accuracy relative to Controls on the Simultaneous Spikiness and Delayed Spikiness tasks when controlling for pre-training scores.
Delayed Switching Training vs Control
After training, the Delayed Switching Training group had greater accuracy relative to Controls on the Simultaneous Spikiness, Delayed Spikiness, and Delayed Switching tasks when controlling for pre-training scores.
Simultaneous Spikiness Training vs Delayed Switching Training
Each training group made greater on-task gains than the other group did on that task. The Simultaneous Spikiness Training group had greater accuracy relative to the Delayed Switching Training group on the Simultaneous Spikiness task after training, when controlling for pre-training scores. Conversely, the Delayed Switching Training group had greater accuracy relative to the Simultaneous Spikiness Training group on the Delayed Switching task after training, when controlling for pre-training scores.
Transfer to Components of the Switching Tasks
We further investigated whether there was any partial transfer to the switching tasks by analysing performance on the two judgment types within the switching tasks separately. This was primarily to examine whether practice on one judgment would improve performance on these judgments only in a switching context, or whether it would generalise to both judgment types. Summary statistics are provided in Table 4.
Table 4. Pairwise group ANCOVAs of post-training accuracy on the switching tasks by judgment type, adjusted for baseline performance.
| Group Contrast | Task | Judgement | Post-training accuracy difference (%) | df | F | p | BF10 | ηp² |
|---|---|---|---|---|---|---|---|---|
| SSPT-Control | SSW | Spikiness | 3.97 | (1,111) | 6.02 | 0.048* | 3.07 | 0.05 |
| SSPT-Control | SSW | Number | 2.37 | (1,111) | 2.68 | 0.208 | 0.68 | 0.02 |
| SSPT-Control | DSW | Spikiness | 1.08 | (1,110) | 0.60 | 0.440 | 0.27 | 0.00 |
| SSPT-Control | DSW | Number | 3.22 | (1,110) | 3.06 | 0.146 | 0.79 | 0.02 |
| DSWT-Control | SSW | Spikiness | 2.34 | (1,112) | 2.09 | 0.302 | 0.55 | 0.01 |
| DSWT-Control | SSW | Number | 4.00 | (1,112) | 6.33 | 0.039* | 3.79 | 0.05 |
| DSWT-Control | DSW | Spikiness | 3.73 | (1,111) | 8.18 | 0.015* | 8.19 | 0.06 |
| DSWT-Control | DSW | Number | 6.33 | (1,111) | 11.27 | 0.003** | 27.25 | 0.09 |
| SSPT-DSWT | SSW | Spikiness | 1.73 | (1,116) | 1.35 | 0.302 | 0.35 | 0.01 |
| SSPT-DSWT | SSW | Number | -1.10 | (1,116) | 0.56 | 0.543 | 0.25 | 0.00 |
| SSPT-DSWT | DSW | Spikiness | -2.75 | (1,112) | 3.59 | 0.122 | 0.99 | 0.03 |
| SSPT-DSWT | DSW | Number | -3.04 | (1,112) | 3.27 | 0.146 | 0.82 | 0.02 |
Assessment task abbreviations: Simultaneous Switching (SSW); Delayed Switching (DSW). Training group abbreviations: Simultaneous Spikiness Training (SSPT); Delayed Switching Training (DSWT).
*p < .05.
**p < .01.
***p < .001 (group-wise Holm-corrected).
As expected, after training the Delayed Switching Training group had greater accuracy relative to Controls on both the spikiness and enumeration judgments of the Delayed Switching task, when controlling for pre-training scores. In addition, there was some evidence for partial transfer to the Simultaneous Switching task, as the Delayed Switching Training group had greater accuracy relative to Controls on the enumeration judgments after training, when controlling for pre-training scores. Finally, there was some evidence that the Simultaneous Spikiness Training partially transferred to spikiness judgments on the Simultaneous Switching task relative to Controls after training, when controlling for pre-training scores.
Task Relationships and Transfer
We tested whether the pattern of transfer could be predicted on the basis of task relationships. There were three different ways of operationalising task relationships: i) the overall number of shared features, proportional to the total number of features; ii) whether the ‘spikiness’ feature was shared; and iii) task correlations at baseline (see feature coding in Tables S1 and S2, and task correlation structure in Figure S1). We did this because we wanted to look at graded patterns of transfer across tasks, rather than a binary criterion of whether individual tasks show significant transfer or not. The former is likely to be more informative as to the nature of transfer.
For this analysis we needed to calculate each subject’s individual task improvement relative to the mean performance change for the control group, and then calculate how much of the variability in task improvement could be explained by the three ways of operationalising task relationships. In other words, how well can we predict the training gain for a given task by the strength of its relationship with the respective training task?
We first z-scored (across groups) all of the post-training scores within each task, then fit a simple regression model for the Simultaneous Spikiness Training and Delayed Switching Training groups separately, wherein the difference in performance at post-training relative to the control group was the outcome variable, and degree of overlap (operationalised in three ways) was the predictor variable. This gave one beta coefficient for each individual (i.e. how much each individual subject’s pattern of transfer was determined by each of the three ways of calculating task relationship), and thus a distribution of beta coefficients across subjects (from which we derived Bayes factors). We then used 2000 bootstrapped samples to produce p-values (subsequently Holm-corrected). The proportion of shared features was significantly predictive for the Simultaneous Spikiness Training group (mean β=0.17, p=0.010, BF10=5.52) but not the Delayed Switching Training group (mean β=0.04, p=0.215, BF10=0.20). This was repeated, but with the binary predictor of whether the tasks shared the spikiness feature, which was the only feature shared between the training groups. The results show that the spikiness feature was predictive of transfer for the Delayed Switching Training group (mean β=0.29, p<0.001, BF10=228.85) and the Simultaneous Spikiness Training group (mean β=0.12, p=0.042, BF10=1.29). Finally, we repeated this procedure once more but with the pre-training correlation values as predictors. The results showed that the correlations between training and assessment tasks at pre-training were not predictive of transfer across tasks for the Simultaneous Spikiness group (mean β=0.08, p=0.121, BF10=0.50) or the Delayed Switching Training group (mean β=0.00, p=0.458, BF10=0.15).
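As a sketch of one reading of this procedure (illustrative JavaScript with our own variable names; the reported analyses were not produced with this code), each trained subject contributes one beta from a simple regression of their control-adjusted, task-wise z-scores on the relevant relatedness measure:

```javascript
// Illustrative only: one subject's transfer beta, given task-wise post-training z-scores.
function slope(x, y) {                       // least-squares slope for a single predictor
  const n = x.length;
  const mx = x.reduce((a, b) => a + b, 0) / n;
  const my = y.reduce((a, b) => a + b, 0) / n;
  let sxy = 0, sxx = 0;
  for (let i = 0; i < n; i++) {
    sxy += (x[i] - mx) * (y[i] - my);
    sxx += (x[i] - mx) ** 2;
  }
  return sxy / sxx;
}

// subjectZ / controlMeanZ: post-training z-scores (computed within task, across groups);
// overlap: one of the three relatedness measures, indexed by assessment task.
function transferBeta(subjectZ, controlMeanZ, overlap, tasks) {
  const y = tasks.map(t => subjectZ[t] - controlMeanZ[t]); // performance relative to controls
  const x = tasks.map(t => overlap[t]);
  return slope(x, y);  // one beta per subject; betas are then bootstrapped across subjects
}
```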
In a final analysis we examined how the correlation structure of the tasks changed as a result of training. Performance on the tasks generally correlated more after training and was significant for certain task pairs in each training group (see Figure S2). We do not discuss these here, but interested readers can find them in the Supplement.
Discussion
The current study evaluated how practice-dependent transfer is related to shared task features using a tightly controlled randomised design with a relatively large sample and adaptive control group. All of the tasks required same-different judgements on a common set of spikey shapes. We systematically varied the task components required to perform each task, such that they formed a nested hierarchy. Training was then performed on two of the tasks: one was relatively ‘low’ in the hierarchy, requiring just simultaneous judgments of shapes’ spikiness, whereas the other was relatively ‘high’, requiring delayed judgments of shapes’ spikiness or number of spikes in a switching paradigm. Using the full complement of tasks before and after training we could then test whether and how these ‘low’ and ‘high’ training effects cascade through the hierarchy.
Both training groups showed on-task improvements as well as selective transfer to other tasks, relative to active controls. Specifically, Simultaneous-Spikiness training transferred to a delayed-presentation variant but not to tasks requiring an enumeration judgement nor those requiring switching between judgements. The Delayed-Switching training transferred to two tasks requiring spikiness judgements but not to tasks requiring an enumeration judgment nor to the other switching task variant with a simultaneous presentation type. In short, there was evidence of transfer to other tasks requiring the same basic spikiness judgement, but no evidence of transfer of ‘switching’ ability, or transfer to the enumeration judgement.
In a final analysis we directly assessed whether task relationships could predict transfer patterns within the hierarchy. For both training groups, relative to the control, whether or not an assessment task required the spikiness judgement was significantly predictive of the pattern of transfer. For the Simultaneous-Spikiness training group, the overall overlap in features between the training and assessment tasks significantly predicted the pattern of transfer, but not for the Delayed-Switching training group. Pre-training between-task correlations were not predictive of the pattern of transfer for either group.
Multi-component Training Resulted in Broader Transfer
The higher-level delayed switching training transferred to lower-level tasks requiring spikiness judgments, but the reverse was not true: simple spikiness training did not transfer up the hierarchy to either of the tasks requiring switching. These findings suggest that switching was a boundary condition for transfer within our hierarchy. Furthermore, training on both judgments in the switching paradigm did not prevent transfer to the spikiness tasks, despite this training group receiving half as much practice on these judgment types. That the Simultaneous-Spikiness training did not transfer to either of the switching tasks suggests that the additional demands imposed by switching are enough to nullify the spikiness judgement training effects. That is, getting better at one of the constituent tasks does not influence the ease with which participants can switch between the tasks.
As previously discussed in the introduction, the idea of a task-set or cognitive routine may help explain these findings. Simple spikiness training may engender a task-set/cognitive-routine that serves to bias information processing and prevent interference from irrelevant stimulus information (e.g. number of spikes) and/or response-mappings, helping to maintain an improved on-task performance (Rogers & Monsell, 1995; Dreisbach & Wenke, 2011; Gathercole et al., 2019). However, this same task-set/cognitive-routine may get exogenously activated in the context of the switching tasks due to the use of the same stimuli and similar demands across tasks. This could cause some initial negative transfer effects (Rogers & Monsell, 1995; Dreisbach & Wenke, 2011) and may explain why switching was a boundary condition here. Conversely, switching training may engender a more relaxed task-set/cognitive-routine leading to broader transfer effects on novel untrained tasks (Dreisbach & Wenke, 2011; Sabah et al., 2019; Gathercole et al., 2019).
Task Relationships have Mixed Predictive Power for Transfer
Establishing taxonomic relationships amongst cognitive tasks, and thus the overlap between them, is required for making quantitative predictions about the magnitude and distance of transfer, both of which are important for theoretical progress (Reder & Klatzky, 1994; Barnett & Ceci, 2002; Taatgen, 2013; Gathercole et al., 2019). Despite this, few training studies explicitly quantify relationships between training and assessment tasks such that they can be used in predictive models of transfer, though we are aware of some notable exceptions (Singley & Anderson, 1985; Taatgen, 2013; Gathercole et al., 2019). In this study we explored three simple measures of relatedness between each of the trained tasks and the remaining untrained assessment tasks to predict transfer. We found that the proportion of shared features between the training and assessment task was predictive of transfer in the case of the Simultaneous Spikiness group but not the Delayed Switching group, whereas the binary measure of whether the spikiness feature was shared was predictive of transfer for both groups. Pre-training correlations were not predictive of transfer for either of our training groups. In short, transfer in this study was best predicted by the presence of a specific shared feature (spikiness judgement) rather than the more general measures of relatedness (number of shared features or correlations).
The lack of predictive power for pre-training correlations suggests that two tasks may be predictive of one another prior to training, but that an improvement on one will not necessarily transfer to the other. Presumably this is because two tasks may share key cognitive processes, but the cognitive processes recruited likely change or get re-weighted as a function of training (see also Rennie et al., 2019). That is, two tasks may share many of the same cognitive processes both before and after training, but unless the ‘key ingredient’ acquired during training, i.e. the process that is responsible for the improvement, is also applicable to the untrained task, we will not see transfer (Gathercole et al., 2019; Rennie et al., 2019; Taatgen, 2013).
Taken together, these findings suggest that the patterns of transfer in this study do not manifest in neat accordance with any of these simple measures of relatedness. Instead, they seem to further echo the sentiment of prior research that not only is transfer tied to specific features but also to the specific context in which these features arise (Gathercole et al., 2019; Holmes et al., 2019; Norris et al., 2019; Sala & Gobet, 2019; Soveri et al., 2017).
Switching does not Transfer Across Presentation Types
Previous studies have shown that task-switch training consistently transfers to other similarly structured switching tasks (requiring different categorical judgements about objects in pictures), evidenced primarily by reduced reaction time switch costs, thought to reflect reduced interference caused by switching demands (Dorrenbacher et al., 2014; Karbach & Kray, 2009). Despite substantial on-task gains, the Delayed-Switching training failed to transfer to the Simultaneous-Switching paradigm in terms of accuracy or reaction time. Thus, presentation type was a boundary condition for transfer, suggesting that the switching skills acquired during training are tied specifically to the delayed mode of presentation. It is unclear why this might be; however, one key difference between our study and previous ones is the emphasis we placed on accuracy, by instructing participants to be as accurate as possible rather than the more commonplace instruction to be as fast and as accurate as possible. Previous research also failed to find transfer with respect to accuracy (Dorrenbacher et al., 2014; Karbach & Kray, 2009), therefore it is possible that this effect is specific to reaction time, something that was not encouraged by our task-switch training.
We further investigated transfer effects after splitting switching-task performance into its constituent spikiness and enumeration judgements. There was a small amount of evidence that the Delayed-Switching training transferred to the enumeration judgements in the Simultaneous-Switching task. This suggests that whilst the transfer of skills pertaining to the spikiness judgement (in a switching context) was bounded by presentation type, the transfer of skills pertaining to the enumeration judgement was not. In addition, there was anecdotal evidence that Simultaneous-Spikiness training partially transferred to the spikiness judgements in the Simultaneous-Switching task. This suggests a graded pattern of transfer, whereby practice on one perceptual judgement may transfer to trials of the same type in a switching context, but only when the presentation type is consistent (i.e. simultaneous).
Transfer was Constrained by the Type of Perceptual Judgment
The generalisability of training gains appeared to be bounded by judgement type in both groups, as neither showed substantive transfer to tasks involving enumeration judgements. This is most clearly demonstrated by the fact that Simultaneous-Spikiness training transferred to a task (Delayed-Spikiness) comprising an identical judgement but a different presentation, yet did not transfer to a task (Simultaneous-Number) comprising a different judgement but an identical presentation. Moreover, the transfer effect from Simultaneous-Spikiness training to the Delayed-Spikiness task was small to medium, whereas the on-task effect was large. This suggests that the cognitive processes learnt during training were specific to the spikiness judgement and the simultaneous presentation mode in tandem (Ahissar & Hochstein, 2004; Dosher & Lu, 2017).
One possible explanation for this specificity is that participants learn to represent the spikiness feature more precisely, increasing the signal-to-noise ratio of population codes in the visual cortices via the updating of synaptic weights (Dosher & Lu, 2017; Fahle, 2005). This would reduce ambiguity when making spikiness judgements but not enumeration judgements, because the number-of-spikes feature is encoded separately and is unaffected by training. Relatedly, there may be alterations to the attentional or executive processes responsible for orchestrating the parsing of spikiness representations and the subsequent decision-action mappings (Ahissar & Hochstein, 2004; Dosher & Lu, 2017; Taatgen, 2013). Finally, the cognitive routine framework of transfer (Gathercole et al., 2019) emphasises that novelty necessitates the acquisition of new cognitive routines, which engender transfer when they can be applied to similarly structured tasks. It is plausible that the routines used for making rapid enumeration judgements were relatively well established prior to the study and thus left less room for improvement and transfer.
Limitations
The current study had several limitations. It would have benefited from a fuller range of training groups. For example, a group that trained solely on the enumeration judgement would help verify whether this judgement type was truly at ceiling, owing to prior experience or a lack of sensitivity, or whether more training was simply required for improvements to manifest. Similarly, a group that trained on the Delayed-Spikiness task would help determine the extent to which the observed transfer to and from this task was limited by presentation type. Another limitation is that participants received only a very small amount of training (three sessions per group) relative to most other training studies. This decision was made because our pilot data suggested that improvements on these very simple tasks were rapid (in accordance with the power law of practice; Newell & Rosenbloom, 1981), and because we wanted to maximise our sample size. Nonetheless, we cannot know whether transfer would have been more extensive had longer periods of training been given.
Conclusion
In summary, training at different levels within a feature-based taxonomic task hierarchy produced different patterns of transfer. The design allowed us to quantify different types of task overlap. The best predictor of whether transfer would occur was whether the tasks shared a particular feature – the spikiness judgement. For one training group, however, transfer was also graded according to the overall proportion of shared features. Finally, whether task performance was correlated pre-training was not a good predictor of transfer patterns. Collectively, these findings provide a further demonstration of the specificity of transfer, and an experimental exploration of the nature of task overlap that is crucial for the transfer of performance improvements.
Supplementary Material
Figure 6.
Mean accuracies pre- and post-training for each group on each task. Significant group differences are shown at pre-training (from Table 1) and at post-training after controlling for pre-training performance (from Table 3). Error bars show the 95% confidence interval about the mean. *p < .05. **p < .01. ***p < .001 (group-wise Holm-corrected).
Acknowledgments
We would like to give special thanks to Becky Gilbert for her help setting up the experiment online. We would also like to thank Susan Gathercole and Dennis Norris for helpful guidance on the design of the study.
Funding
This work was supported by the Medical Research Council [grant number MC-A0606-5PQ41].
Footnotes
Conflicting interests
The Author(s) declare(s) that there is no conflict of interest.
References
- Anderson JR. Acquisition of cognitive skill. Psychological review. 1982;89(4):369.
- Ahissar M, Hochstein S. The reverse hierarchy theory of visual perceptual learning. Trends in cognitive sciences. 2004;8(10):457–464. doi: 10.1016/j.tics.2004.08.011.
- Barnett SM, Ceci SJ. When and where do we apply what we learn?: A taxonomy for far transfer. Psychological bulletin. 2002;128(4):612. doi: 10.1037/0033-2909.128.4.612.
- Berry AS, Zanto TP, Clapp WC, Hardy JL, Delahunt PB, Mahncke HW, Gazzaley A. The influence of perceptual training on working memory in older adults. PloS one. 2010;5(7). doi: 10.1371/journal.pone.0011537.
- Chein JM, Morrison AB. Expanding the mind’s workspace: Training and transfer effects with a complex working memory span task. Psychonomic bulletin & review. 2010;17(2):193–199. doi: 10.3758/PBR.17.2.193.
- Chen SY, Feng Z, Yi X. A general introduction to adjustment for multiple comparisons. Journal of thoracic disease. 2017;9(6):1725. doi: 10.21037/jtd.2017.05.34.
- Cole MW, Laurent P, Stocco A. Rapid instructed task learning: A new window into the human brain’s unique capacity for flexible cognitive control. Cognitive, Affective, & Behavioral Neuroscience. 2013;13(1):1–22. doi: 10.3758/s13415-012-0125-7.
- Covey TJ, Shucard JL, Shucard DW. Working memory training and perceptual discrimination training impact overlapping and distinct neurocognitive processes: Evidence from event-related potentials and transfer of training gains. Cognition. 2019;182:50–72. doi: 10.1016/j.cognition.2018.08.012.
- Dörrenbächer S, Müller PM, Tröger J, Kray J. Dissociable effects of game elements on motivation and cognition in a task-switching training in middle childhood. Frontiers in psychology. 2014;5:1275. doi: 10.3389/fpsyg.2014.01275.
- van Doorn J, van den Bergh D, Bohm U, Dablander F, Derks K, Draws T, et al. Ly A. The JASP guidelines for conducting and reporting a Bayesian analysis. 2019. doi: 10.3758/s13423-020-01798-5.
- de Leeuw JR. jsPsych: A JavaScript library for creating behavioral experiments in a web browser. Behavior Research Methods. 2015;47(1):1–12. doi: 10.3758/s13428-014-0458-y.
- Dosher BA, Lu ZL. Hebbian reweighting on stable representations in perceptual learning. Learning & Perception. 2009;1(1):37–58. doi: 10.1556/LP.1.2009.1.4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dosher B, Lu ZL. Visual perceptual learning and models. Annual Review of Vision Science. 2017;3:343–363. doi: 10.1146/annurev-vision-102016-061249. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dreisbach G, Wenke D. The shielding function of task sets and its relaxation during task switching. Journal of Experimental Psychology: Learning Memory and Cognition. 2011;37:1540–1546. doi: 10.1037/a0024077. [DOI] [PubMed] [Google Scholar]
- Fahle M. Perceptual learning: specificity versus generalization. Current opinion in neurobiology. 2005;15(2):154–160. doi: 10.1016/j.conb.2005.03.010. [DOI] [PubMed] [Google Scholar]
- Fine I, Jacobs RA. Comparing perceptual learning across tasks: A review. Journal of vision. 2002;2(2):5–5. doi: 10.1167/2.2.5. [DOI] [PubMed] [Google Scholar]
- Gathercole SE, Dunning DL, Holmes J, Norris D. Working memory training involves learning new skills. Journal of Memory and Language. 2019;105:19–42. doi: 10.1016/j.jml.2018.10.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hedge C, Powell G, Sumner P. The reliability paradox: Why robust cognitive tasks do not produce reliable individual differences. Behavior Research Methods. 2018;50(3):1166–1186. doi: 10.3758/s13428-017-0935-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Holmes J, Woolgar F, Hampshire A, Gathercole SE. Are working memory training effects paradigm-specific? Frontiers in psychology. 2019;10:1103. doi: 10.3389/fpsyg.2019.01103. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jeter PE, Dosher BA, Petrov A, Lu ZL. Task precision at transfer determines specificity of perceptual learning. Journal of vision. 2009;9(3):1–1. doi: 10.1167/9.3.1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jeter PE, Dosher BA, Liu SH, Lu ZL. Specificity of perceptual learning increases with increased training. Vision research. 2010;50(19):1928–1940. doi: 10.1016/j.visres.2010.06.016. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Karbach J, Kray J. How useful is executive control training? Age differences in near and far transfer of task-switching training. Developmental science. 2009;12(6):978–990. doi: 10.1111/j.1467-7687.2009.00846.x. [DOI] [PubMed] [Google Scholar]
- Karbach J, Könen T, Spengler M. Who benefits the most? Individual differences in the transfer of executive control training across the lifespan. Journal of Cognitive Enhancement. 2017;1(4):394–405. [Google Scholar]
- Katz B, Shah P, Meyer DE. How to play 20 questions with nature and lose: Reflections on 100 years of brain-training research. Proceedings of the National Academy of Sciences. 2018;115(40):9897–9904. doi: 10.1073/pnas.1617102114. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Klingberg T. Training and plasticity of working memory. Trends in cognitive sciences. 2010;14(7):317–324. doi: 10.1016/j.tics.2010.05.002. [DOI] [PubMed] [Google Scholar]
- Lu ZL, Dosher BA. Mechanisms of perceptual learning. Learning & perception. 2009;1(1):19–36. doi: 10.1556/LP.1.2009.1.3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Maul A, Irribarra DT, Wilson M. On the philosophical foundations of psychological measurement. Measurement. 2016;79:311–320. [Google Scholar]
- Melby-Lervåg M, Redick TS, Hulme C. Working memory training does not improve performance on measures of intelligence or other measures of “far transfer” evidence from a meta-analytic review. Perspectives on Psychological Science. 2016;11(4):512–534. doi: 10.1177/1745691616635612. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Minear ME, Shah P, Park D. Age, task switching, and transfer of training; Poster presented at the Ninth Cognitive Aging Conference; Atlanta, GA. 2002. [Google Scholar]
- Minear M, Shah P. Training and transfer effects in task switching. Memory & cognition. 2008;36(8):1470–1483. doi: 10.3758/MC.36.8.1470.
- Minear M, Brasher F, Guerrero CB, Brasher M, Moore A, Sukeena J. A simultaneous examination of two forms of working memory training: Evidence for near transfer only. Memory & Cognition. 2016;44(7):1014–1037. doi: 10.3758/s13421-016-0616-9.
- Miyake A, Friedman NP, Emerson MJ, Witzki AH, Howerter A, Wager TD. The unity and diversity of executive functions and their contributions to complex “frontal lobe” tasks: A latent variable analysis. Cognitive psychology. 2000;41(1):49–100. doi: 10.1006/cogp.1999.0734.
- Newell A. Unified theories of cognition. Harvard University Press; 1994.
- Newell A, Rosenbloom PS. Mechanisms of skill acquisition and the law of practice. Cognitive skills and their acquisition. 1981;1(1981):1–55.
- Norris DG, Hall J, Gathercole SE. Can short-term memory be trained? Memory & cognition. 2019;47(5):1012–1023. doi: 10.3758/s13421-019-00901-z.
- Parsons B, Magill T, Boucher A, Zhang M, Zogbo K, Bérubé S, et al. Faubert J. Enhancing cognitive function using perceptual-cognitive training. Clinical EEG and neuroscience. 2016;47(1):37–47. doi: 10.1177/1550059414563746.
- Reder L, Klatzky RL. The effect of context on training: Is learning situated? Carnegie Mellon University, School of Computer Science; 1994. Report No. CMU-CS-94-187.
- Redick TS. The hype cycle of working memory training. Current Directions in Psychological Science. 2019;28(5):423–429. doi: 10.1177/0963721419848668.
- Rogers RD, Monsell S. Costs of a predictable switch between simple cognitive tasks. Journal of Experimental Psychology: General. 1995;124:207–231.
- Rennie JP, Zhang M, Hawkins E, Bathelt J, Astle DE. Mapping differential responses to cognitive training using machine learning. Developmental science. 2020;23(4):e12868. doi: 10.1111/desc.12868.
- Sabah K, Dolk T, Meiran N, Dreisbach G. When less is more: Costs and benefits of varied vs. fixed content and structure in short-term task switching training. Psychological research. 2019;83:1531–1542. doi: 10.1007/s00426-018-1006-7.
- Sala G, Gobet F. Cognitive training does not enhance general cognition. Trends in cognitive sciences. 2019;23(1):9–20. doi: 10.1016/j.tics.2018.10.004.
- Schmiedek F, Lövdén M, Lindenberger U. Hundred days of cognitive training enhance broad cognitive abilities in adulthood: Findings from the COGITO study. Frontiers in aging neuroscience. 2010;2:27. doi: 10.3389/fnagi.2010.00027.
- Schweizer K. Investigating the relationship of working memory tasks and fluid intelligence tests by means of the fixed-links model in considering the impurity problem. Intelligence. 2007;35(6):591–604.
- Senn S. Change from baseline and analysis of covariance revisited. Statistics in medicine. 2006;25(24):4334–4344. doi: 10.1002/sim.2682.
- Singley MK, Anderson JR. The transfer of text-editing skill. International Journal of Man-Machine Studies. 1985;22(4):403–423.
- Singley MK, Anderson JR. The transfer of cognitive skill. 9. Harvard University Press; 1989.
- Simons DJ, Boot WR, Charness N, Gathercole SE, Chabris CF, Hambrick DZ, Stine-Morrow EA. Do “brain-training” programs work? Psychological Science in the Public Interest. 2016;17(3):103–186. doi: 10.1177/1529100616661983.
- Soveri A, Antfolk J, Karlsson L, Salo B, Laine M. Working memory training revisited: A multi-level meta-analysis of n-back training studies. Psychonomic bulletin & review. 2017;24(4):1077–1096. doi: 10.3758/s13423-016-1217-0.
- Sprenger AM, Atkins SM, Bolger DJ, Harbison JI, Novick JM, Chrabaszcz JS, et al. Dougherty MR. Training working memory: Limits of transfer. Intelligence. 2013;41(5):638–663.
- Taatgen NA. The nature and transfer of cognitive skills. Psychological review. 2013;120(3):439. doi: 10.1037/a0033138.
- Von Bastian CC, Oberauer K. Effects and mechanisms of working memory training: a review. Psychological research. 2014;78(6):803–820. doi: 10.1007/s00426-013-0524-6.
- Van Breukelen GJ. ANCOVA versus change from baseline had more power in randomized studies and more bias in nonrandomized studies. Journal of clinical epidemiology. 2006;59(9):920–925. doi: 10.1016/j.jclinepi.2006.02.007.
- Van Dam LC, Ernst MO. Mapping shape to visuomotor mapping: learning and generalisation of sensorimotor behaviour based on contextual information. PLoS computational biology. 2015;11(3). doi: 10.1371/journal.pcbi.1004172.
- Wagenmakers EJ, Marsman M, Jamil T, Ly A, Verhagen J, Love J, et al. Matzke D. Bayesian inference for psychology. Part I: Theoretical advantages and practical ramifications. Psychonomic bulletin & review. 2018;25(1):35–57. doi: 10.3758/s13423-017-1343-3.