Abstract
BALB/c (n = 8) and C57BL/6 (n = 11) male mice were trained under an incremental repeated acquisition (IRA) procedure using two distinct training procedures: forward and backward chaining. A new metric for assessing progress on the IRA procedure, progress quotient (PQ), quantified progress as the sum, across chain lengths, of the product of chain length and the number of reinforcers earned at that length, divided by the total number of reinforcers earned during the session. BALB/c mice progressed further, had higher overall responding, earned more reinforcers, and acquired the response sequences faster than the C57BL/6 mice on both training procedures. There were only minimal effects of training procedure for either strain. The strain differences found between BALB/c and C57BL/6 mice confirm the importance of genetic background to behavior. C57BL/6 mice may be deficient in learning as compared with BALB/c mice, but other contributing factors probably include overall responding, motivation, and more rapid satiation or habituation to sucrose reinforcement by the C57BL/6 mice. PQ is a sensitive and valid measure of progress for use in studies of mastery-based incremental repeated acquisition, and BALB/c mice perform this challenging learning task better than do C57BL/6 mice.
Keywords: backward chaining, BALB/c, C57BL/6, forward chaining, incremental repeated acquisition, learning
1. Introduction
By requiring a subject to acquire a new sequence of responses during each experimental session, repeated acquisition (RA) procedures are well suited for studying learning processes using a within-subject experimental design (Boren, 1963; Boren & Devine, 1968; Cohn & Paule, 1995). These procedures are frequently used to examine the effects of drug administration and neurotoxicant exposure on learning (Cohn, Cox, & Cory-Slechta, 1993; Cohn, MacPhail, & Paule, 1996; Thompson & Moerschbaecher, 1978, 1979) and have been successfully adapted to both operant chambers and mazes, in which the acquisition of behaviors such as lever-pressing, nose-poking, door openings, and entering the correct arm of a maze have been examined (Brooks, Cory-Slechta, Murg, & Federoff, 2000; Cohn & Paule, 1995; Peele & Baron, 1988).
The incremental repeated acquisition of behavioral chains procedure (IRA) (Pieper, 1976; Weinberger & Killam, 1978) is a variation of an RA procedure that begins with a short chain (usually a single response) and increments to progressively longer chains through the course of a single session as behavior meets pre-set criteria. The IRA procedure allows for the assessment of drug effects on learning at different levels of task difficulty within a single session (Cohn & Paule, 1995; Weinberger & Killam, 1978). Since IRA procedures adjust difficulty according to an animal’s performance, they can identify the longest chain that a particular subject can master quickly, without making the task so difficult that responding deteriorates due to a lack of reinforcers.
The validity of the IRA procedure as a measure of overall cognitive function is suggested by its high correlation, in children, with scores on IQ tests (Paule, Chelonis, Buffalo, Blake, & Casey, 1999; Paule, Cranmer, Wilkins, Stern, & Hoffman, 1988). RA procedures may also offer a laboratory model of executive function (Shannon & Love, 2004). IRA (and RA) procedures have been used with a wide variety of species including rats (Mayorga, Popke, Fogle, & Paule, 2000; Wright & Paule, 2007; Wright, Popke et al., 2007), mice (Brooks et al., 2000; Sanders, Williams, & Wenger, 2009; Wenger, Schmidt, & Davisson, 2004), mini-pigs (Ferguson, Gopee, Paule, & Howard, 2009), non-human primates (Paule et al., 1988; Paule et al., 1990; Pieper, 1976; Weinberger & Killam, 1978) and humans (Bickel, Higgins, & Hughes, 1991; Higgins et al., 1992; Paule et al., 1999; Paule et al., 1988; Paule, Forrester, Maher, Cranmer, & Allen, 1990). Wenger et al. (2004) and Sanders et al. (2009) trained Ts65Dn mice, a model of Down’s syndrome, and littermate controls on the IRA procedure, finding that both groups could be trained to respond on this operant task when chain length began at one and incremented according to the number of reinforcers earned. Ts65Dn mice were indistinguishable from littermate controls when the required chain length was one or two, but showed a learning deficit when the required chain length was increased to three or four. Thus, a task in which difficulty varies within a session can identify learning deficits.
When using mice as experimental subjects, strain differences become an important consideration. BALB/c and C57BL/6 mice are inbred strains commonly used in behavioral research, and C57BL/6 mice often serve as the background strain for genetic studies (Crawley et al., 1997; Otobe & Makino, 2004; Zarcone, Chen, & Fowler, 2004). These strains display differing patterns of responding on many behavioral tasks. BALB/c mice nose-poke for sucrose reinforcers at a higher rate than C57BL/6 mice (Johnson, Pesek, & Newland, 2009) and produce more licks on a force-sensing disk and lick with a higher force (Wang & Fowler, 1999). BALB/c mice outperform C57BL/6 mice during the acquisition and, especially, reversals of discrimination in appetitively-motivated T-mazes (Moy et al., 2007). BALB/c mice tend to perform more poorly on water mazes (Crawley et al., 2007; Owen, Logue, Rasmussen, & Wehner, 1997; Van Dam, Lenders, & De Deyn, 2006), but this may be an artifact of poor swimming (Chapillon & Debouzie, 2000; Wahlsten, Cooper, & Crabbe, 2005).
Procedural manipulations affect the acquisition of response sequences (Cohn et al., 1996), and these could interact with strain. There are two approaches to increasing chain length in the IRA procedure, forward and backward chaining. In forward chaining, a response chain is acquired by adding new links after (at the end of) previously acquired links. For example, to acquire a sequence of RBLR using forward chaining, the chain would increment as follows: R → sucrose, R-B → sucrose, R-B-L → sucrose, R-B-L-R → sucrose. In backward chaining, a response chain is acquired by adding new links before (in front of) previously acquired links. For example, to acquire the same sequence using backward chaining, the chain would increment as follows: R → sucrose, L-R → sucrose, B-L-R → sucrose, R-B-L-R → sucrose. Backward chaining is the more commonly used approach (Ferguson et al., 2009; Mayorga et al., 2000; Sanders et al., 2009; Wenger et al., 2004; Wright, Popke, et al., 2007), but direct comparisons have not been conducted with laboratory animals. The few studies conducted with humans suggest that forward chaining produces comparable or sometimes higher accuracy than backward chaining (Batra & Batra, 2006; Smith, 1999).
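The two ways of lengthening a chain can be made concrete with a short sketch. The function name `chain_increments` and the string encoding of nose-poke devices are illustrative assumptions, not part of the original procedure or its MED-PC program.

```python
def chain_increments(target, procedure):
    """Return the successive chains trained on the way to `target`.

    `target` is a string of device labels (e.g. "RBLR"); the final
    response in each chain is the one followed by sucrose.
    """
    steps = range(1, len(target) + 1)
    if procedure == "forward":
        # New links are added after (at the end of) acquired links.
        return [target[:k] for k in steps]
    if procedure == "backward":
        # New links are added before (in front of) acquired links.
        return [target[-k:] for k in steps]
    raise ValueError("procedure must be 'forward' or 'backward'")


print(chain_increments("RBLR", "forward"))   # ['R', 'RB', 'RBL', 'RBLR']
print(chain_increments("RBLR", "backward"))  # ['R', 'LR', 'BLR', 'RBLR']
```

Both procedures end on the same four-link chain; they differ only in which responses are practiced at the intermediate chain lengths.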
The present study was designed to assess differences between two behaviorally divergent inbred mouse strains (BALB/c and C57BL/6), and to examine the efficacy of two training procedures (backward and forward chaining) on an IRA procedure. A mastery-based IRA procedure was used in which a new link was added only after a shorter chain was mastered. When a mastery-based criterion is used to increase the chain length, a simple measure such as accuracy is problematic since the conditions under which it is calculated will not be consistent across sessions. This issue is described in more detail in the discussion. Because of difficulties in the interpretation of traditional dependent measures when using a mastery-based criterion, a global measure of the quality of learning that we call the progress quotient (PQ) was used. PQ produces more meaningful comparisons by integrating multiple measures to produce one more revealing gauge of progress. Under certain conditions, PQ has been shown to detect improvements in learning by low doses of d-amphetamine and decrements at high doses that were not detected by traditional dependent measures (i.e. accuracy) (Bailey, Johnson, & Newland, 2010).
2. Materials and methods
2.1. Subjects
Eight BALB/c and 11 C57BL/6 experimentally naïve male mice obtained from Harlan Laboratories (Indianapolis, IN) were housed individually in a temperature- and humidity-controlled AAALAC-accredited colony room with 12-h light/dark cycles and free access to water in their home cages. The mice were approximately 12 weeks old at the start of the study. Weight was maintained at approximately 25 grams, or 80–85% of free-feeding weight, as determined by the vendor’s growth charts. A portion of the animals’ caloric intake was available during experimental sessions via 20 mg sucrose pellets. All experiments were approved by the Auburn University Animal Care and Use Committee.
2.2. Apparatus
Eleven model #ENV-007 rat operant chambers (Med Assoc., St. Albans, VT) fitted to accommodate mice were situated inside sound-attenuating ventilated shells. Each chamber contained two front nose-poke devices, designated left (L) and right (R), and one back (B) nose-poke. Mice activated the nose-poke devices by interrupting a photobeam. A pellet dispenser delivered 20 mg sucrose pellets into a food tray situated between the front two nose-pokes. Each chamber was equipped with Sonalert tones (2900 and 4500 Hz) for presentation of auditory stimuli, which served as discriminative stimuli. A house light located at the top of the back wall illuminated each chamber when errors were made. MED-PC (Med Associates, Georgia, VT) was used to program the experiment and collect data with 0.01 s resolution.
2.3. Training and behavioral procedure
Nose-poking was autoshaped in 4-hour sessions. First, sucrose was delivered under a fixed-time 5.5 min schedule (non-contingent sucrose delivered every 330 s); a tone sounded, and an LED light above the nose-poke device and the LED within the nose-poke recess were lit, during the last 30 s of this interval. Sucrose was also available following a nose-poke under a fixed-ratio 1 schedule running concurrently. After 10 reinforced nose-poke responses, the fixed-time schedule terminated and only a fixed-ratio 1 schedule remained in operation, during which the LED within the nose-poke remained on for the duration of the session. This was conducted on each nose-poke device until at least 40 reinforced nose-pokes occurred on the available device for two sessions. Only one device was active at a time, and the other devices were inactive and darkened. After autoshaping, all experimental sessions were conducted between 8:00 AM and 1:00 PM five days a week, and sessions ended after 1 hr or after 50 reinforcers were obtained in the four-link chain. All animals were run in the same experimental chambers each day.
Originally, six BALB/c mice and five C57BL/6 mice were randomly assigned to a backward chaining group, and five BALB/c mice and six C57BL/6 mice were randomly assigned to a forward chaining group. Three BALB/c mice died unexpectedly over the course of the experiment and are excluded from all analyses, leaving five BALB/c mice in the backward chaining group and three BALB/c mice in the forward chaining group.
The mice acquired the performance chain first and then the learning chains were introduced. For both groups, the session began with a one-link chain, i.e. a single response was reinforced. For the backward chaining group, a second link was added to the front of the chain, to form a two-link chain, after six consecutive correct one-link chains were produced. A third link was added to the front of the two-link chain after three consecutive correct two-link chains, and a fourth link was added after three consecutive correct three-link chains. For the forward chaining group, each new link was added after previous links. As with backward chaining, a second, third, and fourth link were added after six, three, and three consecutive correct chains, respectively. A stricter criterion was used to advance out of the one-link chain, as compared to that used to transition out of longer chains, because of the ease of performing a single correct response and the higher likelihood of performing three correct responses serendipitously.
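The mastery criterion described above can be sketched as a small state machine. This is a simplified illustration, not the MED-PC program used in the study: each element of `outcomes` stands for one attempted chain (True when the whole chain was completed and reinforced), the 1 hr time limit is not modeled, and the names `CRITERIA`, `MAX_LENGTH`, and `run_session` are hypothetical.

```python
# Consecutive correct chains required to advance out of each chain length.
CRITERIA = {1: 6, 2: 3, 3: 3}
MAX_LENGTH = 4
MAX_REINFORCERS_AT_FOUR = 50  # session ends after 50 four-link reinforcers


def run_session(outcomes):
    """Return (final chain length, reinforcers earned at each length)."""
    length, streak = 1, 0
    reinforcers = {1: 0, 2: 0, 3: 0, 4: 0}
    for correct in outcomes:
        if not correct:
            streak = 0  # an error breaks the run of consecutive correct chains
            continue
        reinforcers[length] += 1
        streak += 1
        if length < MAX_LENGTH and streak >= CRITERIA[length]:
            length += 1  # criterion met: add a link
            streak = 0
        elif reinforcers[MAX_LENGTH] >= MAX_REINFORCERS_AT_FOUR:
            break  # reinforcer-limited end of session
    return length, reinforcers


# An error-free session earns 6, 3, 3, and 50 reinforcers at lengths 1 to 4.
print(run_session([True] * 100))
```

Because the streak counter resets on every error, an animal can earn many reinforcers at a short chain length without ever advancing, which is why the number of reinforcers per chain length is free to vary.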
Sessions began with all three nose-poke device LED lights illuminated within the nose-poke recesses. A response on the correct device was followed by a 0.8 s tone, the delivery of one 20 mg sucrose pellet, and the darkening of all lights for an inter-trial interval (ITI) of 5 s. An incorrect response resulted in the illumination of the house-light and the deactivation of the nose-poke devices (thus turning off the nose-poke LED lights) which lasted 5 s (i.e. a time-out period). The current chain reset to the beginning of the sequence after all errors. To prevent adventitious reinforcement and discourage intertrial responding, a nose-poke during the last second of the ITI or time-out delayed the start of the next trial by 2 s. Links one (closest to reinforcement) to four were paired with a distinct auditory stimulus: low-tone, low-pulsing tone, high-pulsing tone, or high-tone respectively. The low-pulsing tone and high-pulsing tones followed a square-wave pulse of 0.4 s or 0.2 s frequency, respectively. Note that backward- and forward-trained animals encountered these stimuli in a different order.
Testing for both strains was done simultaneously. Initially, animals were required to perform the same four-response sequence (RBLR) during each performance session. After 20 sessions, the learning chains were introduced on alternate days. The 12 learning chains used for this experiment were chosen based on the following criteria: no sequence could include consecutive responses on the same nose-poke device (e.g. RBLL), no sequence could require more than one response in the same position as the performance chain, and each sequence required a response on each of the three nose-poke devices. A learning session required the animals to acquire one of 12 four-response sequences: LBRL, LRBL, LRBR, LRLB, LBRB, RLBL, RLRB, BLRL, BRBL, BLBR, BRLB, BLRB. These sequences were presented semi-randomly (without replacement) in such a way as to balance the initial nose-poke and thus prevent a position bias from developing. Learning and performance sessions were identical with the exception that a discriminative stimulus signaled whether a learning (flashing nose-poke LEDs) or performance (steady nose-poke LEDs) session was in effect.
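The three selection criteria can be checked by exhaustive enumeration. The sketch below is illustrative only (the helper name `is_valid_learning_chain` is hypothetical); it applies the criteria as stated in the text to every four-response sequence over the three devices.

```python
from itertools import product

PERFORMANCE_CHAIN = "RBLR"
DEVICES = "LRB"


def is_valid_learning_chain(seq, performance=PERFORMANCE_CHAIN):
    # Criterion 1: no consecutive responses on the same device.
    no_repeats = all(a != b for a, b in zip(seq, seq[1:]))
    # Criterion 2: at most one response in the same position as the
    # performance chain.
    overlap = sum(a == b for a, b in zip(seq, performance))
    # Criterion 3: every device is used at least once.
    uses_all = set(seq) == set(DEVICES)
    return no_repeats and overlap <= 1 and uses_all


valid = sorted(
    "".join(s)
    for s in product(DEVICES, repeat=4)
    if is_valid_learning_chain("".join(s))
)
print(len(valid))  # 12
print(valid)
```

Under these criteria the enumeration yields exactly the 12 sequences listed above, so the learning chains exhaust the set of admissible sequences rather than being a sample from it.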
2.4. Data analysis
The dependent measures of interest in the present study included a progress quotient (PQ), overall accuracy [(total number of correct responses/total number of responses) × 100], accuracy for one-, two-, three-, and four-link chains, total responses per session, total reinforcers, and maximum chain length reached. As our primary measure of progress during the IRA, PQ is a weighted sum of the number of reinforcers delivered. Each reinforcer is weighted by the chain length, which is the number of successive correct responses required for reinforcement. This weighted sum was divided by the total number of reinforcers delivered:
$$PQ = \frac{\sum_{i=1}^{4} i \cdot Rf_i}{Rf_{tot}} \qquad \text{(Equation 1)}$$
where $Rf_i$ = number of reinforcers earned on a chain of length $i$ and $Rf_{tot}$ = total number of reinforcers earned throughout the session. In the present experiment, PQ could vary from 1.0 (the animal received all of its reinforcers in the one-link chain) to 3.56 (the animal had perfect performance, i.e. six reinforcers in the one-link chain, three reinforcers in the two-link chain, three reinforcers in the three-link chain, and 50 reinforcers in the four-link chain, giving (6 + 6 + 9 + 200)/62 = 3.56).
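As a check on the stated range, PQ can be computed directly from the per-length reinforcer counts. This is a minimal sketch of the definition given above; the function name `progress_quotient` and the choice to raise an error when no reinforcers were earned are assumptions, since the text does not say how such sessions were handled.

```python
def progress_quotient(reinforcers_by_length):
    """PQ = sum(i * Rf_i) / Rf_tot for a dict {chain length: reinforcers}."""
    total = sum(reinforcers_by_length.values())
    if total == 0:
        # Assumption: PQ is treated as undefined for a session with no
        # reinforcers; the source does not specify this case.
        raise ValueError("PQ is undefined when no reinforcers were earned")
    weighted = sum(i * rf for i, rf in reinforcers_by_length.items())
    return weighted / total


perfect = {1: 6, 2: 3, 3: 3, 4: 50}   # error-free session
one_link_only = {1: 20}               # never advanced past one link

print(round(progress_quotient(perfect), 2))        # 3.56
print(round(progress_quotient(one_link_only), 2))  # 1.0
```

The upper bound falls short of 4.0 because even a perfect session must earn 12 reinforcers on the shorter chains before reaching the four-link chain.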
The initial 20 performance sessions before the implementation of learning chains are deemed “acquisition of the performance chain” and analyses on these sessions were conducted independently of the other performance sessions. The last five performance sessions (36, 38, 40, 42, and 44) and the last five learning sessions (35, 37, 39, 41, and 43) were used for all other analyses.
Two-way repeated-measures analyses of variance (RMANOVA) were used to examine the effects of session, strain, and training procedure. The Huynh-Feldt correction was used when necessary (i.e. epsilon < 0.75). All error bars represent the standard error of the mean and p < 0.05 was the criterion for statistical significance. Only statistically significant interactions are reported. Statistical analyses were conducted using SYSTAT 11 (San Jose, CA).
3. Results
3.1. Comparison of PQ and accuracy (percent correct) as an overall index
Figure 1 shows overall accuracy (top) and PQ (bottom) on the performance chain (left) and learning chains (right) across all sessions. The longest chain length reached for selected sessions is also shown on these graphs. To simplify the presentation of the comparison between these two measures, only the backward chaining groups are reported in this section (BALB/c n = 5; C57BL/6 n = 5). Results for the forward chaining groups were essentially the same as those shown in Figure 1.
For the acquisition of the performance chain (sessions 1–20), overall accuracy increased as a function of session (F(19, 152) = 5.29, p < 0.001) and BALB/c mice displayed higher overall accuracy than C57BL/6 mice (F(1, 8) = 7.17, p = 0.028). There was a strain X session interaction (F(1, 19) = 2.02, p = 0.023) such that accuracy for BALB/c mice was higher than that for C57BL/6 mice as a function of session. PQ values also increased as a function of session (F(19, 152) = 10.651, p < 0.001), and BALB/c mice had higher PQ values than C57BL/6 mice (F(1, 8) = 16.12, p = 0.004). There was a strain X session interaction (F(1, 19) = 2.49, p = 0.012) such that PQ values for BALB/c mice were higher than those for C57BL/6 mice as a function of session. The F-values for PQ were always larger than those for accuracy, indicating that the ratio of variance due to session or strain to error variance was greater for PQ than for accuracy.
Most BALB/c mice acquired a two-link chain during session one and reached three- or four-link chains by session five. This is indicated by the average chain length value inserted in the overall accuracy and PQ figures (left, top and bottom) corresponding to sessions 1, 5, 10, 15 and 20. This improvement is clearly visible in the PQ measure but not in accuracy, especially during the initial 15 sessions. By session 20, most of the C57BL/6 mice had reached three- or four-link chains, but they received very few or no reinforcers on these longer chain lengths (data not shown here). This inferior performance by the C57BL/6 mice is captured by the low PQ values for these animals. Note also that PQ, but not accuracy, revealed an effect of a 12-day winter break between sessions 11 and 12 and, for the C57BL/6 mice only, an effect of the introduction of the learning chains at session 21. None of these effects were adequately captured by overall accuracy.
The large strain differences on the performance chain remained intact throughout the course of the study. On the last five performance sessions, BALB/c mice displayed higher overall accuracy (F(1, 8) = 39.11, p < 0.001) and PQ values (F(1, 8) = 98.66, p < 0.001) than C57BL/6 mice. There was a main effect of session for both overall accuracy (F(4, 32) = 3.73, p = 0.013) and PQ values (F(4, 32) = 3.97, p = 0.018).
On the last five learning sessions, BALB/c mice displayed higher overall accuracy (F(1, 8) = 13.37, p = 0.006) and PQ values (F(1, 8) = 16.46, p = 0.004) than C57BL/6 mice. There was a main effect of session for both overall accuracy (F(4, 32) = 11.53, p < 0.001) and PQ values (F(4, 32) = 7.36, p = 0.001). There was a strain X session interaction (F(4, 32) = 2.76, p = 0.044), such that accuracy for BALB/c mice was higher than that for C57BL/6 mice as a function of session.
3.2. Effect of training procedure
Figure 2 shows PQ values for the performance chain (top) and learning chains (bottom) for BALB/c mice (left) and C57BL/6 mice (right) when trained using backward or forward chaining procedures. For BALB/c mice, the backward chaining procedure appeared to facilitate acquisition of the performance chain (Figure 2, top left), but this difference did not meet traditional levels of significance (F(1, 6) = 5.32, p = 0.061). For C57BL/6 mice, there was no effect of training procedure on PQ values during acquisition of the performance chain. When a two-way RMANOVA was conducted with training procedure and session as independent variables (collapsed across strain), there was a main effect of training procedure such that the backward chaining procedure supported higher PQ values than the forward chaining procedure (F(1, 17) = 4.94, p = 0.040). For both the last five days of performance and learning, there was no statistical difference between the PQ values of the forward and backward chaining groups for either BALB/c or C57BL/6 mice.
3.3. Total responses
Figure 3 shows total responses per session for the performance chain (top) and learning chains (bottom) for BALB/c mice (left) and C57BL/6 mice (right) when trained using backward or forward chaining procedures. There were no significant differences in total responses per session between backward and forward chaining groups for either strain during the acquisition of the performance chain (all p-values > 0.5), the last five performance sessions (all p-values > 0.1) or the last five learning sessions (all p-values > 0.3). Training procedures were combined to examine strain X session interactions. During acquisition of the performance chain, BALB/c mice responded more than C57BL/6 mice (F(1, 17) = 23.69, p < 0.001). For both strains, total responses increased as a function of session (F(19, 323) = 13.18, p < 0.001) and this increase was greater for BALB/c mice than C57BL/6 mice, as revealed in a session X strain interaction (F(19, 323) = 2.67, p = 0.006).
3.4. Other dependent measures
Table 1 shows strain and training-procedure differences on several dependent measures during the last five performance sessions (left) and the last five learning sessions (right). Overall, there was no effect of training procedure on any measure during the last five days of performance or learning. This was also true for the acquisition of the performance chain. The only exception is that C57BL/6 mice in the forward chaining procedure reached slightly higher overall accuracy than C57BL/6 mice in the backward chaining procedure during the last five learning chains (F(1, 9) = 5.17, p = 0.049).
Table 1.
| Strain group | Performance: Chain length | Performance: PQ | Performance: Accuracy | Performance: Responses | Performance: Rfs | Learning: Chain length | Learning: PQ | Learning: Accuracy | Learning: Responses | Learning: Rfs |
|---|---|---|---|---|---|---|---|---|---|---|
| BALB/c backward | 3.96 (.04) | 3.14 (.07) | .73 (.03) | 390 (36) | 76 (1.4) | 3.60 (.13) | 2.09 (.17) | .48 (.03) | 414 (57) | 54 (8.1) |
| BALB/c forward | 3.93 (.07) | 3.03 (.17) | .68 (.07) | 480 (27) | 77 (5.8) | 3.27 (.24) | 2.04 (.11) | .47 (.03) | 480 (8.5) | 58 (2.3) |
| C57BL/6 backward | 3.12 (.14) | 2.05 (.09) | .49 (.03) | 213 (17) | 32 (1.9) | 2.24 (.12) | 1.36 (.07) | .36 (.02) | 208 (16) | 30 (2.8) |
| C57BL/6 forward | 2.87 (.30) | 1.83 (.18) | .51 (.03) | 187 (32) | 28 (3.8) | 2.13 (.10) | 1.29 (.09) | .42 (.09) | 165 (38) | 29 (4.9) |
The values represent the average (SEM) across all animals for the last five sessions. BALB/c backward training n = 5; BALB/c forward training n = 3; C57BL/6 backward training n = 5; C57BL/6 forward training n = 6; Rfs = total reinforcers earned.
Because there was no consistent effect of training procedure, backward and forward chaining groups were combined for further analyses of strain differences. There were consistent and large strain differences on all measures (Table 1). During the last five performance sessions, BALB/c mice reached longer chain lengths, higher PQ values (F(1, 17) = 72.26, p < 0.001), higher overall accuracy (F(1, 17) = 37.96, p < 0.001), responded more (F(1, 17) = 48.35, p < 0.001), and received more reinforcers (Rfs) (F(1, 17) = 214.95, p < 0.001) than C57BL/6 mice. These differences also occurred during the learning chains: BALB/c mice reached longer chain lengths, higher PQ values (F(1, 17) = 44.49, p < 0.001), higher overall accuracy (F(1, 17) = 12.01, p = 0.003), responded more (F(1, 17) = 40.27, p < 0.001), and received more reinforcers (F(1, 17) = 23.56, p < 0.001) than C57BL/6 mice. These large and consistent strain differences were also apparent during the acquisition of the performance chain (data not shown).
3.5. Accuracy within each chain length
Figure 4 shows accuracy across different chain lengths for BALB/c mice (left) and C57BL/6 mice (right) when trained using backward and forward chaining procedures during the last five performance sessions (top) and the last five learning sessions (bottom). Accuracy is an appropriate measure here because chain length is held constant. Clearly, C57BL/6 mice showed a significant impairment as chain length increased.
During the last five performance sessions, for BALB/c mice there was no effect of training procedure, but there was an effect of chain length (F(3, 18) = 3.92, p = 0.026). However, there was not a systematic decrease in accuracy as a function of chain length. Post-hoc tests revealed that accuracy in the two-link chain was slightly lower than accuracy in the one- (p = 0.013) and four-link chains (p = 0.029) and accuracy in the three-link chain was lower than accuracy in the four-link chain (p = 0.018). For C57BL/6 mice, there was no effect of training procedure, but there was an effect of chain length (F(3, 27) = 54.11, p < 0.001), such that accuracy decreased as a function of chain length. This systematic decrease was confirmed with post-hoc tests showing that accuracy in each successive chain length was lower than the accuracy in shorter chains (all p-values < 0.030). Training procedure did not interact with any measure when analyzed as a function of chain length for either strain.
For the learning chains, for BALB/c mice there was no effect of training procedure, but they did show an impairment with longer chain lengths (F(3, 18) = 6.59, p = 0.003). Post-hoc tests revealed that this effect appeared because accuracy in the four-link chain was significantly lower than accuracy in all of the other chain lengths (all p-values < 0.043). For C57BL/6 mice, there was no effect of training procedure, but there was a large effect of chain length (F(3, 27) = 185.79, p < 0.001), such that accuracy decreased as a function of chain length. This systematic decrease was confirmed with post-hoc tests showing that accuracy in each successive chain length was lower than the accuracy in shorter chains (all p-values < 0.037). Training procedure did not interact with any measure when analyzed as a function of chain length for either strain.
4. Discussion
The performance of BALB/c and C57BL/6 mice was examined on an IRA procedure in which backward or forward chaining was used to increase chain length. Three main conclusions arose from this investigation. First, maximum chain length reached and overall accuracy are problematic as global measures of progress in a mastery-based implementation of the IRA procedure. A weighted count of reinforcers delivered, normalized by total reinforcers earned (PQ), is a sensitive and valid global index of progress. Second, BALB/c mice perform much better than C57BL/6 mice on both the performance and learning chains, and these differences persist even with extended training. Third, the role of training procedure is relatively minor.
4.1. Mastery-based criterion used to increment chain length
In an RA procedure, a behavioral chain with a set length is acquired within a session but the responses comprising that chain vary from session to session or sometimes within a session. With an IRA procedure, the chain length begins at one and increments throughout the course of a session. With these procedures, criteria for incrementing the chain must be imposed, and studies differ regarding these criteria, which are often based on the number of correct chains completed.
In one implementation with mice, chain length increased after a preset number of nonconsecutive reinforcers were delivered and error correction was not permitted (i.e. an incorrect response resulted in a time-out period and reset the sequence to the beginning) (Sanders et al., 2009; Wenger et al., 2004). In an implementation with rats, chain length increased after 20 nonconsecutive correct chains and errors resulted in a time-out period, but did not reset the sequence (i.e. error correction was allowed) (Wright, Pearson, Hammond, & Paule, 2007). In the present implementation, an animal had to complete six (for the first, one-link chain) or three (for subsequent two- and three-link chains) consecutive successful chains before the chain would increment. Error correction was not permitted. The requirement that reinforcers be consecutive ensured that a chain was well-mastered. These specific criteria were selected after preliminary studies with rats and mice revealed that a longer criterion took a long time to attain within a single session and a shorter criterion did not result in good performance on the longer chains.
The incrementing procedure used here, and in a related study with rats (Bailey et al., 2010), produced longer chains than reported in other studies of incremental repeated acquisition procedures using rats, where a chain length of two or three is common (Wright, Popke et al., 2007; Wright, Pearson et al., 2007). Seven BALB/c mice (87.5%) reached a four-link performance chain within 16 sessions and six BALB/c mice (75%) reached a four-link learning chain within seven sessions. In fact, this strain displayed evidence of a ceiling effect (see below), as did rats in Bailey et al. (2010). In studies of a hybrid mouse strain (Ts65Dn littermate controls), four-link learning chains (Wenger et al., 2004; Sanders et al., 2009) and performance chain lengths (Sanders et al., 2009) were eventually reached. The training procedure used here was built on the one used in those studies, differing mainly in the criteria for incrementing the chain length.
Several possible reasons for the rapid acquisition and high level of performance seen here can be offered. First was the requirement that a chain occur correctly three (or, for the one-link chain, six) times in a row before the next link was added, a strict mastery criterion that ensured a high level of performance before a chain was lengthened. Another possibility was that the performance chain was well-established before learning chains were introduced. Thus, the acquisition of relevant behavior, including reliable nose-poking and the execution of sequences, as well as the extinction of irrelevant behavior occurred while acquiring an unchanging chain. Other possibilities, which do not necessarily distinguish this study from others, may be the use of separate, 1 hr sessions for learning and performance chains, the frequent reinforcement of correct chains and the use of specific auditory (rather than visual) stimuli with each separate chain length.
4.2. Session accuracy and maximum chain length are problematic measures here
Using a mastery-based criterion renders many global dependent measures problematic since the chain length acquired varies across sessions and subjects. Overall session accuracy, for example, implicitly assumes homogeneity in the task over the session, but this consistency is not guaranteed. For example, one animal may perform with relatively high accuracy but never have the chain advance beyond one or two links, or it may respond at such a low rate that a short chain is in place throughout the session. It would be inappropriate to claim that such performance is equivalent to that of an animal that performs with similar accuracy while completing 50 four-link chains.
One way to circumvent the limitations of overall accuracy is to examine accuracy as a function of chain length (Figure 4). This was useful here in making simple comparisons across strains, but becomes cumbersome when examining the effects of drugs (Bailey et al., 2010) and toxicants where multiple doses are examined. For example, examining drug effects on accuracy for different chain lengths would require separate analyses for each chain length and each dose. Moreover, an additional set of interpretational difficulties is produced when maximum chain length reached varies across different conditions or subjects.
Maximum chain length reached is also a problematic global measure of progress or performance in the mastery-based approach used here. An animal receiving only one or two reinforcers in a four-link chain would be quite different from an animal that acquired 50 reinforcers (the maximum) in the four-link chain, although the maximum chain length reached is the same for both animals.
4.3. PQ
PQ is the normalized sum of the products of reinforcers obtained and criterion chain lengths. Thus, it counts all responses that are part of correctly completed sequences and weights long chains more heavily than short ones. It can be viewed as the area under the curve produced by plotting criterion sequences against chain length. A measure used by Paule and McMillan (1986) and Wright, Popke, et al. (2007), percent task completed, accomplishes a similar goal, but requires that there be a pre-defined number of trials or reinforcers so that a percentage can be calculated. A strict mastery-based IRA procedure, such as the one implemented in the present study, renders percent task completed inappropriate because the number of reinforcers earned within a specific chain length is free to vary. Therefore, the PQ equation normalizes the score by the number of reinforcers actually obtained so it remains valid even if the number of trials or reinforcers that occur varies.
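The verbal definition above can be made concrete with a minimal sketch. It assumes that PQ is the sum over chain lengths of (chain length × reinforcers earned at that length), divided by the total number of reinforcers earned in the session. The function name `progress_quotient`, the dictionary input, the return value of 0.0 for a session with no reinforcers, and the reinforcer counts for the two animals are illustrative assumptions, not values or conventions reported in this study.

```python
def progress_quotient(reinforcers_by_length):
    """Return PQ for one session.

    reinforcers_by_length maps chain length (1, 2, 3, ...) to the number
    of reinforcers earned while that chain length was in effect.
    Assumed form: PQ = sum(length * reinforcers) / sum(reinforcers).
    """
    total = sum(reinforcers_by_length.values())
    if total == 0:
        # Assumption for illustration: no reinforcers means no progress.
        return 0.0
    weighted = sum(length * count
                   for length, count in reinforcers_by_length.items())
    return weighted / total


# Two hypothetical animals that both reach a maximum chain length of four.
animal_a = {1: 6, 2: 3, 3: 3, 4: 2}    # only 2 reinforcers on the four-link chain
animal_b = {1: 6, 2: 3, 3: 3, 4: 50}   # 50 reinforcers on the four-link chain

print(round(progress_quotient(animal_a), 3))  # 2.071
print(round(progress_quotient(animal_b), 3))  # 3.565
```

In this sketch the two animals share the same maximum chain length but have clearly different PQ values, because PQ weights the reinforcers earned on long chains more heavily than those earned on short ones.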
An example of the distinction between PQ and overall accuracy is clearly visible during sessions 1–15 of acquisition of the performance chain. For these initial 15 sessions, there was no difference in overall accuracy between BALB/c and C57BL/6 mice. PQ, however, captured large differences between these two strains, differences also reflected in maximum chain length reached. The difference in PQ values reveals that BALB/c mice were performing more correct sequences in longer chains than C57BL/6 mice. Similarly, PQ often revealed strain differences during learning sessions that were not detected using overall accuracy.
Some disadvantages of PQ can be noted (see also Bailey et al., 2010). PQ does not directly incorporate errors. We experimented with more complex measures that did include errors and found none to be as satisfactory (not reported). It can be noted that errors enter the calculation indirectly since they decrease the number of reinforcers that can be obtained during a time-limited or reinforcer-limited session. Also, PQ is a novel measure so it is difficult to form comparisons across studies. However, by determining a maximum PQ it is possible to compare actual performance against idealized performance.
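As a rough illustration of comparing actual against idealized performance, the sketch below computes a maximum PQ under stated assumptions: errorless responding, exactly the criterion number of reinforcers at each shorter chain (six on the one-link chain and three each on the two- and three-link chains, per the mastery criterion described above), and 50 reinforcers on the four-link chain. The function names, the way the session is assumed to end, and the observed counts are hypothetical and are not taken from the reported data.

```python
def pq_from_counts(counts):
    """PQ for a list of reinforcer counts, where counts[0] is the
    one-link chain, counts[1] the two-link chain, and so on."""
    total = sum(counts)
    if total == 0:
        return 0.0  # assumption: no reinforcers means no progress
    weighted = sum(length * n for length, n in enumerate(counts, start=1))
    return weighted / total


def ideal_pq(criteria, final_reinforcers):
    """Maximum PQ for an errorless session (assumed idealization).

    criteria: reinforcers needed to advance past each shorter chain.
    final_reinforcers: reinforcers earned on the longest chain.
    """
    return pq_from_counts(list(criteria) + [final_reinforcers])


max_pq = ideal_pq([6, 3, 3], 50)
print(round(max_pq, 3))  # 3.565

# Hypothetical observed session: extra reinforcers on short chains
# (errors reset progress) and few on the four-link chain.
observed = pq_from_counts([6, 3, 3, 2])
print(round(observed / max_pq, 2))  # 0.58, the fraction of the ideal PQ
```

Expressing an observed PQ as a fraction of this idealized value is one possible way to place sessions on a common scale; other normalizations are equally conceivable.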
4.4. Backward and forward chaining procedures
The present experiment revealed that the difference between backward and forward chaining procedures was relatively minor. In some cases there appeared to be differences, based on visual inspection, but these were not statistically significant. For example, it appears from Figure 2 that BALB/c mice under the backward chaining procedure acquired the performance chain more rapidly than the forward chaining group, but the p-value associated with this comparison was 0.06. It is possible that the sample sizes used, as well as variability in behavior, are responsible for the inability to demonstrate statistical significance. Support for this hypothesis is provided by the main effect of training procedure seen when running a two-way RMANOVA on training procedure and session (collapsed across strain) during acquisition of the performance sequence. Thus, there may still be value in examining conditions under which forward or backward chaining procedures support better performance on the IRA.
4.5. Strain differences
Mice are being used more frequently to understand genetic contributions to behavior, and specifically to learning. The present experiment and those of Wenger et al. (2004) and Sanders et al. (2009) offer support for the use of mice in complex operant learning tasks such as the IRA. Both BALB/c and C57BL/6 mice could be trained to respond under the IRA procedure. Each strain autoshaped quickly and all mice from each strain were eventually able to produce a performance chain and acquire learning chains to some degree. For both strains, the performance chain supported a higher PQ and accuracy than the learning chains.
There were, however, notable strain differences found between BALB/c and C57BL/6 mice on this challenging learning task, confirming the importance of genetic background to behavior. On every measure the performance of the BALB/c mice was superior to that of the C57BL/6 mice. These differences became apparent as early as the first two sessions. The BALB/c mice not only acquired longer chains faster, as indicated by the maximum chain length reached, but received increasingly more reinforcers on these longer chain lengths, as indicated by the increasing PQ values. Even after extensive practice (approximately 60 performance sessions), the PQ values of the C57BL/6 mice did not improve over those values seen in Figure 1, suggesting a deficit in learning compared to BALB/c mice. These strain differences also appeared on the learning chains where BALB/c mice progressed further than the C57BL/6 mice.
BALB/c mice also recovered from a disruption more rapidly than C57BL/6 mice. The performance chain was disrupted by the 12-day winter break for both strains, as revealed by PQ values, but the BALB/c mice recovered within two to four sessions, while C57BL/6 mice required more sessions. PQ values on the performance chain were disrupted by the introduction of the learning chains at session 21 for the C57BL/6 mice but not for the BALB/c mice.
When accuracy on the performance chain was examined as a function of chain length (Figure 4), the strain differences became especially striking. The BALB/c mice performed the one-link and four-link chains with equally high accuracy. In contrast, the C57BL/6 mice, even after extensive training, showed a length-dependent decline in accuracy. The finding that there was no systematic effect of chain length in the BALB/c mice suggests an unanticipated ceiling effect with our implementation of the IRA. In an ongoing study using a six-link IRA procedure, BALB/c mice do show a length-dependent decline in accuracy that is only revealed in the five- and six-link chains. Similarly for the learning chains, the C57BL/6 mice showed much greater sensitivity to chain length than the BALB/c mice. It is interesting to note that while the accuracy of the C57BL/6 mice in the one-link chain resembled that of the BALB/c mice, the strain differences became increasingly evident as the chain length (i.e., difficulty) increased.
It is unclear why the BALB/c mice were so superior to the C57BL/6 mice on the IRA task, but the data here point to genetic predispositions that may have interfered with learning by the C57BL/6 mice. BALB/c mice responded more during the session and received more sucrose reinforcers than C57BL/6 mice. C57BL/6 mice consistently received approximately 30 reinforcers per session over the course of the entire experiment, even as PQ scores and total responding increased, while BALB/c mice continued to respond vigorously throughout the course of the session, receiving nearly 80 reinforcers during performance sessions. It was not unusual for C57BL/6 mice to leave sucrose pellets uneaten at the end of the session (unreported observations), while this was never seen with the BALB/c mice.
These findings suggest that C57BL/6 mice may satiate more quickly to sucrose reinforcers and motivation may decrease throughout a session. Alternatively, C57BL/6 mice may habituate to the sucrose reinforcer more quickly than BALB/c mice, a hypothesis that is quite distinct from the satiation hypothesis (McSweeney & Murphy, 2009). Another possibility is suggested by observations that C57BL/6 mice are more active in open fields (Crawley et al., 1997). This competing behavior (i.e. increased activity directed away from sucrose-reinforced responding) displayed by the C57BL/6 mice may have interfered with acquisition. These factors could interact to produce more reinforced responding for the BALB/c mice and, accordingly, greater progress through the chain.
Factors like satiation or habituation may contribute to the differences in learning, but they seem unlikely to provide a complete account since the C57BL/6 mice never achieved the performance levels seen in the BALB/c mice, even after extensive experience and reinforcement. Thus, C57BL/6 mice may have some diminished capacity to acquire response sequences on an IRA procedure as compared with BALB/c mice. Overall, these findings point to the importance of equating reinforcer potency and efficacy and/or motivational levels in order to separate strain differences in motivation from other learning differences, as argued by Crawley et al. (1997) and Brooks et al. (2000).
4.6. Conclusion
The present investigation revealed that a weighted measure, called PQ, served as a sensitive global measure of progress on the IRA. It also showed that incremental repeated acquisition of behavioral chains could be established in BALB/c and C57BL/6 mice, and that the BALB/c mice progressed further, displayed higher total responses, obtained more reinforcers, and produced much longer chains than C57BL/6 mice. Finally, there were only minimal effects of training procedure on the IRA.
Acknowledgments
This work was supported by an Auburn University Undergraduate Research Fellowship awarded to Jennifer M. Johnson and NIH grant ES003299.
References
- Bailey JM, Johnson JE, Newland MC. Mechanisms and performance measures in mastery-based incremental repeated acquisition: Behavioral and pharmacological analyses. Psychopharmacology. 2010. doi: 10.1007/s00213-010-1801-3.
- Batra M, Batra V. Comparison between forward chaining and backward chaining techniques in children with mental retardation. The Indian Journal of Occupational Therapy. 2006;37(3):57–63.
- Bickel WK, Higgins ST, Hughes JR. The effects of diazepam and triazolam on repeated acquisition and performance of response sequences with an observing response. Journal of the Experimental Analysis of Behavior. 1991;56:217–237. doi: 10.1901/jeab.1991.56-217.
- Boren JJ. Repeated acquisition of new behavioral chains. American Psychologist. 1963;17:421.
- Boren JJ, Devine DD. The repeated acquisition of behavioral chains. Journal of the Experimental Analysis of Behavior. 1968;11(6):651–660. doi: 10.1901/jeab.1968.11-651.
- Brooks AI, Cory-Slechta DA, Murg SL, Federoff HJ. Repeated acquisition and performance chamber for mice: a paradigm for assessment of spatial learning and memory. Neurobiology of Learning and Memory. 2000;74:241–258. doi: 10.1006/nlme.1999.3951.
- Chapillon P, Debouzie A. BALB/c mice are not so bad in the Morris water maze. Behavioural Brain Research. 2000;117(1–2):115–118. doi: 10.1016/s0166-4328(00)00292-8.
- Cohn J, Cox C, Cory-Slechta DA. The effects of lead exposure on learning in a multiple repeated acquisition and performance schedule. Neurotoxicology. 1993;14(2–3):329–346.
- Cohn J, MacPhail RC, Paule MG. Repeated acquisition and the assessment of centrally acting compounds. Cognitive Brain Research. 1996;3:183–191. doi: 10.1016/0926-6410(96)00005-5.
- Cohn J, Paule MG. Repeated acquisition of response sequences: The analysis of behavior in transition. Neuroscience and Biobehavioral Reviews. 1995;19(3):397–406. doi: 10.1016/0149-7634(94)00067-b.
- Crawley JN, Belknap JK, Collins A, Crabbe JC, Frankel W, Henderson N, et al. Behavioral phenotypes of inbred mouse strains: implications and recommendations for molecular studies. Psychopharmacology. 1997;132:107–124. doi: 10.1007/s002130050327.
- Ferguson SA, Gopee NV, Paule MG, Howard PC. Female mini-pig performance of temporal response differentiation, incremental repeated acquisition, and progressive ratio operant tasks. Behavioural Processes. 2009;80:28–34. doi: 10.1016/j.beproc.2008.08.006.
- Higgins ST, Rush CR, Hughes JR, Bickel WK, Lynn M, Capeless MA. Effects of cocaine and alcohol, alone and in combination, on human learning and performance. Journal of the Experimental Analysis of Behavior. 1992;58:87–105. doi: 10.1901/jeab.1992.58-87.
- Johnson JE, Pesek EF, Newland MC. High-rate operant behavior in two mouse strains: A response bout-analysis. Behavioural Processes. 2009;81:309–315. doi: 10.1016/j.beproc.2009.02.013.
- Mayorga AJ, Popke EJ, Fogle CM, Paule MG. Similar effects of amphetamine and methylphenidate on the performance of complex operant tasks in rats. Behavioural Brain Research. 2000;109:59–68. doi: 10.1016/s0166-4328(99)00165-5.
- McSweeney FK, Murphy ES. Sensitization and habituation regulate reinforcer effectiveness. Neurobiology of Learning and Memory. 2009;92(2):189–198. doi: 10.1016/j.nlm.2008.07.002.
- Moy SS, Nadler JJ, Young NB, Perez A, Holloway LP, Barbaro RP, Barbaro JR, Wilson LM, Threadgill DW, Lauder JM, Magnuson TR, Crawley JN. Mouse behavioral tasks relevant to autism: Phenotypes of 10 inbred strains. Behavioural Brain Research. 2007;176:4–20. doi: 10.1016/j.bbr.2006.07.030.
- Otobe T, Makino J. Impulsive choice in inbred strains of mice. Behavioural Processes. 2004;67:19–26. doi: 10.1016/j.beproc.2004.02.001.
- Owen EH, Logue SF, Rasmussen DL, Wehner JM. Assessment of learning by the Morris water task and fear conditioning in inbred mouse strains and F1 hybrids: implications of genetic background for single gene mutations and quantitative trait loci analyses. Neuroscience. 1997;80(4):1087–1099. doi: 10.1016/s0306-4522(97)00165-6.
- Paule MG, Chelonis JJ, Buffalo EA, Blake DJ, Casey PH. Operant test battery performance in children: Correlation with IQ. Neurotoxicology and Teratology. 1999;21(3):223–230. doi: 10.1016/s0892-0362(98)00045-2.
- Paule MG, Cranmer JM, Wilkins JD, Stern HP, Hoffman EL. Quantification of complex brain function in children: Preliminary evaluation using a nonhuman primate behavioral test battery. Neurotoxicology. 1988;9(3):367–378.
- Paule MG, Forrester TM, Maher MA, Cranmer JM, Allen RR. Monkey versus human performance in the NCTR operant test battery. Neurotoxicology and Teratology. 1990;12:503–507. doi: 10.1016/0892-0362(90)90014-4.
- Paule MG, McMillan DE. Effects of Trimethyltin on incremental repeated acquisition (learning) in the rat. Neurobehavioral Toxicology and Teratology. 1986;8:245–253.
- Peele DB, Baron SP. Effects of scopolamine on repeated acquisition of radial-arm maze performance by rats. Journal of the Experimental Analysis of Behavior. 1988;49:275–290. doi: 10.1901/jeab.1988.49-275.
- Pieper WA. Great apes and rhesus monkeys as subjects for pharmacological studies of stimulants and depressants. Federation Proceedings. 1976;35(11):2254–2257.
- Sanders NC, Williams DK, Wenger GR. Does the learning deficit observed under an incremental repeated acquisition schedule of reinforcement in Ts65Dn mice, a model for Down syndrome, change as they age. Behavioural Brain Research. 2009;203:137–142. doi: 10.1016/j.bbr.2009.04.031.
- Shannon HE, Love PL. Within-session repeated acquisition behavior in rats as a potential model of executive function. European Journal of Pharmacology. 2004;498:125–134. doi: 10.1016/j.ejphar.2004.04.054.
- Smith GJ. Teaching a long sequence of behavior using whole task training, forward chaining, and backward chaining. Perceptual and Motor Skills. 1999;89:951–965. doi: 10.2466/pms.1999.89.3.951.
- Thompson DM, Moerschbaecher JM. Operant methodology in the study of learning. Environmental Health Perspectives. 1978;26:77–87. doi: 10.1289/ehp.782677.
- Thompson DM, Moerschbaecher JM. An experimental analysis of the effects of d-amphetamine and cocaine on the acquisition and performance of response chains in monkeys. Journal of the Experimental Analysis of Behavior. 1979;32:433–444. doi: 10.1901/jeab.1979.32-433.
- Van Dam D, Lenders G, De Deyn PP. Effect of Morris water maze diameter on visual-spatial learning in different mouse strains. Neurobiology of Learning and Memory. 2006;85(2):164–172. doi: 10.1016/j.nlm.2005.09.006.
- Wahlsten D, Cooper SF, Crabbe JC. Different rankings on inbred mouse strains on the Morris maze and a refined 4-arm water escape task. Behavioural Brain Research. 2005;165:36–51. doi: 10.1016/j.bbr.2005.06.047.
- Wang G, Fowler SC. Effects of haloperidol and clozapine on tongue dynamics during licking in CD-1, BALB/c and C57BL/6 mice. Psychopharmacology. 1999;147:38–45. doi: 10.1007/s002130051140.
- Weinberger SB, Killam EK. Alterations in learning performance in the seizure-prone baboon: effects of elicited seizures and chronic treatment with diazepam and Phenobarbital. Epilepsia. 1978;19:301–316. doi: 10.1111/j.1528-1157.1978.tb04493.x.
- Wenger GR, Schmidt C, Davisson MT. Operant conditioning in the Ts65Dn mouse: Learning. Behavior Genetics. 2004;34(1):105–119. doi: 10.1023/B:BEGE.0000009480.79586.ee.
- Wright LKM, Paule MG. Response sequence difficulty in an incremental repeated acquisition (learning) procedure. Behavioural Processes. 2007;75:81–84. doi: 10.1016/j.beproc.2007.01.007.
- Wright LKM, Pearson EC, Hammond TG, Paule MG. Behavioral effects associated with chronic ketamine or remacemide exposure in rats. Neurotoxicology and Teratology. 2007;29:348–359. doi: 10.1016/j.ntt.2006.12.004.
- Wright LKM, Popke EJ, Allen RR, Pearson EC, Hammond TG, Paule MG. Effects of chronic MK-801 and/or phenytoin on the acquisition of complex behaviors in rats. Neurotoxicology and Teratology. 2007;29:476–491. doi: 10.1016/j.ntt.2007.02.001.
- Zarcone TJ, Chen R, Fowler SC. Differential acquisition of food-reinforced disk pressing by CD-1, BALB/cJ and C57BL/6J mice. Behavioural Brain Research. 2004;152:1–9. doi: 10.1016/j.bbr.2003.09.010.