Abstract
The effects of two types of mands on participants’ adherence to instructions were examined across two groups using procedures based on Hackenberg and Joker (Journal of the Experimental Analysis of Behavior 62:367–383, 1994). Participants were presented with instructions describing a pattern of responding for producing points later exchanged for money and were exposed to choice trials in which a progressive-time (PT) and a fixed-time (FT) schedule were concurrently available. The instructions initially described how to optimize point production; however, the PT schedule was manipulated over the course of the experiment such that response patterns maximizing point production differed across conditions. All participants experienced the same experimental arrangement, and the two groups differed only in the form of the mand contained in the instructions presented to them. The instructions for the directive group contained the mand “you must…” (i.e., command) preceding the instructed response pattern, whereas the non-directive group instructions contained the mand “you might consider…” (i.e., suggestion) preceding the instructed response pattern. Results indicated that instruction type influenced response patterns across changing contingencies. The directive group exhibited greater adherence to the instruction than the non-directive group when instruction following was less profitable. Results are interpreted in terms of Skinner’s analysis of verbal behavior, and implications for practical application are discussed.
Keywords: Rule-governed behavior, Mand, Verbal behavior, Instructional control, Schedule control, Progressive-time schedules, Fixed-time schedules, Adult humans
Verbal antecedents, or rules, influence nearly all aspects of the human experience. Rules have been defined as verbal “contingency-specifying stimuli” (Skinner 1969) that set the occasion for discriminated responding known as rule-governed behavior or rule following (Glenn 1987). Rule following can be highly adaptive, potentially leading to efficient behavior for temporally proximate reinforcers in our everyday lives and promoting behavior that may lead to more remote reinforcers. For example, personal rules may govern our morning routines so that we arrive at work on time, or rules about daily calorie consumption may lead to better health later in life.
Examining the interaction of antecedent verbal stimuli (e.g., rules, instructions, commands) with contingencies of reinforcement has potential for wide applicability to many important human behaviors. For instance, adherence to rules is especially important when it is undesirable for an individual to directly contact potentially harmful contingencies (e.g., “look both ways before crossing the road”). Alternatively, reducing control by maladaptive or inefficient rules may be beneficial, as in some forms of psychotherapy (Hayes 1993). Behavioral scientists have examined the variables associated with rule governance with much success (e.g., Catania et al. 1982; Doll et al. 2009; Drake and Wilson 2008; Hackenberg and Joker 1994; Hayes et al. 1986a, b; Galizio 1979; Milgram 1963; Raia et al. 2000; Schlinger and Blakely 1987; Schmitt 1998; Vaughan 1985). For example, Galizio (1979) examined whether a large discrepancy between instructions and actual contingencies of reinforcement would influence instruction following. In the study, researchers instructed participants to pull a lever on a certain schedule to avoid a monetary loss. When the instructions provided were accurate, participants consistently followed the instructions and avoided the loss of monetary reinforcers. When rules were inaccurate and described a pattern of responses that was inefficient, but still resulted in the functional consequence of avoiding monetary loss (i.e., pulling the lever more rapidly than required), rule following persisted. However, when inaccurate instructions resulted in the participants contacting the monetary loss, participant responding deviated from the instructions in favor of response patterns minimizing loss.
In another study, Hackenberg and Joker (1994) demonstrated how rules influence responding across changing contingencies of reinforcement. In a human operant paradigm, participants responded in a concurrent operant arrangement (selecting blue or red boxes on a computer screen using keyboard presses) to earn points exchangeable for money. Responses to the blue box initiated a progressive-time (PT) schedule of point delivery, while responses to the red box initiated a fixed-time (FT 60) schedule. In the PT schedule, the first response produced a point immediately and each subsequent response produced a point at a progressively increased delay based on the scheduled step size. For the PT 4 schedule, the first response to the blue box produced a point after 0 s, the second response after 4 s, the third after 8 s, the fourth after 12 s, and so on. Additionally, responses to the red box reset the PT schedule on the blue box to a 0-s delay. In the experiment, instructions were provided to participants that described a specific response pattern (“The way to earn the most points is to select the blue flashing box, then select the solid blue box four consecutive times, then select the red box”; p. 370). Initially, responding according to the stated instructions resulted in the maximum amount of programmed reinforcers. Across the experiment, the PT schedule ascended from 4 s (step sizes varied across participants) and eventually descended. As the PT schedule deviated from 4 s, the participant could earn the most points by departing from the instructed pattern and matching the reinforcement schedule. The results from the study showed that participants initially engaged in the instructed response pattern, which persisted despite changes in the programmed schedule. Responding eventually shifted to match the programmed reinforcement, thereby maximizing reinforcement when the schedule supported a pattern of responding substantially disparate from the instructed pattern. The rapid acquisition of the initial response pattern and the delayed transition to optimal responding as the schedule changed are indicative of instructional control.
In addition to manipulating reinforcement contingencies, some studies have varied the instruction provided to participants (e.g., Hayes et al. 1986a, b). In a translational investigation by Bicard and Neef (2002), the authors examined the effects of two different types of instructions on the academic responding of children with ADHD. Participants received both tactical instructions, in which a specific pattern of behavior was described as optimal for obtaining reinforcement, and strategic instructions, in which a general pattern of behavior was described to assist the participants in determining the optimal pattern for obtaining reinforcement. The results showed that greater sensitivity to changes in contingencies occurred when participants received strategic instructions, indicating that the structure of rules can influence the extent to which reinforcement schedules exert control on behavior. However, a great deal remains unknown regarding how various characteristics (e.g., accuracy, structure, context) of the complex stimuli of rules function to modify behavior.
Given that instructions and other forms of rules are verbal stimuli, an analysis of the verbal operants may prove useful. Skinner (1969) suggested that rules can function as tacts or mands. Tacts are controlled by non-verbal stimuli and reinforced by generalized conditioned reinforcement from a listener. In the case of rules, the speaker may state the contingency for the listener’s behavior (e.g., “Tow away zone. Vehicles will be towed at owner’s expense.”). Mands are largely controlled by the speaker’s motivating operations and specify the reinforcing consequences for the speaker. In the case of rules, the speaker specifies the listener’s behavior (or its absence) that will be reinforced (e.g., “No parking.”).
Skinner (1957) described mands as particularly expressive verbal operants, with wide variation in kind and dynamic properties. In describing the various types of mands, Skinner loosely grouped them into two kinds based on (1) the initial probability of the specified (manded) behavior of the listener and the contingencies added by the speaker to increase it (command, request, prayer, bribe) and (2) the contingencies for the listener’s behavior independent of the speaker (advice, permission, warning, offer, call). Given that mands do not necessarily benefit the listener, manded behavior may not be highly probable. As a result, mands that invoke additional contingencies mediated by the speaker may increase listener compliance. A common means of accomplishing this is via commands, which include implicit or explicit threats for not engaging in manded behavior (e.g., “Stop the car, or else!”). The classic obedience studies conducted by Milgram (1963) clearly demonstrated the ability of commands to evoke listener behavior typically suppressed by the verbal community. Milgram asked 40 male participants to deliver electric shock at increasingly higher voltages to confederate learners following errors on a learning task. Confederates never actually experienced shock, but behaved in ways that suggested they were experiencing pain, particularly at higher shock values. The experimenter encouraged participants to continue delivering shock if they expressed concern or hesitation by indicating, “Please go on,” but as participants continued to express reluctance, the experimenter used different statements including “You have no other choice, you must go on.” Despite confederates displaying pain and participants showing signs of tension (e.g., seizures, profuse sweating, trembling), a majority of participants delivered the maximum shock voltage. 
The variables that influenced compliance in Milgram’s studies may be related to the constellation of variables described by Skinner as the dynamic properties of mands (e.g., intonation, loudness, relative strength or weakness of the mand, ability of the speaker to impose aversive consequences, and the authority or prestige of the speaker). That is, “if the listener is not already predisposed to act, the probability of his mediating a reinforcement may depend upon the effectiveness of the aversive stimulation supplied by the speaker” (Skinner 1957, p. 42).
Behavioral investigations of instructional control have also used mands; however, their effects have not been explicitly evaluated. Shimoff et al. (1981), for example, compared schedule sensitivity of low-rate responding established via either shaping or instructions. The instructions included the phrase “you must” when describing the initial pattern of responding (“To make the RED LIGHTS come on, you must press the BLACK BUTTON. You must press slowly; pressing too rapidly will not work.”), potentially functioning as a command. They found that, unlike shaped behavior, instructed behavior remained consistent with experimenters’ instructions despite contacting changes in the programmed contingencies. Although the insensitivity only occurred with instructed behavior, the degree to which the mand “you must” contributed to this result is unknown.
Contingencies of reinforcement and punishment for the manded behavior of the listener may also be independent of the speaker. When a listener is apt to respond in a given way, but appropriate discriminative stimuli are lacking, mands can serve to evoke these behaviors. Skinner (1957) termed mands for which the listener’s behavior would contact positive reinforcement as advice (e.g., “Turn here; it’s a shortcut”). A related mand that signals the absence or removal of aversive consequences for listener behavior is permission (e.g., “No parking, except on weekends”).
The loose grouping of mands noted by Skinner (1957) is consistent with terms later developed by Hayes and colleagues (Hayes 1989; Zettle and Hayes 1982) to describe rules and rule-governed behavior. Zettle and Hayes differentiated rule following based on the operating social and natural contingencies. For example, following the rule, “drive the posted speed limit” might be classified functionally according to whether the driver complies to avoid a citation (social consequence) or to reduce the risk of injury during an accident (natural consequence). If rule following is controlled by consequences mediated by the individual or agency that provided the rule, the behavior is called pliance. Thus, pliance refers to the behavior of the listener in response to a mand in which the speaker provides either reinforcement for compliance or an aversive consequence for non-compliance. If the behavior is controlled by consequences contacted naturally in the environment (e.g., obeying the speed limit reduces the risk of injury in an accident), the behavior is called tracking. During tracking, the listener’s behavior contacts natural reinforcement as a result of compliance with a mand, which Skinner called advice. Thus, tracking could be conceptualized as the listener’s behavior in response to advice when following advice has been reinforced in the past.
Despite advances in research on both verbal behavior and rule governance, a thorough account of how verbal operants influence rule following has yet to be established. The present study sought to evaluate how different mands affect instructional control by employing procedures similar to those of Hackenberg and Joker (1994). We presented instructions that incorporated either the words “must” or “might consider” to alter the function of the antecedent verbal stimulus. These two forms of instruction could potentially capitalize on preexisting histories of reinforcement for rule following. The directive instruction “must” could function as a command, signaling social contingencies for rule adherence and establishing participant pliance; whereas the non-directive instruction “might consider” could function as advice, signaling natural contingencies (or permission, signaling the absence of social contingencies) for rule adherence and establishing participant tracking. Although no attempt was made to program participants’ idiosyncratic reinforcement histories (and this would likely have been a difficult undertaking given the age and learning histories of a college student population; see Baron et al. 1991; Branch 1991), we speculated that all had similarly experienced both directive and non-directive instructions (and the differential consequences associated with rule following). We predicted that participants given the directive instruction (you must…) would persist longer in rule following in the presence of diminishing returns, relative to participants given the non-directive instruction (you might consider…).
Method
Participants
Six undergraduate students enrolled in introductory-level courses in applied behavior analysis at a large Midwestern university participated in the present study. Participants included one male and five females, 18 to 21 years of age (M = 19.86). Monetary incentives were offered in exchange for participation. Participants earned $1.50 per experimental session plus $0.04 per point earned during the study, which they were told when they were recruited for participation and again at the start of the study. Participants did not receive payment until the conclusion of the study. Consistent with the University’s Human Subject Committee guidelines, all participants received the same payment regardless of the amount they earned during experimental sessions. All participants received a payment equivalent to the maximum amount earned by any one participant, which totaled $54.16.
Apparatus
Experimental sessions took place in a small research room measuring 2.2 m by 2.0 m by 2.4 m. The room contained a computer desk, chair, and a wide-aspect touch screen monitor measuring 48.3 cm wide by 26.7 cm high. A computer keyboard was also present for participants to type responses to a verbal prompt at the end of each block. A mirrored glass window on one wall separated the research room from an observation room of the same dimensions. The touch screen monitor was used to present stimuli and served as the primary manipulandum. Responses (touches) registered by the monitor produced brief visual feedback in the form of a small expanding circle at the point of contact. The interface of the computer program consisted primarily of two colored squares—one red and one blue—measuring 8 cm by 8 cm centered on the screen and separated by 6.5 cm. A 23 cm by 7 cm white rectangle displayed above the two colored squares contained instructions printed in black text. Finally, a small white rectangle measuring 3 cm by 1 cm located in the lower left corner of the screen displayed the word “Score:” and the number of points earned by a participant during each block.
Procedure
Prior to beginning the experiment, the experimenter obtained participants’ informed consent and demographic information. Sessions consisted of four 15-min blocks of choice trials. At the beginning of each block, the monitor displayed the instructions and one green square measuring 8 cm by 8 cm centered on the screen, which read, “Press to Begin.” The experimenter asked the participants to read the instructions aloud before beginning the first block of each session. Before commencing subsequent blocks in a session, the experimenter informed the participants that the same instructions still applied. The instructions were present for the duration of each block, but participants were not required to read them aloud prior to beginning the latter three blocks. After participants read the instructions, the experimenter told them to begin by pressing the green box on the monitor and exited the room. Pressing the green box initiated the procedure, indicated by the presentation of the colored squares.
The program presented a series of trials in which two schedules of delayed reinforcement were concurrently available. During each choice trial, the blue and red squares were displayed on the monitor. Selection of the blue square initiated a PT schedule and selection of the red square initiated an FT schedule. The position (i.e., left or right) of the squares on the screen was randomized across choice trials. Participants selected a schedule at each choice trial by pressing the corresponding colored square on the touch screen. The selected square remained on the screen during the time delay until the computer program delivered a point. During the delay, the computer program presented text that read, “Please Wait…” There were no programmed consequences for any response made during the delay. Point delivery was signaled by a brief sound (a two-toned chime) played over the monitor’s speakers and the addition of one point to the score displayed in the lower left corner of the screen. A 3-s intertrial interval was imposed between point delivery and the presentation of the next choice trial. During the intertrial interval, the computer program presented text that read, “Loading…”
Responses to the blue square resulted in a PT delay to point delivery in which delays progressively increased for each subsequent selection of that schedule. The increase in delay was determined by the PT step size in effect. Three step sizes were used in this experiment: 4, 12, and 20 s. Each participant experienced the same three PT schedules first in ascending and then descending order (see Table 1). During all blocks, the initial delay of the PT schedule was 0 s (i.e., a point was delivered immediately). Whenever the delay on the PT schedule was 0 s, the blue square flashed (i.e., alternated between light and dark blue on screen). Subsequent selections of the PT schedule resulted in delays that increased arithmetically. For example, during the PT 4 schedule, the first selection resulted in the immediate delivery of a point, the second in a 4-s delay to point delivery, the third in an 8-s delay, then 12-s, 16-s, and so on. Similarly, during the PT 12 schedule, consecutive selections of the blue square resulted in point delays of 0, 12, 24, 36, 48 s, and so on. Responses to the red square resulted in an FT 60 delay to point delivery throughout the entire experiment. Additionally, selecting the FT schedule reset the PT schedule such that the point delay was again 0 s. The point at which a participant switched from choosing the PT to the FT schedule was considered the switch point. That is, the switch point was the number of consecutive PT schedule selections before the FT schedule was chosen.
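The schedule logic described above can be illustrated with a minimal sketch. The class and method names below are hypothetical and are not taken from the experimental software; the sketch models only the delay arithmetic, the reset of the PT schedule after an FT selection, and the recording of switch points.

```python
class ConcurrentPTFT:
    """Hypothetical model of the concurrent PT/FT arrangement (delays in seconds)."""

    def __init__(self, pt_step, ft_delay=60):
        self.pt_step = pt_step      # PT step size: 4, 12, or 20 s in this study
        self.ft_delay = ft_delay    # FT 60 throughout the experiment
        self.pt_count = 0           # consecutive PT selections since the last reset
        self.switch_points = []     # one entry per FT selection

    def choose_pt(self):
        """Select the blue square; return the delay to point delivery."""
        delay = self.pt_step * self.pt_count  # 0 s on the first selection
        self.pt_count += 1
        return delay

    def choose_ft(self):
        """Select the red square; record the switch point and reset the PT delay."""
        self.switch_points.append(self.pt_count)
        self.pt_count = 0
        return self.ft_delay


# The instructed pattern under PT 4: five PT selections, then one FT selection.
schedule = ConcurrentPTFT(pt_step=4)
delays = [schedule.choose_pt() for _ in range(5)]
delays.append(schedule.choose_ft())
print(delays)                  # [0, 4, 8, 12, 16, 60]
print(schedule.switch_points)  # [5]
```

Under this sketch, the instructed pattern yields a switch point of 5 at every step size; only the delays incurred along the way change.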
Table 1. Number of blocks required to reach stability per condition
| Schedule | Directive D1 | Directive D3 | Directive D5 | Non-directive N4 | Non-directive N6 | Non-directive N7 |
|---|---|---|---|---|---|---|
| PT 4 | 3 | 3 | 3 | 10 | 6 | 6 |
| PT 12 | 3 | 3 | 8 | 10 | 3 | 3 |
| PT 20 | 3 | 3 | 4 | 5 | 8 | 4 |
| PT 12 | 3 | 10 | 4 | 4 | 3 | 3 |
| PT 4 | 6 | 3 | 3 | 3 | 3 | 5 |
Blocks were programmed to end following the first point delivery after 15 min had elapsed. At the termination of each 15-min block, the stimuli on the screen were replaced with a verbal prompt that read as follows: “Thank you for participating! Please record your thoughts on how to earn the most points.” After pressing “okay”, text in a 30.2 cm by 18.8 cm gray box indicated, “To access the keyboard, please touch the text-entry area and then touch the keyboard icon that appears. Answer the following prompt. The best way to earn points is to _______.” Once participants typed their responses, they pressed a button that read “Submit” and the block ended. The experimenter then entered the room and informed the participant how many points s/he had earned. At the conclusion of each of the first three blocks, participants were permitted to take a 2-min break outside of the experimental room. After the fourth block, the experimenter told participants the total monetary amount earned from the entire session and that the session was over. Total participation time each day was approximately 70 min (four 15-min blocks plus three 2-min breaks).
Experimental Design
To evaluate the effects of mands on instructional control, a group design was used. Groups were based on the specific mand presented in the instructions, and the effects were measured across several different PT schedules. For one schedule (PT 4), engaging in the instructed response pattern resulted in optimal reinforcement, whereas during other schedules (PT 12, PT 20) optimal reinforcement rates could be obtained only by deviating from the instructed response pattern.
Instructions
Instructions were adapted from those used by Hackenberg and Joker (1994). The content of the instructions varied depending on the condition to which the participant was assigned. Instructions were present on-screen throughout the entire experiment.
Directive Group
For the three participants assigned to the directive group, the instructions read:
Instructions: please read/listen carefully. To begin, press the green square on your screen. To earn points, press gently on a colored shape. Each point you earn is worth 4 cents. For example, if you earn 300 points, you will be paid $12.00. You must select the blue flashing box, then the solid blue box 4 times, then the solid red box. Each session will last for about 15 min, with a 2-min rest period between sessions. You may leave the room during the rest period. At the end of each session, a message will come up on the screen asking you to record your thoughts about the experiment. When four sessions have been completed you may leave. Of course, you may leave at any time during the exercise in the event of an emergency. Thanks for your participation.
Non-directive Group
For the three participants assigned to the non-directive group, the instructions differed by three words contained within one sentence. Specifically, rather than reading, “You must select the blue flashing box […],” the instructions in the non-directive condition read, “You might consider selecting the blue flashing box […].”
Dependent Variable
The concurrent schedules presented in the experimental preparation allowed for maximization of point production across the session through a sequence of responding to the PT and FT schedules. As noted by Hackenberg and Joker (1994), sequences of choices in this arrangement likely constitute functional response units with respect to the number of consecutive PT schedule selections preceding an FT selection (switch point). Therefore, the switch point is an appropriate measure of schedule control by temporally extended behavior-consequence interactions (i.e., molar contingencies) and was the dependent variable of interest in this experiment.
Under the PT 4 schedule, optimal responding matched the pattern specified by the instructions (i.e., choosing the PT five consecutive times before choosing the FT). As the PT schedule deviated from 4 s (i.e., the PT 12 and PT 20 schedules), the instructions no longer accurately described the pattern of responding associated with the maximization of session-wide (molar) reinforcement. That is, to continue to earn the most points possible as the PT schedule progressed, it was necessary to deviate from the instructions by switching from the PT to the FT schedule after fewer responses (e.g., choosing the PT two consecutive times before choosing the FT for the PT 20 schedule).
In addition to the molar contingency, local behavior-consequence interactions (i.e., molecular contingencies) simultaneously operated on every choice trial, in that one of the two responses minimized delay to the next point delivery. The switch point is also sensitive to schedule control by these molecular contingencies. On a local scale, PT selections are supported when PT delays are greater than those supported by the molar contingency but less than the FT delay.
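As a rough illustration of the molecular contingency, the sketch below computes how many consecutive PT selections a purely local rule would support. It assumes a hypothetical chooser that selects the PT schedule whenever its upcoming delay is strictly shorter than the FT delay and that ignores the intertrial interval. The function name and the strict inequality are assumptions made for illustration, not part of the reported procedure.

```python
def molecular_switch_point(pt_step, ft_delay=60):
    """Number of consecutive PT selections whose delay is shorter than the FT delay.

    Assumes a purely local rule: choose PT while the upcoming PT delay
    (pt_step * selections so far) is strictly less than ft_delay.
    """
    selections = 0
    while pt_step * selections < ft_delay:
        selections += 1
    return selections


for step in (4, 12, 20):
    print(f"PT {step}: local rule supports {molecular_switch_point(step)} PT selections")
# PT 4: local rule supports 15 PT selections
# PT 12: local rule supports 5 PT selections
# PT 20: local rule supports 3 PT selections
```

Under these assumptions, the local rule coincides with the instructed switch point of 5 only at PT 12.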
Stability Criterion
Participants experienced each PT schedule until steady-state responding was obtained. Steady-state responding was calculated based on the median switch point during a block. When the median switch point was the same for three consecutive blocks, the participant experienced the next PT step size according to the fixed schedule sequence previously described. For example, if a participant’s median switch point during the PT 4 schedule was 5 for three consecutive blocks, the next block would begin with the PT 12 schedule. Table 1 shows the total number of blocks participants experienced at each PT value. As in Hackenberg and Joker (1994), the initial schedule value was set at PT 4 because it exposed participants to a history of the instructions accurately describing the optimal response pattern. The subsequent (PT 12 and PT 20) schedules were selected based on the potential control by molar and molecular contingencies. At the PT 12 schedule, molar contingencies favored deviating from the instructions in that more points would be accumulated over the session by doing so; however, adhering to the instructions could still be favored by molecular contingencies in that the longest instructed PT delay (48 s) would be shorter than the FT delay (60 s). At the PT 20 schedule, neither molecular nor molar contingencies favored adhering to the instructions.
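The stability criterion can be expressed as a short check. The sketch below is a minimal illustration with hypothetical function names and made-up switch-point data; it returns the number of blocks a participant would complete before the median switch point had been identical for three consecutive blocks.

```python
from statistics import median


def blocks_to_stability(blocks, run_length=3):
    """Return the number of blocks completed when the criterion is first met.

    `blocks` is a list of blocks, each a list of switch points.
    Returns None if the median never repeats for `run_length` consecutive blocks.
    """
    medians = [median(block) for block in blocks]
    for i in range(run_length - 1, len(medians)):
        window = medians[i - run_length + 1 : i + 1]
        if len(set(window)) == 1:  # all medians in the window are equal
            return i + 1
    return None


# Hypothetical data: block medians are 6, 5, 5, 5.
example_blocks = [[5, 6, 7], [5, 5, 4], [5, 5, 5], [5, 5, 5]]
print(blocks_to_stability(example_blocks))  # 4
```

Because three matching blocks are required, the smallest possible value is 3, which is the minimum entry in Table 1.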
Optimal versus Instructed Switch Points
Hackenberg and Joker (1994) calculated the points per minute that would be earned across switch points within a schedule and designated those producing the highest rate of point production as optimal. For example, in the PT 4 schedule, selecting the PT schedule five times and the FT schedule once would produce 6 points in 118 s or a rate of 3.05 points per minute. In the present analyses, the definition of optimal switching was expanded to also include those values that would produce a rate within ±0.0667 points per minute of the highest rate. Rates differing by less than this amount would not result in differential point production during the 15-min blocks that participants experienced. Across all schedules, only one response pattern was considered instructed (instructed switch point = 5). During both the PT 4 and PT 12 schedules, two response patterns resulted in optimal point production (PT 4 optimal switch points = 4, 5; PT 12 optimal switch points = 2, 3), and during the PT 20 schedule, only one response pattern resulted in optimal point production (optimal switch point = 2). Using this criterion, switch points were categorized as being consistent with the instructed switch point, optimal switch point, or neither.
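The rate calculation can be sketched as follows. The sketch assumes that each cycle consists of the PT delays, one FT 60 delay, and a 3-s intertrial interval after every point delivery; this assumption reproduces the 118-s cycle in the example above. The function names are hypothetical.

```python
def points_per_minute(pt_step, switch_point, ft_delay=60, iti=3):
    """Rate of point production for a repeated cycle of PT selections plus one FT.

    Assumes a 3-s intertrial interval follows each of the switch_point + 1 points.
    """
    pt_time = sum(pt_step * k for k in range(switch_point))  # 0, step, 2*step, ...
    points = switch_point + 1
    cycle_time = pt_time + ft_delay + iti * points
    return 60 * points / cycle_time


def optimal_switch_points(pt_step, tolerance=0.0667, max_switch=10):
    """Switch points whose rate is within `tolerance` of the highest rate."""
    rates = {n: points_per_minute(pt_step, n) for n in range(max_switch + 1)}
    best = max(rates.values())
    return [n for n, rate in rates.items() if best - rate <= tolerance]


print(round(points_per_minute(4, 5), 2))  # 3.05
for step in (4, 12, 20):
    print(f"PT {step}: optimal switch points = {optimal_switch_points(step)}")
# PT 4: optimal switch points = [4, 5]
# PT 12: optimal switch points = [2, 3]
# PT 20: optimal switch points = [2]
```

Under these assumptions, the sketch yields the same optimal switch points as those reported for each schedule.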
Results
Figure 1 depicts steady-state switch point data across each PT schedule for all participants. Graphs in the left panels are results for the directive group and graphs in the right panels are those for the non-directive group. Identical patterns of responding were observed across each PT schedule for all three participants who were given the directive instruction (D1, D3, and D5). Participants’ median switch points were in accordance with the instructed pattern of responding during the PT 4 schedule, when instruction following produced the optimal number of points. This pattern of responding continued across the ascending (PT 12, PT 20) and descending phases (PT 12, PT 4). That is, the median switch point never deviated from the instructed response pattern for any of the participants in the directive group during the last three blocks of each schedule, even though following the instruction earned fewer points during the two PT 12 schedules and the PT 20 schedule. As such, results for this group are highly indicative of rule governance.
Fig. 1. Steady-state median switch points from PT to FT schedule selection as a function of PT step size for participants presented with directive (left panels) and non-directive (right panels) mand instructions. PT schedules are on the x-axis and the number of PT choices before FT choices (switch point values) are on the y-axis. Dashed lines indicate the instruction-appropriate or optimal schedule-appropriate switch points, as indicated. Shaded areas indicate the range within which switch points resulted in optimal point production. Open squares and closed triangles denote median switch points during the ascending and descending sequences, respectively.
Median switch points for participants in the non-directive group showed different patterns than the directive group. All three participants (N4, N6, and N7) displayed optimal patterns of steady-state responding during the PT 4 schedule of the ascending sequence; however, only participant N4 engaged in the optimal response pattern that was described by the instructions. During the PT 12 schedule of the ascending sequence, median switch points for N4, N6, and N7 differed from that of the instructed pattern, but only N7 switched according to the optimal pattern. On the remaining schedules in which optimal and instructed switch points diverged, all three participants responded differently from the instructed pattern, although not always in accordance with the optimal pattern. On the final PT 4 schedule, when the instructed pattern again was optimal, only N6 had a median switch point that deviated from the instructions; however, the median switch point was still at an optimal pattern of responding. Overall, results for this group are indicative of schedule control.
In a more detailed analysis, further regularities were observed within each group. Figure 2 depicts the individual switch points throughout the entire experiment for each participant. Overall response patterns for individual switch points were markedly different for the non-directive group as compared to the directive group, most notably during the first block of the experiment (PT 4 schedule). Participants in the directive group did not initially deviate from instruction-appropriate response patterns, whereas initial switch point values were highly variable in the non-directive group during the first block, ranging from 5 to 7, 0 to 6, and 0 to 9 for participants N4, N6, and N7, respectively. Although switch points of participants in the directive group occasionally departed from the instructed pattern, switch points of participants in the non-directive group varied earlier and to a much greater extent, particularly for N4 and N6. Participant N7 showed initial variability in switching, but subsequently engaged in highly consistent, schedule-appropriate response patterns within blocks.
Fig. 2. Individual switch points from PT to FT schedule selection for participants presented with directive (left panels) and non-directive (right panels) mand instructions. Individual switch points are on the x-axis. Switch point values are on the y-axis. Closed circles indicate the first switch point at the start of a session block. All other switch points are denoted by a solid line. Dashed horizontal lines depict instruction-appropriate and optimal schedule-appropriate switch points, as indicated. Shaded areas indicate the range within which switch points resulted in optimal point production.
Despite a general tendency to engage in instruction-appropriate responding, individual switch points of all participants in the directive group deviated from the instructed pattern for multiple successive switches at least once over the course of the experiment. For D1, the first instance in which three consecutive switch points deviated from the instructed pattern occurred in the descending PT 4 schedule, at which point doing so resulted in suboptimal point production. D3 engaged in sustained deviation from the instructed pattern during the descending PT 12 schedule; however, despite contacting optimal point production through schedule-appropriate responding, this pattern was not maintained, and D3 again responded in accordance with the instructions in this and subsequent conditions. Inspection of the data from blocks when D3 responded optimally indicated that the participant continued to press the response keys during the delay periods in a pattern that was similar to the instructed pattern. That is, D3 pressed the flashing blue box (producing a point immediately), then pressed the solid blue box initiating the delay, and during the delay pressed that box three additional times, even though these responses did not produce points or any other programmed change (they did, however, produce the visual feedback that occurred whenever the monitor registered a touch). Engaging in this response pattern resulted in a switch point of 2 because only the first and second presses occurred at times when the manipulanda were active. D5 deviated from instruction-appropriate responding during the ascending PT 12 and PT 20 schedules, but resumed the instructed response pattern during each of these schedules. During neither schedule did the participant engage in optimal responding for multiple consecutive switches. Interestingly, for all three participants, nearly every switch point occurring at the start of a session block matched the instructed pattern (the exceptions being one each for D1 and D3).
The percentages of points earned out of the maximum points possible for each participant are displayed in Fig. 3. Possible points were calculated for each schedule by using the optimal switch point pattern(s) and determining the number of points this would produce assuming a 0-s latency to respond throughout the entire session. Points earned within each block were then divided by this value and then averaged across all blocks completed at that schedule (ascending and descending schedules were averaged separately). In general, participants in the directive group earned the most points when instruction-appropriate and schedule-appropriate responding coincided (during the PT 4 schedules). During both PT 12 and PT 20 schedules, participants in the directive group earned fewer points than those in the non-directive group. Participants in the non-directive group showed somewhat lower percentages of points earned in the ascending PT 4 schedule, but generally earned approximately the same or higher percentages across subsequent schedules.
Fig. 3. Average percentage earned of possible points during each schedule. PT step sizes are on the x-axis and percentage earned is on the y-axis. Participants in the directive (D) and non-directive (N) groups are denoted by the closed and open data points, respectively.
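The percentage-of-possible-points calculation described for Fig. 3 can be illustrated with a minimal sketch. It assumes that the maximum points per block for each schedule (obtained from the optimal switch point pattern at a 0-s response latency) have already been determined. The function name, the data layout, and the numbers in the demonstration are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from statistics import mean


def percent_of_possible(blocks, possible_points):
    """Average percentage of possible points per schedule and sequence.

    blocks: iterable of (schedule, sequence, points_earned) tuples, one per block.
    possible_points: dict mapping schedule -> maximum points obtainable in a
        block (assumed to be precomputed from the optimal switch point pattern).
    """
    ratios = defaultdict(list)
    for schedule, sequence, earned in blocks:
        # Divide each block's earnings by the maximum for that schedule.
        ratios[(schedule, sequence)].append(earned / possible_points[schedule])
    # Average across blocks, keeping ascending and descending sequences separate.
    return {key: 100 * mean(values) for key, values in ratios.items()}


if __name__ == "__main__":
    # Hypothetical block data for illustration only.
    demo_blocks = [
        ("PT 4", "ascending", 18),
        ("PT 4", "ascending", 20),
        ("PT 4", "descending", 19),
        ("PT 12", "ascending", 9),
    ]
    demo_possible = {"PT 4": 20, "PT 12": 12}
    for key, pct in percent_of_possible(demo_blocks, demo_possible).items():
        print(key, round(pct, 1))
```

Keying the result on both schedule and sequence mirrors the separate averaging of ascending and descending schedules described above.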
At the end of each block, participants completed a survey asking them to describe how to earn the most points. Participants in the directive group frequently responded by indicating that one should follow the instruction, either in general terms (e.g., “Do what it says”) or by specifying the response pattern (e.g., “By selecting the flashing blue box, then the solid blue box four times, then the red box”). Additionally, some participants indicated that they did not know how to earn the most points (e.g., “idk [sic],” a colloquial abbreviation of “I don’t know”). Participants in the non-directive group often responded by indicating that they were evaluating the relation between their response patterns and points earned as compared to preceding blocks (e.g., “I tried something different and got fewer points”) or by simply specifying the response pattern (e.g., “I pressed the flashing blue box once, then the solid blue box, then the red box”). Anecdotally, changes in survey answers often corresponded with changes in patterns of switching during blocks.
Discussion
The purpose of this laboratory simulation was to examine the effects of two different mands within instructions (“must” vs. “might consider”) on the degree to which instructional control persisted under varying reinforcement schedules. Overall, the present findings suggest that the two mands differentially modulated the extent to which the reinforcement schedule influenced behavior. Although the molar contingencies at PT 12 and both molar and molecular contingencies at PT 20 favored deviation from the instruction and participant responding occasionally contacted contingencies for schedule-appropriate behavior, patterns of responding for participants in the directive group persisted in accordance with the instructions across these schedule values. That is, participants were sensitive to the directive mand, thereby showing evidence of rule-governed behavior (possibly in the form of pliance). These results are counter to those of Galizio (1979) who showed that instructional control diminished once participant behavior contacted programmed contingencies; however, they parallel those of Shimoff et al. (1981), in that rule governance persisted even after contacting the altered contingencies. Moreover, the variability observed in the present study during the ascending sequence of the directive group is less than that observed in Hackenberg and Joker (1994). These findings suggest that the directive mand “must” functioned as a command, producing greater responding in accordance with the instruction (i.e., pliance).
Although response patterns for participants in the non-directive group clearly differed from those of the directive group and more closely approximated responding reported by Hackenberg and Joker, the specific effects of the non-directive mand are ambiguous. In general, the participants in the non-directive group displayed a distinct tendency not to engage in the instructed pattern across all but the final PT 4 schedule; however, the extent to which they engaged in schedule-appropriate behavior was idiosyncratic. All three participants engaged in highly variable behavior early in the study. Response patterns for N7 quickly became very consistent and indicative of schedule control. A similar pattern occurred later in the experiment for N4, but N6 rarely demonstrated consistent responding or a high degree of schedule-controlled behavior. As such, the non-directive instruction clearly did not function as advice in that it did not evoke the manded behavior. Whether the non-directive mand functioned to increase sensitivity to programmed contingencies via tracking or decrease sensitivity to the instruction itself remains unclear. That is, the mand “might consider” could have indicated (a) the possibility of reinforcement for behavior other than that instructed, (b) a reduced probability of reinforcement for engaging in the instructed behavior, (c) the absence of an implicit social punishment contingency by the experimenter for deviating from the instruction, or (d) some combination of these. It seems likely that the individual histories of the participants would determine the extent to which any one of these options contributed to the observed behavior.
Recently published research examining the effects of punishment on instructional control may provide insight as to the functional properties of each mand. Using an experimental preparation similar to that of Hackenberg and Joker (1994) and the present study, Fox and Pietras (2013) evaluated the effects of a response cost contingency (penalty) for deviating from instructions on participant responding. They showed that participants who experienced both the penalty and no-penalty conditions demonstrated more consistent rule following when the penalty was in effect, suggesting that the contingencies in the penalty condition punished responding that was inconsistent with the rule. In the case of the present study, increased rule following by participants in the directive group is likely a product of each individual’s history of punishment for deviating from instructions containing the mand “must.” Moreover, a history lacking punishment for deviating from instructions containing the mand “might consider” may have decreased rule following in the non-directive group. As a result, it is likely that the individual’s punishment history determined whether the mand signaled that the consequences were socially mediated or not, particularly as the instructions used in the experiment specified the behavior while leaving the consequences implicit.
Incorporating verbal operants in the interpretation of the present results adds depth to the evaluation of rule governance. A large portion of research on instructional-versus-schedule control has focused on manipulating reinforcement contingencies, but applying the conceptual framework of Skinner’s analysis of verbal behavior allows a systematic approach to addressing the influence of various instructions. The instruction used by Hackenberg and Joker (1994) may have had the same effect on the participant as a tact given the phrasing of the instruction (“The best way to earn the most points is to select…”), whereas participants in the present study responded to the instructions as mands (“You must select…” vs. “You might consider selecting…”). To the extent that the two studies are comparable and their results generalizable, commands appear to produce greater instructional control than tacts, and advice produces greater schedule control than tacts. Future experiments should directly compare instructions in the forms of mands and tacts to more directly address this relation.
Bicard and Neef (2002) brought attention to the classes of behavior engendered by rules in terms of tactical versus strategic instruction. Both forms of instruction used in the present study would be considered tactical in that a specific pattern of behavior was identified. As noted in that study, tactical instructions tend to make behavior more rigid and less sensitive to changes in contingencies. The present results suggest that tactical instructions that also include a directive mand may produce behavior that is highly insensitive to such changes. They also suggest that a non-directive mand can sufficiently weaken the insensitivity to reinforcement engendered by tactical instructions. Whether such effects would be observed with strategic instructions remains to be seen.
These results may have implications for settings outside of the laboratory, although the generality of this study is likely limited. For example, business institutions are fraught with issues of instruction adherence, from both employees and administrators, despite management’s best attempts to control behavior and establish effective policies (Daniels 1994). Managers, trainers, and bosses may therefore be considered the architects of the instructions imposed on employees or subordinates. Especially in settings governed by regulations, laws, or policy, behaviors that run contrary to provided instructions may result in less effort, or access to competing, immediate, or more certain reinforcers. Identifying ways to promote supervisory instructional control (i.e., pliance) in lieu of schedule control (i.e., tracking), particularly in the face of inaccurate instructions generated by employees or other colleagues, may benefit individuals in the workplace (e.g., fewer safety-related accidents and injuries). These findings may also have relevance for other applied arenas, such as providing instructions when coaching athletes, teaching study skills to students, or promoting safe driving, among many others; these are areas ripe for investigation.
A number of limitations warrant discussion and suggest areas for future research. Given the analogue nature of the experimental preparation, applied replications and extensions are necessary to understand how these processes impact behavior of humans in practical settings and to guide the formulation of robust rule governance procedures. In addition, this study included fewer PT schedule values than those used by Hackenberg and Joker (1994), which may have influenced participant behavior in unanticipated ways (i.e., it may have been more salient to participants when the schedule values changed). However, the patterns of responding for participants in the non-directive group were similar to those of participants in Hackenberg and Joker despite the fewer PT schedule values, which suggests that any such impact may be small. What remains unclear is whether the non-directive mand interacted with the development of schedule control by molar or molecular contingencies for participants N4 and N6. The inclusion of additional PT values in which molar, but not molecular, contingencies were favored (step sizes between 15 and 60 s) might have illuminated this. Furthermore, the total number of blocks participants experienced differed substantially across the two studies, with the present participants experiencing an average of 23 blocks as compared to 212.5 blocks in the experiment by Hackenberg and Joker. The comparatively limited experience of participants in the present study may partially explain why schedule control by the molar contingencies was observed less consistently with the participants in the non-directive group. For the PT 4 schedule, the average points earned per minute when engaging in instruction-appropriate behavior (i.e., switch point of 5) is 3.05, which is only 0.02 greater than the rate obtained with a switch point of 4 and translates to a difference in earnings of approximately $0.01 per session. This example serves as the most extreme case, in which extended experience and a high degree of sensitivity to the molar contingency would be required to identify differences in earnings, as this is the smallest difference in earnings between an optimal and a non-optimal switch point across any of the schedules. It seems likely that with increased exposure to molar contingencies, behavior will be influenced to a greater degree when it contacts increased reinforcement under such conditions.
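To illustrate why adjacent switch points can yield nearly identical molar rates of earnings, the following sketch computes points per minute for a single PT-to-FT cycle under an idealized model. The model is an assumption rather than the study's exact procedure: each successive PT choice is taken to add one step to the delay (starting at 0 s), the FT choice is taken to deliver one point and reset the PT schedule, and response latencies are ignored. The function names and the 60-s FT value used in the demonstration are hypothetical, so the output is not expected to reproduce the 3.05 and 3.03 points per minute reported above.

```python
def cycle_points_per_minute(switch_point, pt_step_s, ft_s):
    """Molar rate of points (per minute) for one PT-then-FT cycle.

    switch_point: number of consecutive PT choices made before choosing FT.
    pt_step_s: PT step size in seconds (assumed delays: 0, step, 2*step, ...).
    ft_s: FT delay in seconds (hypothetical value; must be positive).
    """
    if switch_point < 0 or ft_s <= 0:
        raise ValueError("switch_point must be >= 0 and ft_s must be > 0")
    # Sum of PT delays 0 + step + ... + (switch_point - 1) * step.
    pt_time = pt_step_s * switch_point * (switch_point - 1) / 2
    # Assumed: one point per PT choice plus one point for the FT choice.
    points = switch_point + 1
    return 60 * points / (pt_time + ft_s)


def best_switch_point(pt_step_s, ft_s, max_switch):
    """Switch point (0..max_switch) with the highest molar rate in this model."""
    return max(
        range(max_switch + 1),
        key=lambda n: cycle_points_per_minute(n, pt_step_s, ft_s),
    )


if __name__ == "__main__":
    # Hypothetical FT of 60 s, used for illustration only.
    for n in range(3, 8):
        print(n, round(cycle_points_per_minute(n, pt_step_s=4, ft_s=60), 2))
```

Under these assumed parameters, the rates for neighboring switch points at a 4-s step differ only slightly, which is the kind of small molar difference that would require extended exposure to detect.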
Footnotes
The behavioral perspective of rules implies a functional definition that involves discriminated responding in the presence of a verbal antecedent. This is in contrast to more colloquial definitions that need not invoke compliance with the verbal antecedent stimulus. The use of the word “rule” throughout the remainder of this article refers to the behavioral definition that is functionally determined by its discriminative effects on behavior (i.e., verbal governance; Catania 2006).
References
- Baron A, Perone M, Galizio M. Analyzing the reinforcement process at the human level: can application and behavioristic interpretation replace laboratory research? The Behavior Analyst. 1991;14:95–105. doi: 10.1007/BF03392557.
- Bicard DF, Neef NA. Effects of strategic versus tactical instructions on adaptation to changing contingencies in children with ADHD. Journal of Applied Behavior Analysis. 2002;35:375–389. doi: 10.1901/jaba.2002.35-375.
- Branch MN. On the difficulty of studying “basic” behavioral processes in humans. The Behavior Analyst. 1991;14:107–110. doi: 10.1007/BF03392558.
- Catania AC. Antecedents and consequences of words. The Analysis of Verbal Behavior. 2006;22:89–100. doi: 10.1007/BF03393030.
- Catania AC, Matthews BA, Shimoff E. Instructed versus shaped human verbal behavior: interactions with nonverbal responding. Journal of the Experimental Analysis of Behavior. 1982;38:233–248. doi: 10.1901/jeab.1982.38-233.
- Daniels AC. Bringing out the best in people: how to apply the astonishing power of positive reinforcement. New York: McGraw Hill; 1994.
- Doll BB, Jacobs WJ, Sanfey AG, Frank MJ. Instructional control of reinforcement learning: a behavioral and neurocomputational investigation. Brain Research. 2009;1299:74–94. doi: 10.1016/j.brainres.2009.07.007.
- Drake CE, Wilson KG. Instructional effects on performance in a matching-to-sample study. Journal of the Experimental Analysis of Behavior. 2008;89:333–340. doi: 10.1901/jeab.2008-89-333.
- Fox AE, Pietras CJ. The effects of response-cost punishment on instructional control during a choice task. Journal of the Experimental Analysis of Behavior. 2013;99:346–361. doi: 10.1002/jeab.20.
- Galizio M. Contingency-shaped and rule-governed behavior: instructional control of human loss avoidance. Journal of the Experimental Analysis of Behavior. 1979;31:53–70. doi: 10.1901/jeab.1979.31-53.
- Glenn SS. Rules as environmental events. The Analysis of Verbal Behavior. 1987;5:29–32. doi: 10.1007/BF03392817.
- Hackenberg TD, Joker VR. Instructional versus schedule control of humans’ choices in situations of diminishing returns. Journal of the Experimental Analysis of Behavior. 1994;62:367–383. doi: 10.1901/jeab.1994.62-367.
- Hayes S, editor. Rule-governed behavior: cognition, contingencies, and instructional control. New York: Plenum Press; 1989.
- Hayes SC. Rule governance: basic behavioral research and applied implications. Current Directions in Psychological Science. 1993;2(6):193–197. doi: 10.1111/1467-8721.ep10769746.
- Hayes SC, Brownstein AJ, Haas JR, Greenway DE. Instructions, multiple schedules, and extinction: distinguishing rule-governed from schedule-controlled behavior. Journal of the Experimental Analysis of Behavior. 1986;46:137–147. doi: 10.1901/jeab.1986.46-137.
- Hayes SC, Brownstein AJ, Zettle RD, Rosenfarb I, Korn Z. Rule-governed behavior and sensitivity to changing consequences of responding. Journal of the Experimental Analysis of Behavior. 1986;45:237–256. doi: 10.1901/jeab.1986.45-237.
- Milgram S. Behavioral study of obedience. Journal of Abnormal and Social Psychology. 1963;67:371–378. doi: 10.1037/h0040525.
- Raia CP, Shillingford SW, Miller HL, Jr, Baier PS. Interaction of procedural factors in human performance on yoked schedules. Journal of the Experimental Analysis of Behavior. 2000;74:265–281. doi: 10.1901/jeab.2000.74-265.
- Schlinger H, Blakely E. Function-altering effects of contingency-specifying stimuli. The Behavior Analyst. 1987;10:41–45. doi: 10.1007/BF03392405.
- Schmitt DR. Effects of consequences of advice on patterns of rule control and rule choice. Journal of the Experimental Analysis of Behavior. 1998;70:1–21. doi: 10.1901/jeab.1998.70-1.
- Shimoff E, Catania AC, Matthews BA. Uninstructed human responding: sensitivity of low-rate performance to schedule contingencies. Journal of the Experimental Analysis of Behavior. 1981;36:207–220. doi: 10.1901/jeab.1981.36-207.
- Skinner BF. Verbal behavior. Englewood Cliffs: Prentice-Hall; 1957.
- Skinner BF. Contingencies of reinforcement: a theoretical analysis. Englewood Cliffs: Prentice Hall; 1969.
- Vaughan ME. Repeated acquisition in the analysis of rule-governed behavior. Journal of the Experimental Analysis of Behavior. 1985;44:175–184. doi: 10.1901/jeab.1985.44-175.
- Zettle RD, Hayes SC. Rule-governed behavior: a potential theoretical framework for cognitive-behavioral therapy. In: Kendall PC, editor. Advances in cognitive-behavioral research and therapy. New York: Academic; 1982. pp. 73–118.
