The Interplay between Uncertainty Monitoring and Working Memory: Can Metacognition Become Automatic?

Mariana V C Coutinho; Joshua S Redford; Barbara A Church; Alexandria C Zakrzewski; Justin J Couchman; J David Smith

doi:10.3758/s13421-015-0527-1

Author manuscript; available in PMC: 2016 Jan 21.

Published in final edited form as: Mem Cognit. 2015 Oct;43(7):990–1006. doi: 10.3758/s13421-015-0527-1


PMCID: PMC4721958 NIHMSID: NIHMS750370 PMID: 25971878

Abstract

The uncertainty response has grounded the study of metacognition in nonhuman animals. Recent research has explored the processes supporting uncertainty monitoring in monkeys. It revealed that uncertainty responding, in contrast to perceptual responding, depends on significant working memory resources. The aim of the present study was to expand this research by examining whether uncertainty monitoring is also working memory demanding in humans. To explore this issue, human participants were tested with or without a cognitive load on a psychophysical discrimination task including either an uncertainty response (allowing the decline of difficult trials) or a middle-perceptual response (labeling the same intermediate trial levels). The results demonstrated that cognitive load reduced uncertainty responding, but increased middle responding. However, this dissociation between uncertainty and middle responding was only observed when participants either lacked training or had very little training with the uncertainty response. If more training was provided, the effect of load was small. These results suggest that uncertainty responding is resource demanding, but that with sufficient training, human participants can respond to uncertainty either by using minimal working memory resources or by effectively sharing resources. These results are discussed in relation to the literature on animal and human metacognition.

Keywords: metacognition, uncertainty monitoring, cognitive load, working memory, comparative psychology, controlled processing

Humans have feelings of knowing and not knowing, of confidence and doubt. Their ability to accurately identify these feelings and to respond to them adaptively is the focus of the research literature on metacognition (e.g., Benjamin, Bjork, & Schwartz, 1998; Flavell, 1979; Koriat & Goldsmith, 1994; Metcalfe & Shimamura, 1994; Nelson, 1992; Scheck & Nelson, 2005; Schwartz, 1994). Metacognition refers to the ability to monitor and control one’s own perceptual and cognitive processes (Nelson & Narens, 1990; 1994). This ability plays an important role in learning and memory.

The monitoring component of metacognition has been widely investigated in humans (e.g., Begg, Martin, & Needham, 1992; Dunlosky & Nelson, 1992; Hart, 1967; Koriat, 1993; Koriat & Goldsmith, 1996; Lovelace, 1984; Metcalfe, 1986) and nonhuman animals (e.g., Beran, Smith, Coutinho, Couchman, & Boomer, 2009; Beran, Smith, Redford, & Washburn, 2006; Call & Carpenter, 2001; Fujita, 2009; Hampton, 2001; Kornell, 2009; Smith, Beran, Redford, & Washburn, 2006; Smith et al., 1995; Smith, Shields, Allendoerfer, & Washburn, 1998; Smith, Shields, Schull, & Washburn, 1997). In humans, metacognitive monitoring is normally assessed by asking participants to make judgments of learning (JOLs), feeling-of-knowing judgments (FOKs), or confidence ratings (for a review, see Koriat, 2007). In animals, the most common method of assessment is the uncertainty-monitoring paradigm because it does not rely on verbal reports or verbal knowledge. This method involves presenting subjects with stimulus trials varying in objective difficulty and providing them with a response (the uncertainty response) that allows them to decline any trial they choose. The idea behind this test is that subjects who have access to their mental states of uncertainty (knowing when they do not know) will complete trials for which they know the answer (easy trials) and skip the ones for which they do not know the answer (difficult trials). Subjects who do not have access to such states will not show this pattern. Thus, it is expected that the frequency of uncertainty responses for subjects who are capable of monitoring their mental states will be higher for the objectively difficult items.

In the uncertainty monitoring paradigm, it is adaptive for subjects to decline trials that they are unsure of because errors can result in timeouts, unpleasant sounds, and (in humans) a point loss. When subjects skip error-prone trials, they not only avoid these negative consequences, but they also increase their chance to earn points (in the case of humans) or pellets (in the case of animals) because they don’t waste time on timeouts. Therefore, using the uncertainty response for trials they cannot discriminate produces significant point gains compared to guessing.

Since the uncertainty-monitoring paradigm was proposed, a number of studies have been conducted investigating whether animals have the ability to monitor their mental states (e.g., Beran et al., 2006; Couchman, Coutinho, Beran, & Smith, 2010; Shields, Smith, & Washburn, 1997; Smith et al., 2006; Smith, Redford, Beran, & Washburn, 2010; Smith et al., 1995; 1997; Smith, Shields, & Washburn, 2003; Washburn, Gulledge, Beran, & Smith, 2010; Washburn, Smith, & Shields, 2006). These studies demonstrated that monkeys (Macaca mulatta), similar to humans, used the uncertainty response adaptively; that is, they used it to decline only the trials that were difficult and prone to error. But despite the similarity in uncertainty responding across species, the appropriate interpretation of these findings is still sharply debated (e.g., Couchman et al., 2010; Crystal & Foote, 2009; Hampton, 2009; Jozefowiez, Staddon, & Cerutti, 2009; Smith, Beran, & Couchman, 2012; Smith, Beran, Couchman, & Coutinho, 2008). Some researchers argue that uncertainty responding in animals reflects their ability to monitor their mental states, whereas others believe it is based on perceptual, associative processes.

To clarify this issue, Smith, Coutinho, Church and Beran (2013) conducted a study to assess the role of executive resources in uncertainty and perceptual responding in rhesus monkeys. They hypothesized that if the uncertainty response is a high-level decisional response, cognitive load should have a differential effect on uncertainty and perceptual responding. It should disrupt uncertainty responding but not perceptual responding, or at least not to the same degree. The results of this study confirmed their hypothesis. These results provide strong evidence that the uncertainty response is qualitatively different from perceptual responses, and that monkeys may be capable of monitoring their mental states.

In line with the findings from Smith et al. (2013), a study conducted with humans found that some metacognitive judgments, such as tip-of-the-tongue states (TOTs), depend on working memory resources (Schwartz, 2008). Interestingly, a similar pattern of results was not observed for FOKs. This dissociation suggests that different types of monitoring judgments may tap different processes that are more or less dependent on working memory resources. Neuroimaging studies have also provided support for this claim (e.g., Maril, Simons, Mitchell, Schwartz, & Schacter, 2003; Maril, Wagner, & Schacter, 2001). For instance, researchers reported differential patterns of neural activity during TOT and FOK judgments. In particular, TOT judgments were associated with an increase in neural activity in regions that had been previously reported to be involved in working memory activities, such as the anterior cingulate, right dorsolateral, and right inferior prefrontal cortex regions (see Ruchkin, Grafman, Cameron, & Berndt, 2003). On the other hand, FOK judgments were mostly associated with differences in neural activity within the left prefrontal and parietal regions.

One possible reason why TOTs depend on working memory resources and FOKs do not is that TOTs, unlike FOKs, may be mediated by processes such as conflict detection and conflict resolution, which are both controlled (for more information about controlled processes, see Shiffrin & Schneider, 1977). These two processes may be essential for TOTs because TOTs involve a conflict between what one feels certain one knows and the incapacity to recall that information despite having a feeling of imminent recall. Additionally, given that TOTs are commonly preceded by the retrieval of a variety of information that is related to the to-be-recalled item, in order for individuals to have TOTs, they first need to decide whether the information retrieved is leading to the recall of the target or interfering with it. Thus, they need to resolve the conflict about the value of the information being retrieved. On the other hand, FOKs may be mediated primarily by interpreting processing fluency, and with experience this may become automatic. Individuals may base their FOKs on how familiar or how fluent the information to be remembered is, and this may be a process with which humans have extensive experience.

Evidence that metacognitive monitoring is resource-consuming has also been demonstrated across individuals of different ages during recall. Stine-Morrow, Shake, Miles, and Noh (2006) tested younger and older adults on a memory task that either required them to make a metacognitive judgment before they were asked to recall an item, or not. They found that when older adults made these judgments, performance level decreased whereas no change in performance was observed for the younger group. This suggests that the act of monitoring one’s recall processes consumes resources that would otherwise be employed in the memory task.

Considering that different types of metacognition in humans may be mediated by different processes and uncertainty monitoring in monkeys clearly depends on working memory resources, it is important to ask whether the processes supporting uncertainty monitoring in humans are similar to those in animals. That is, does working memory also play a role in uncertainty monitoring in humans? If it does, this would suggest a possible continuity in the processes mediating uncertainty monitoring in humans and monkeys, which could potentially shed light on the evolutionary development of the metacognitive capacity.

To explore whether the processes supporting uncertainty monitoring in humans are working memory intensive (as they are in monkeys), we conducted three experiments assessing the effect of concurrent load on uncertainty and perceptual-middle responding at different levels of practice with these responses.

Experiment 1

Experiment 1 evaluated the effect of a concurrent load on uncertainty and middle responding during perceptual discrimination learning. It was hypothesized that if uncertainty responding draws resources from working memory (as it does for monkeys), then concurrent load should reduce uncertainty responding to a greater degree than middle responding.

In this experiment, participants performed a Sparse-Uncertain-Dense (SUD) or a Sparse-Middle-Dense (SMD) discrimination task with or without concurrent load. For the SUD task, participants were asked to judge pixel boxes varying in difficulty as Sparse or Dense, and were also provided with an option of declining to make a response by selecting the uncertainty response. They were told that this response should be used when they were not sure which category the stimulus belonged to, and that it would help them gain points by avoiding timeouts. Uncertainty responses were not followed by a reward or a penalty. Participants simply moved to the next trial. Pixel boxes were designated as Sparse or Dense based on their level of pixel density. Sparse stimuli had between 1,085 and 1,550 pixels whereas dense stimuli had between 1,578 and 2,255 pixels. For the SMD task, participants were asked to discriminate the same pixel boxes into three categories (Sparse, Middle, and Dense) by selecting their corresponding responses (Sparse, Middle, or Dense). In this task, all three responses behaved exactly the same way: correct responses resulted in a reward and incorrect responses yielded a penalty. Sparse, Middle, and Dense stimuli had between 1,085 and 1,470, 1,496 and 1,636, and 1,665 and 2,255 pixels, respectively. Participants performed the SUD or SMD task either alone or with a concurrent load. In the concurrent load condition, participants were presented with a pair of digits prior to each discrimination trial and were required to hold the size and value of the two digits in mind while making a discrimination response. This manipulation gave rise to four different conditions: uncertain non-concurrent (UN), uncertain concurrent (UC), middle non-concurrent (MN), and middle concurrent (MC).

Methods

Participants

One hundred and twelve undergraduates from the University at Buffalo participated in a 52-minute session to fulfill a course requirement. They were assigned randomly to the uncertainty or middle tasks and to the no concurrent load or concurrent load conditions. Participants who completed fewer than 225 test trials in the task, or who were not able to perform above 60% correct with the five easiest trial levels at both the sparse and dense ends of the stimulus continuum were not included for further analysis. In the end, 2 participants from the UC and 10 from the MC condition were excluded based on these criteria. The data from 24, 26, 24, and 26 participants, respectively, were included for analysis in the UN, UC, MN, and MC conditions.

Design

A 2 × 2 × 42 mixed factorial design was used with task (SUD and SMD) and condition (Concurrent Load and No Concurrent Load) serving as between-participant variables and stimulus level (1 to 42) serving as a within-participant variable. The dependent variable was the proportion of intermediate responding (uncertainty and middle).

Stimulus continuum

The discriminative stimuli were unframed 200 × 100 pixel boxes presented in the top center of the computer screen. The area of the box was filled with a variable number of randomly placed lit pixels. The pixel density of the boxes varied along a continuum running from 1,085 pixels (Level 1) to 2,255 pixels (Level 42). Given the maximum possible number of lit pixels (20,000), these pixel counts correspond to 5.4% density for the sparsest stimulus and 11.3% density for the densest stimulus. Each successive level had 1.8% more pixels than the last. Each trial level’s pixel count was given by the formula $\text{Pixels}_{\text{Level}} = \operatorname{round}(1066 \times 1.018^{\text{Level}})$. The sparsest and densest trials of the stimulus continuum are shown in Figure 1.
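The pixel-count formula can be checked with a short sketch. The function names `pixel_count` and `density` are illustrative and not taken from the original software; only the constants 1066, 1.018, and the 200 × 100 box come from the text.

```python
TOTAL_PIXELS = 200 * 100  # box dimensions stated in the text


def pixel_count(level: int) -> int:
    """Number of lit pixels for a trial level (1-42), per the stated formula."""
    if not 1 <= level <= 42:
        raise ValueError("level must be between 1 and 42")
    return round(1066 * 1.018 ** level)


def density(level: int) -> float:
    """Proportion of the box area that is lit at a given level."""
    return pixel_count(level) / TOTAL_PIXELS


for level in (1, 21, 22, 42):
    print(level, pixel_count(level), f"{density(level):.1%}")
```

Under these assumptions, the endpoints and the Level 21–22 breakpoint reproduce the pixel counts reported for the Sparse and Dense categories.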

Figure 1. Examples of pixel-box stimuli used in the present Sparse-Middle-Dense and Sparse-Uncertain-Dense discriminations. Shown are the easiest Sparse trial level (Level 1) and the easiest Dense trial level (Level 42).

Sparse-Uncertain-Dense (SUD) task

The participant’s task was to identify boxes that had pixel densities falling within the sparser or denser portion of the stimulus continuum. Twenty-one trial levels—Level 1 (1,085 pixels) to Level 21 (1,550 pixels)—were designated Sparse and were rewarded in the context of sparse responses. Twenty-one trial levels—Level 22 (1,578 pixels) to Level 42 (2,255 pixels)—were designated Dense and were rewarded in the context of dense responses. Of course the trials near Level 1 and Level 42 are easy sparse and dense trials, respectively. The trials near the breakpoint of the discrimination at Level 21–22 are the most difficult.

Along with the stimulus box on each trial, participants saw a large S to the bottom left of the pixel box and a large D to the bottom right of the pixel box. The uncertainty icon was a ? placed below and between the S and D icons. These different responses were selected by pressing labeled keyboard keys arranged to duplicate the spatial layout of the response icons on the screen. For correct and incorrect responses, respectively, participants heard a computer generated 0.5 s reward whoop or an 8 s penalty buzz, they gained or lost one point, and they saw a green or red text banner announcing “Right Box” or “Wrong Box.” The next trial followed this feedback. The uncertainty response did not bring either positive or negative feedback. It simply canceled the present trial and advanced the participant to the next randomly chosen trial. Participants generally adaptively use this response for the difficult trial levels surrounding the discrimination breakpoint (e.g., Smith et al., 2006). Participants were explicitly instructed that they should use the ? key when they were not sure how to respond, that it would let them decline any trials they chose, and that it would let them avoid the 8 s error buzz and the point penalty.
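To make the incentive structure concrete, the sketch below compares answering a trial with declining it. It uses only the feedback values stated above (one point gained or lost, a 0.5 s reward sound, an 8 s penalty buzz). The accuracy values passed in are hypothetical, and response time itself is ignored, which is a simplifying assumption.

```python
from typing import Tuple

REWARD_SECONDS = 0.5   # duration of the reward whoop (from the text)
PENALTY_SECONDS = 8.0  # duration of the penalty buzz (from the text)

# Declining a trial brings no points and no feedback delay.
DECLINE: Tuple[float, float] = (0.0, 0.0)


def expected_outcome(p_correct: float) -> Tuple[float, float]:
    """Expected (points, feedback seconds) for answering with accuracy p_correct."""
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must be a probability")
    points = p_correct * 1 + (1 - p_correct) * -1
    seconds = p_correct * REWARD_SECONDS + (1 - p_correct) * PENALTY_SECONDS
    return points, seconds


# A pure guess (hypothetical accuracy of .5) earns nothing on average
# but still costs feedback time that declining would avoid.
print(expected_outcome(0.5), DECLINE)
```

With a guess, the expected point gain is zero while the expected feedback delay is several seconds; declining also yields zero points but frees that time for further trials in a fixed-length session.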

Sparse-Middle-Dense (SMD) task

The participant’s task was to identify boxes that had pixel densities falling within the sparser, middle, or denser portion of the stimulus continuum. Eighteen trial levels—Level 1 (1,085 pixels) to Level 18 (1,470 pixels)—were designated Sparse and were rewarded in the context of sparse responses. Eighteen trial levels—Level 25 (1,665 pixels) to Level 42 (2,255 pixels)—were designated Dense and were rewarded in the context of dense responses. Six trial levels—Level 19 (1,496 pixels) to Level 24 (1,636 pixels)—were designated Middle and were rewarded in the context of middle responses. We deliberately made the middle response region narrower than the sparse and dense response regions. We did this to equate the middle response region with the levels of the stimulus continuum where humans typically make uncertainty responses (Smith et al., 2006; 1997; Zakrzewski, Coutinho, Boomer, Church, & Smith, 2014).
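The category boundaries of the two tasks can be summarized in a small sketch. The function names are illustrative; the level ranges are those given in the SUD and SMD task descriptions.

```python
def sud_category(level: int) -> str:
    """Rewarded category in the Sparse-Uncertain-Dense task (Levels 1-42)."""
    if not 1 <= level <= 42:
        raise ValueError("level must be between 1 and 42")
    return "Sparse" if level <= 21 else "Dense"


def smd_category(level: int) -> str:
    """Rewarded category in the Sparse-Middle-Dense task (Levels 1-42)."""
    if not 1 <= level <= 42:
        raise ValueError("level must be between 1 and 42")
    if level <= 18:
        return "Sparse"
    if level <= 24:
        return "Middle"
    return "Dense"


# Count how many levels fall into each SMD category.
counts = {}
for level in range(1, 43):
    name = smd_category(level)
    counts[name] = counts.get(name, 0) + 1
print(counts)
```

The narrow six-level Middle region straddles the SUD breakpoint at Levels 21–22, which is where uncertainty responses are expected to concentrate.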

The S and D icons were placed exactly as in the SUD task. The M icon was located below and between the S and D icons, exactly where the uncertainty icon was for the SUD task. Participants made their responses by pressing labeled keyboard keys. Correct and incorrect responses generated the same feedback as described in the SUD task. The M response also received this feedback.

Concurrent task

The stimuli for the concurrent task were digits that were presented top-left and top-right on the computer screen. The two digits varied in physical size as follows. One digit was presented in a large font within Turbo-Pascal 7.0—it was about 3 cm wide and 2.5 cm tall as it appeared on the screen. One digit was presented in a smaller font—it was about 1.5 cm wide and 1 cm tall on the screen. The digits were never equal in size—participants were always able to judge which digit was physically smaller or larger. The two digits varied in numerical size from 3 to 7. They were never equal in quantity—participants were always able to judge which digit was numerically smaller or larger.

On each concurrent-task trial, the two digits appeared top-left and top-right on the monitor. After 2 seconds, the digits were masked with white squares, then the digits and squares were cleared from the screen. Participants had to remember the digit-size and digit-quantity information until a memory cue appeared top-middle. The cue was “big size,” “big value,” “small size,” or “small value.” Participants were supposed to select the response icon under the former position of the physically or numerically bigger or smaller digit. For correct and incorrect responses, respectively, participants heard a computer-generated 0.5 s reward whoop or an 8 s penalty buzz. Participants gained/lost 2 points for each concurrent-task trial, and they saw text banners that said “Right number”/“Wrong number.” The next trial followed this feedback. The 2-point gain/loss helped participants focus effort and cognitive resources toward the concurrent task. We also motivated participants to optimize performance in the discrimination and concurrent tasks by awarding $10 prizes to the participants who earned the most points in each condition.
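The scoring rule for the concurrent task can be sketched as follows. The `Digit` class and `correct_side` function are hypothetical names introduced for illustration; the four cue strings and the left/right response positions follow the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Digit:
    value: int      # numerical value (3 to 7 in the text)
    is_large: bool  # True if shown in the physically larger font


def correct_side(left: Digit, right: Digit, cue: str) -> str:
    """Return 'left' or 'right' for a cue such as 'big size' or 'small value'."""
    # The text states the digits never matched in value or in physical size.
    if left.value == right.value or left.is_large == right.is_large:
        raise ValueError("digits must differ in both value and physical size")
    magnitude, dimension = cue.split()
    if magnitude not in ("big", "small") or dimension not in ("size", "value"):
        raise ValueError(f"unknown cue: {cue!r}")
    if dimension == "size":
        left_is_bigger = left.is_large
    else:
        left_is_bigger = left.value > right.value
    wants_bigger = magnitude == "big"
    return "left" if left_is_bigger == wants_bigger else "right"


# Example: a physically large 3 on the left, a physically small 7 on the right.
left, right = Digit(value=3, is_large=True), Digit(value=7, is_large=False)
for cue in ("big size", "big value", "small size", "small value"):
    print(cue, "->", correct_side(left, right, cue))
```

Because the cue is revealed only after the discrimination response, both dimensions of both digits must be held in memory, which is what makes the task a load on working memory.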

Training trials

Participants received 20 training trials that taught either the Sparse-Dense or Sparse-Middle-Dense discriminations. These trials randomly presented the easiest Sparse/Dense stimuli (Level 1, Level 42) in the case of the SUD discrimination, and the easiest Sparse/Middle/Dense stimuli (Level 1, Level 21, Level 42) in the case of the SMD discrimination. Participants in the UC and MC conditions also received 20 training trials on the concurrent task alone.

Test trials

Following the training phase(s), participants received discrimination trials that could vary in difficulty. Now, stimuli were chosen randomly from across the 42-level continuum. Now, too, the uncertainty response became available during discrimination trials for those participants in the SUD task. Participants in the non-concurrent conditions (UN and MN) received no simultaneous cognitive load. Participants in the concurrent conditions (UC and MC) experienced memory and discrimination trials interdigitated as follows. First, the memory digits were presented on the computer screen for two seconds, and then masked and erased. Second, the pixel box appeared on the screen along with the discrimination-response options and participants made their response, responding Sparse, Dense, Middle or Uncertain as allowed within their particular task assignment. Third, feedback for the discrimination trial was delivered. Fourth, the memory cue and the memory-response options were presented on the computer screen and participants made their response. Fifth, feedback for the memory trial was delivered. After that, this cycle of trials was repeated multiple times until the duration of the experimental session was equal to 52 minutes.

Modeling performance and fitting data

We instantiated formal models of the present tasks. Our models were grounded in Signal Detection Theory (MacMillan & Creelman, 2005). Signal Detection Theory assumes that performance in perceptual tasks is organized along an ordered series (a continuum) of psychological representations of changing impact or increasing strength. Here, the continuum of subjective impressions would run from clearly sparse to clearly dense. Given this continuum, Signal Detection Theory assumes that an objective event will create subjective impressions from time to time that vary in a Gaussian distribution around the objective stimulus level presented. This perceptual error is part of what produces errors in discrimination and part of what may foster uncertainty in the task. Finally, Signal Detection Theory assumes a decisional process by which criterion lines are placed along the continuum so that response regions are organized. Here, by the overlay of Sparse-Uncertain (SU) and Uncertain-Dense (UD) criteria, for example, the stimulus continuum would be divided up into Sparse, Uncertain, and Dense response regions.

Our models took the form of a virtual version of the tasks as humans in the present studies would experience them. We then placed simulated observers in those task environments for 10,000 trials.

The simulated observers experienced Perceptual Error. The value of Perceptual Error (that is, the standard deviation of the Gaussian distribution that governed misperception) was one free parameter in our model. On each trial, given some stimulus (Levels 1–42), simulated observers misperceived the stimulus according to this Gaussian distribution. Given a Perceptual Error of 4, for example, they could misperceive a Level 12 stimulus generally in the range of Level 8 to Level 16. This misperceived level became the subjective impression on which the simulated observer based its response choice for that trial.

The simulated observers were also given individually placed criterion points. The placements of the SU and UD criterion points, or the Sparse-Middle (SM) and Middle-Dense (MD) criterion points, defined three response regions for the simulated observer that determined its response choice to a subjective impression. The placements of SU and UD (or SM and MD) were two more free parameters that could be adjusted to optimally fit the data.
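A minimal sketch of such a simulated observer is given below. It is not the authors' implementation: the function names, the seed, the number of trials per level, and the exact comparison rule at the criterion points (impressions below the lower criterion are called sparse, impressions at or above the upper criterion are called dense) are assumptions made for illustration.

```python
import random


def simulate_response(level, perceptual_error, lower, upper, rng):
    """One trial: perturb the stimulus level, then apply the two criteria."""
    impression = rng.gauss(level, perceptual_error)  # subjective impression
    if impression < lower:
        return "sparse"
    if impression < upper:
        return "intermediate"  # uncertainty or middle response
    return "dense"


def response_proportions(perceptual_error, lower, upper,
                         trials_per_level=2000, seed=0):
    """Map each level (1-42) to (sparse, intermediate, dense) proportions."""
    rng = random.Random(seed)
    result = {}
    for level in range(1, 43):
        tally = {"sparse": 0, "intermediate": 0, "dense": 0}
        for _ in range(trials_per_level):
            tally[simulate_response(level, perceptual_error, lower, upper, rng)] += 1
        result[level] = tuple(tally[k] / trials_per_level
                              for k in ("sparse", "intermediate", "dense"))
    return result


# Hypothetical parameter values, chosen only to illustrate the shape of the curves.
profile = response_proportions(perceptual_error=4.0, lower=20.0, upper=23.0)
print(profile[1], profile[21], profile[42])
```

When the two criteria coincide, the intermediate region has zero width and the simulated observer never produces an intermediate response.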

To fit observed performance, we varied a set of parameters of the model (i.e., Perceptual Error, the placement of the lower criterion [SU, SM], and the placement of the upper criterion [UD, MD]). The simulated observer’s predicted performance profile was produced by finding its response proportions for the 42 stimulus levels under each parameter configuration. We calculated the sum of the squared deviations (SSD) between the corresponding observed and predicted data points. We minimized this SSD fit measure to find the best fitting parameter configuration. For this best fitting configuration, we also calculated a more intuitive measure of fit, the average absolute deviation (AAD). This measure represents the average of the deviations between observed and predicted response levels (with the deviations always signed positively). (For more information about the application of this model in studies of human and nonhuman animal uncertainty monitoring, see Smith et al., 2006; 2013.)
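The fitting procedure can be sketched as a grid search. This is an illustration under stated assumptions rather than the authors' code: predicted proportions are computed in closed form from the Gaussian distribution instead of by simulating 10,000 trials, the grid ranges are hypothetical, and the fit is computed over all three response curves.

```python
import math
from itertools import product

LEVELS = range(1, 43)


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def predict(sd, lower, upper):
    """Predicted (sparse, intermediate, dense) proportions for Levels 1-42."""
    rows = []
    for level in LEVELS:
        below_lower = phi((lower - level) / sd)
        below_upper = phi((upper - level) / sd)
        rows.append((below_lower, below_upper - below_lower, 1.0 - below_upper))
    return rows


def ssd(observed, predicted):
    """Sum of squared deviations between observed and predicted points."""
    return sum((o - p) ** 2
               for orow, prow in zip(observed, predicted)
               for o, p in zip(orow, prow))


def aad(observed, predicted):
    """Average absolute deviation between observed and predicted points."""
    devs = [abs(o - p)
            for orow, prow in zip(observed, predicted)
            for o, p in zip(orow, prow)]
    return sum(devs) / len(devs)


def fit(observed, sds=range(1, 16), criteria=range(1, 43)):
    """Return (ssd, sd, lower, upper) for the best configuration on the grid."""
    best = None
    for sd, lower, upper in product(sds, criteria, criteria):
        if lower > upper:
            continue  # the lower criterion cannot sit above the upper one
        score = ssd(observed, predict(sd, lower, upper))
        if best is None or score < best[0]:
            best = (score, sd, lower, upper)
    return best
```

As a sanity check, data generated by `predict` with known parameters should be recovered by `fit`; with real data, `aad` would then be reported for the best-fitting configuration.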

Results

Overall statistical analysis: uncertainty-middle responding

Participants in the UN, UC, MN, and MC conditions completed on average 927, 345, 647, and 286 discrimination trials, respectively. Participants in the concurrent conditions completed fewer discrimination trials than those in the non-concurrent conditions because they also performed the working memory task. The average proportions of intermediate (uncertain or middle) responding for the four conditions were .11, .02, .14, and .25, respectively.

To statistically explore participants’ uncertain and middle responding across the four conditions, we conducted a General Linear Model with Level (1–42) as a within-participant variable and Task (SUD and SMD) and Condition (non-concurrent and concurrent) as between-participants variables. Figure 2 shows the four response curves overlain, to help readers interpret the effects. All the statistical analyses had an alpha level of .05, two-tailed.

Figure 2. Mean proportion of middle or uncertainty responses (black circles), sparse responses (open diamonds), and dense responses (open triangles) for participants in each condition of the first experiment. A: Uncertain No-Concurrent, B: Uncertain Concurrent, C: Middle No-Concurrent, D: Middle Concurrent.

There was a main effect of trial level, F (41, 3936) = 43.19, p < .001, η_p² = .31. This was due to the increase in the use of the intermediate responses (uncertain or middle) for the trial levels near the midpoint of the stimulus continuum. There was also a main effect of task, F (1, 96) = 77.67, p < .001, η_p² = .45. Participants in the SUD and SMD tasks used their intermediate responses at rates of .0575 and .2003, respectively. There was a task by condition interaction, F (1, 96) = 37.41, p < .001, η_p² = .28. Planned comparisons revealed that concurrent load significantly decreased uncertainty responding for the most difficult trial levels (levels 19 to 24), t (48) = 3.41, p = .001, Cohen’s d = .959, whereas it increased middle responding, t (48) = 3.81, p < .001, Cohen’s d = 1.08. Finally, there were milder, intuitive interactions involving task by level, F (41, 3936) = 17.38, p < .001, η_p² = .15, and condition by level, F (41, 3936) = 2.02, p < .001, η_p² = .02. These interactions signify that the response curves in Figure 2 were differentially affected across levels by task (SUD vs. SMD) and by condition (concurrent vs. non-concurrent) because the task- and condition-dependent differences primarily affected the middle levels. There were no other significant main effects or interactions, all Fs < 2.
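The reported effect sizes can be cross-checked against the F statistics using the standard identity η_p² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the values reported above; the function name is illustrative.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error) as reported in the text
reported = [
    ("level", 43.19, 41, 3936),
    ("task", 77.67, 1, 96),
    ("task x condition", 37.41, 1, 96),
    ("task x level", 17.38, 41, 3936),
    ("condition x level", 2.02, 41, 3936),
]
for label, f_value, df1, df2 in reported:
    print(f"{label}: {partial_eta_squared(f_value, df1, df2):.2f}")
```

The error degrees of freedom are also consistent with the design: 100 analyzed participants in 4 groups give 96, and 41 × 96 gives 3936.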

Concurrent task performance

Performance on the memory task was very high and did not differ based on which task participants performed, t (50) = 1.05, p = .29. The average proportions correct for the SUD and SMD tasks were .91 (SD = .08) and .93 (SD = .05), respectively, Cohen’s d = 0.29.

Model fits

We used Signal Detection Theory to model group performance for each of the four conditions. The best-fitting predicted performance profiles for the four conditions are shown in Figure 3. The model yielded very good fits. The SSD measures of fit were 0.0789, 0.0581, 0.0985, and 0.1418 for the UN, UC, MN, and MC groups, respectively. The intuitive measures of fit (AAD) for all four groups were less than .03 (i.e., .0207, .0161, .0207, and .0238). This means that the model’s predictions had an error of less than 3% per data point on average.

Figure 3. The best-fitting predicted profile for the four conditions of the first experiment. A: Uncertain No-Concurrent, B: Uncertain Concurrent, C: Middle No-Concurrent, D: Middle Concurrent. The black circles illustrate the predicted proportions of intermediate (uncertainty or middle) responding. The open diamonds and open triangles show the predicted proportions of sparse and dense responding, respectively.

The model estimated that participants in the UN condition placed their SU and UD criteria at levels 20 and 23, whereas participants in the UC condition placed both criteria at level 20. This means that the UC group did not have an uncertainty region. They stopped responding uncertain. For the MN and MC groups, the model estimated that participants placed their SM and MD criteria at levels 19 and 24, and levels 14 and 24, respectively. Thus, the concurrent load increased the middle region by 5 steps. The modeling confirms the statistical findings that the concurrent load affected uncertainty and middle responding in opposite ways. It eliminated uncertainty responding but increased middle responding.

To better understand whether this effect was due to differences in participants’ ability to discriminate the items across the continuum, we looked at the Perceptual Error for each of the four groups. The Perceptual Error for UN, UC, MN, and MC were 9, 8, 8, and 9, respectively. This means that each stimulus could have been misperceived by 8 or 9 steps. For example, given a Perceptual Error of 8, a stimulus of level 10 could have been misperceived as any subjective stimulus impression generally in the range of 2 to 18 of the 42-level continuum. The similarity in Perceptual Error across conditions suggests that concurrent load did not change participants’ perceptual processes.

Discussion

The results of Experiment 1 demonstrated that the concurrent load significantly reduced the use of the uncertainty response whereas it increased the use of the middle response. These results provide support for the hypothesis that the uncertainty response is not simply a perceptual-middle response, although both of them may rely on working memory resources. Most importantly, the decrease in uncertainty responding is consistent with the findings of Smith et al. (2013) showing a similar pattern in rhesus monkeys. The similarity between the results of the present experiment and those from Smith et al. (2013) may suggest that uncertainty monitoring in humans and monkeys taps similar working memory intensive processes.

The drop in uncertainty responding observed in the current experiment may reflect participants’ inability to accurately monitor their mental states when they don’t have sufficient cognitive resources available to employ. Or, it may reflect participants’ choice not to monitor their mental states given that they know it is a cognitively demanding process. Regardless of whether the drop in uncertainty responding is caused by a deliberate strategy or by unintentional monitoring failure, the results suggest that uncertainty monitoring is working memory intensive for humans, as it is for monkeys, even though interpreting ease-of-processing in memory monitoring (FOKs) is not (Schwartz, 2008).

In contrast to uncertainty responding, the proportion of middle responses increased with concurrent load. Participants broadened the middle region by incorrectly assigning sparse and dense stimuli to the middle category. The increased middle responding with the introduction of concurrent load may reflect decisional processes that change based on the availability of working memory resources. For instance, participants who were tested with the concurrent load may not have noticed as easily as the no-load participants that the middle region was smaller than the sparse and dense regions. Thus, their representations of a middle region may have been broader than the actual objective region because they assumed equal lengths for the regions (sparse, middle, and dense) of the continuum. The no-load participants had greater working memory resources, which would have allowed them to test hypotheses about why they were initially getting middle responses wrong. This would have allowed them to understand that they needed to use the middle response more conservatively than originally assumed, reducing their middle responding and confining it to a narrower region. Perhaps the participants’ inability to easily consult their mental states of uncertainty drives both the decrease in uncertainty responding and the increase in middle responding, because participants cannot use their feelings of uncertainty about the outer edges of the middle region to drive more conservative responding.

It is also possible that the concurrent load affected middle responding because the process of categorizing middle stimuli is intrinsically very difficult. There are only six stimulus-levels that belong to the middle category and for this reason even the middle-most middle stimulus (level 21) is difficult to categorize because this stimulus is only a few steps away from the SM and MD boundaries. The same is not true for sparse and dense categories because each of them includes 18 stimulus-levels. Thus, even if participants misperceive a stimulus of level 2 by 8 steps, their response would still be correct because a stimulus of level 10 is also sparse. On the other hand, if participants misperceive a middle stimulus of level 21 as a level 29, their response would be incorrect because a stimulus of level 29 is dense. Given that, middle responding may require considerably more careful decisional processes than sparse and dense responding, and therefore may require more working memory to choose to respond more conservatively.
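The asymmetry described above can be illustrated with a small arithmetic sketch. It is only an illustration, not the authors' model or analysis: the boundary values (levels 1 to 18 sparse, 19 to 24 middle, 25 to 42 dense) are an assumption inferred from the category sizes given in the text, and the function names are hypothetical. The sketch also treats the Perceptual Error of 8 as a hard limit of plus or minus 8 steps, which simplifies the "generally in the range" wording used earlier.

```python
# Illustrative sketch only; boundaries are ASSUMED from the text
# (18 sparse levels, 6 middle levels, 18 dense levels on a 42-level continuum).
N_LEVELS = 42
SPARSE_MAX = 18   # assumed: levels 1-18 are sparse
MIDDLE_MAX = 24   # assumed: levels 19-24 are middle; 25-42 are dense


def category(level):
    """Return the assumed objective category of a stimulus level."""
    if level <= SPARSE_MAX:
        return "sparse"
    if level <= MIDDLE_MAX:
        return "middle"
    return "dense"


def tolerated_shifts(level, max_error=8):
    """List the misperception shifts (in steps) that stay on the continuum
    and still land in the same category as the true level."""
    true_category = category(level)
    shifts = []
    for shift in range(-max_error, max_error + 1):
        perceived = level + shift
        if 1 <= perceived <= N_LEVELS and category(perceived) == true_category:
            shifts.append(shift)
    return shifts


# A sparse stimulus of level 10 tolerates every shift of up to 8 steps,
# whereas the middle stimulus of level 21 tolerates only a few.
sparse_tolerance = tolerated_shifts(10)
middle_tolerance = tolerated_shifts(21)
```

Under these assumptions, every shift of up to 8 steps leaves a level-10 stimulus in the sparse category, whereas only shifts of -2 to +3 keep a level-21 stimulus in the middle category, which is the sense in which middle responding leaves little margin for perceptual error.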

In many respects the present findings are similar to those found with rhesus monkeys, and the methodologies in the human and monkey experiments have many similarities. Therefore, there is reason to suggest that some uses of the uncertainty response are working memory intensive for humans, as they are for monkeys. Our findings also complement those of Zakrzewski et al. (2014), who showed that uncertainty responses, but not primary perceptual responses, were reduced by strict response deadlines. Thus, uncertainty responses, at least in some uses, may be more working memory intensive and more time intensive than perceptual responses.

However, there is an important difference between the monkey experiments and the experiment described here. The monkeys had significant experience with the uncertainty and middle responses before the concurrent load was introduced to the task. The humans in the present study had no experience with the uncertainty response prior to test, but they were familiarized with the middle response beforehand. As a result, it is possible that the differential training with these two responses interacted with the effect of concurrent load. Participants in the uncertainty condition had to learn the functionality of the uncertainty response while they were under a memory load. This was not true for the perceptual responses, including the middle response, which were practiced in a short training session before the concurrent load was introduced. To clarify this issue, we conducted two other experiments.

Experiment 2

In Experiment 2, we carefully equated the initial experience with the middle and uncertainty responses so that both groups had the same experience with the responses and clearly knew their functions before testing. We did this to rule out the possibility that the dissociation between uncertainty and middle responding observed in Experiment 1 was due to differential training with these responses.