PLOS One. 2020 Dec 9;15(12):e0243532. doi: 10.1371/journal.pone.0243532

Captivated by thought: “Sticky” thinking leaves traces of perceptual decoupling in task-evoked pupil size

Stefan Huijser*, Mathanja Verkaik, Marieke K van Vugt, Niels A Taatgen
Editor: Myrthe Faber
PMCID: PMC7725397  PMID: 33296415

Abstract

Throughout the day, we may sometimes catch ourselves in patterns of thought that we experience as rigid and difficult to disengage from. Such “sticky” thinking can be highly disruptive to ongoing tasks, and when it turns into rumination constitutes a vulnerability for mental disorders such as depression and anxiety. The main goal of the present study was to explore the stickiness dimension of thought, by investigating how stickiness is reflected in task performance and pupil size. To measure spontaneous thought processes, we asked participants to perform a sustained attention to response task (SART), in which we embedded the participant’s concerns to potentially increase the probability of observing sticky thinking. The results indicated that sticky thinking was most frequently experienced when participants were disengaged from the task. Such episodes of sticky thought could be discriminated from neutral and non-sticky thought by an increase in errors on infrequent no-go trials. Furthermore, we found that sticky thought was associated with smaller pupil responses during correct responding. These results demonstrate that participants can report on the stickiness of their thought, and that stickiness can be investigated using pupillometry. In addition, the results suggest that sticky thought may limit attention and exertion of cognitive control to the task.

Introduction

Background

In response to pressing concerns and unreached goals, we may catch ourselves in thoughts that we feel are difficult to disengage from. For example, we may be absorbed in thinking about a recently received paper rejection, while we should actually be reading this article. In general, task-unrelated thought is referred to as mind wandering [1]. However, in cases such as the paper rejection, these thoughts may not leave us alone, making it very difficult to concentrate on our immediate tasks. In this case, one can call these thoughts sticky [2,3]. An extreme form of such sticky thought is rumination, a rigid and narrow-focused thought process that is hard to disengage from and often negative in valence and self-related [4]. In general, rumination causes individuals to be unable to concentrate and devote their attention to tasks at hand because attention is focused internally instead [2]. However, in contrast to depressive rumination, sticky thoughts can also have a positive valence, for example when we are caught up in a pleasant fantasy that we do not want to let go of, or when desirous thoughts of a delicious cookie keep recurring in our minds [5,6]. Another term for sticky thought is perseverative cognition. Perseverative cognition has been associated with activation of the physiological stress system, and has been proposed to play a key role in the onset and maintenance of depression [7] and anxiety [8,9]. Finally, sticky thought is closely related to the concept of constrained thinking [10,11]. Constrained thinking refers to an experience in which thoughts do not move freely but instead are focused on a narrow set of content. It differs from our concept of sticky thinking in the question posed to the participant: “sticky” refers to the participant’s experience that the current stream of thought is difficult to drop, whereas “constrained” refers to the experience of a stream of thought that is, deliberately or not, restricted to a narrow set of content.

Yet, sticky thoughts, especially in their non-clinical form, could also have advantages for the individual. By temporarily shielding thought from external distractions, they can help the individual to work on future goals [12–15]. When goals remain unattained and concerns unresolved, however, such thoughts may become increasingly intrusive, disrupting our everyday functioning [2,3,16]. Therefore, sticky thoughts may have important effects on our performance in everyday tasks.

Sticky thought has mostly received attention in the literature on psychopathology. Studies have demonstrated that perseverative cognition has measurable negative effects on somatic health (for a review, see [17]). For example, rumination and worry are associated with prolonged activation of the immune system [18], decreases in heart rate variability [19], and increases in blood pressure [20]. Hence, sticky thoughts may not only be disruptive to task performance, but may also pose a risk for developing mental and somatic health issues.

Examining stickiness of thought with self-report and task performance

Despite the known disruptive effects of sticky thought on task performance, we have limited understanding of the (attentional) processes that are associated with sticky thought and how those differ from non-sticky thought. One reason for this is that sticky thought is challenging to detect in the context of an experiment [1]. Sticky thinking is largely a covert process that leaves few directly observable signs. Indeed, related processes such as perseverative cognition and rumination have mostly been investigated using self-report questionnaires that measure trait rumination or worry (i.e., the general tendency to engage in sticky thinking), or alternatively, by asking the participant to report on the frequency of ruminative or worry episodes retrospectively. Correlating such measures with task performance [3,21,22] and neurocognitive measures (e.g., [19]) has yielded valuable insights (see [7]). For example, Beckwé et al. [22] found that in an exogenous cue task (ECT), participants with a strong tendency to ruminate had longer reaction times following invalid negative personality trait cues, suggesting that such participants experience more difficulty disengaging from negative personality trait cues, likely because these cues set off a train of negative self-related thinking. Aside from this cognitive inflexibility, studies with cardiac measures have shown that rumination is associated with autonomic rigidity, demonstrated by persistently low heart rate variability [19,20]. Despite these insights from questionnaire-based measures of sticky thoughts, self-report arguably lacks precision, given limitations in memory that bias reporting [23,24] and participants’ tendency to produce socially desirable answers. Furthermore, because questionnaires only provide a single after-the-fact measure, it is not possible to compare sticky with non-sticky thought within an individual.

A different, and potentially better, method to measure sticky thought is the thought probe. Thought probes are short self-report questionnaires that are embedded in a task to measure the content and dynamics of current thought at various points in time during an ongoing task [25,26]. They have the advantage that experiences can be caught close to when they arise. Furthermore, they allow for repeated measures of experienced thought, making it possible to investigate changes in thought content over the course of an experiment. For example, Unsworth and Robison [27] used thought probes to investigate how different attentional states, such as mind wandering and external distraction, correlated with task performance and pupil size measures in a sustained attention task. The researchers observed that task performance decreased and pupil size became smaller with time-on-task. They also found that reports of mind wandering became more frequent as the experiment progressed. This demonstrates that time-on-task influences are important to consider when studying self-generated thinking.

So far, we are familiar with only one study that used thought probes to investigate sticky thought. Van Vugt and Broers [2] used thought probe responses in conjunction with task performance measures to investigate how self-reported stickiness of thought was associated with the probability of being disengaged from the task (i.e., off-task). In addition, the researchers examined how self-generated thought and its stickiness affected performance. They asked participants to perform a variation of a go/no-go task referred to as the sustained attention to response task (SART; [28]). This task is suitable for studying self-generated thought because it is slow-paced and induces habitual responding, therefore allowing self-generated thought to occur. In line with their expectation, self-reported stickiness of thought increased the probability of being disengaged from the task and negatively influenced performance. Stickiness of thought was associated with more variable response times. Previous research has also indicated that variability in response times may be a relevant correlate for self-generated thought (see e.g., [29–31]). The increase in variability may indicate that participants allocate less attention to the task, resulting in reactive and more variable responding [32]. All in all, this study demonstrated that stickiness of thought is a relevant dimension of self-generated thought. Furthermore, it indicated that people can meaningfully report on the stickiness of their thought.

Correlates of self-generated thought and stickiness in pupillometry

In addition to task performance, neurocognitive measures can be used to detect sticky thought, and can provide insight into the processes and mechanisms associated with it. In this study, we will use pupillometry to gain insight into sticky thought. Pupil size is an interesting measure because it is relatively unobtrusive and easy to record. Furthermore, research has indicated that lapses of attention can be detected in various pupil size measures [33,34]. Therefore, we may also be able to detect differences in pupil size depending on the stickiness of thought.

Pupil size is typically measured on two temporal scales, reflecting different cognitive or neural processes. The most common of these measures is the task-evoked response in pupil size. The task-evoked response is a transient increase in pupil size following the processing of a task event, peaking at around 1s after event onset [35]. The magnitude of this response has been demonstrated to depend on the amount of attention, cognitive control, and cognitive processing required by the task [36–38]. Research has consistently found that when we engage in self-generated thought, our evoked responses in pupil size are smaller [33,39–41]. Smallwood and colleagues [15] interpreted this smaller response in pupil size as evidence that external processing is being inhibited during self-generated thinking, so-called perceptual decoupling.

In addition to stimulus-evoked pupil responses, pupil size is also measured during task-free periods, referred to as baseline or tonic pupil size. Baseline pupil size is proposed to reflect locus coeruleus-norepinephrine (LC-NE) system functioning, which has been associated with controlling overall arousal levels and the tendency to seek novelty [27, see 42,43]. Large baseline pupil size has been correlated with high tonic LC-NE firing, indicating a state of over-arousal and a tendency to explore new behaviors. On the other hand, smaller baseline pupil sizes have been related to low tonic firing, under-arousal, and inactivity. Interestingly, research has proposed that the relationship between baseline pupil size, task-evoked pupil size, and task performance can be described with an adaptive gain curve (see Fig 1; see e.g., [37,44]). Task performance is optimal at intermediate levels of baseline pupil size, when task-evoked responses are maximal. Task performance decreases when baseline pupil size is either larger or smaller.

Fig 1. Adaptive gain curve.


The adaptive gain curve describes the relationship between baseline pupil size and task-evoked pupil size [43,44]. Task-evoked responses are maximized at intermediate levels of baseline pupil size, but decrease in magnitude when the baseline is smaller or larger. The curve also makes predictions about task performance. Performance on a task is optimal at intermediate baseline pupil size when task-evoked responses are maximal. Task performance decreases when baseline pupil size is smaller or larger than the intermediate level.

Since stickiness (i.e., the difficulty in disengaging from thought) is a novel topic, no studies have directly investigated how stickiness is reflected in baseline and task-evoked pupil size. Nonetheless, predictions can be made based on related research. Given the disruptiveness of sticky thought to ongoing activities, we may expect that sticky thought, similar to self-generated thinking, is associated with smaller task-evoked responses in pupil size. As predicted by adaptive gain (see Fig 1 above), a smaller task-evoked response in pupil size during episodes of sticky thought would imply that the thought process is associated with either smaller or larger than average baseline pupil size. Which of the two, however, is open to debate. In clinical samples, Siegle et al. [45] found that rumination was associated with larger baseline pupil sizes. The researchers hypothesized that this larger baseline pupil size reflected sustained emotional processing [46]. In contrast, Konishi et al. [47] found that in non-clinical samples, negative and intrusive thoughts were associated with smaller baseline pupil size [48]. One recent study investigated how the “intensity” of experienced thought was reflected in baseline pupil size, a dimension perhaps comparable to the stickiness of thought [41]. However, that study found no effect of thought intensity on baseline pupil size. For self-generated thought more generally, the literature has also not reached consensus on where the thought process lies on the adaptive gain curve [15,40,49]. Given that sticky thought is proposed to develop from thinking about pressing concerns and unreached goals [13,16], one might think that sticky thought is associated with high arousal, and therefore larger than average baseline pupil size. On the other hand, one might also think that sticky thought results from a state of inertia, reflected in smaller baseline pupil size.

The current study

The main goal of the present study was to investigate how stickiness of thought is reflected in task performance and pupillary measures (i.e., baseline and task-evoked response in pupil size). We asked participants to perform a variation of the SART, in which we embedded the personal concerns of participants in the task to potentially increase the tendency for sticky self-generated thought [see 50]. We included periodic thought probes in the SART to measure what participants were currently thinking about (i.e., attentional state) and how difficult it was to disengage from the thought (i.e., stickiness of thought).

In line with previous work, we expected that sticky thought would be associated with being more disengaged from the task. Since being disengaged from the task has been found to reduce no-go accuracy, speed up response times, and increase RTCV, we predicted that no-go accuracy would decrease, response times would be faster, and RTCV would increase with reported stickiness of thought.

With respect to pupillary measures, we predicted that stickiness of thought would be associated with smaller task-evoked responses in pupil size, indicating reduced attention to the SART. Given that no research has investigated the influence of stickiness of thought on baseline pupil size, and given the inconsistency of previous studies that tried to relate baseline pupil size to self-generated thought, we formulated no prior hypotheses for that measure.

Materials and methods

Participants

We recruited 34 native Dutch speakers for this experiment (20 female; M age = 22.7, SD age = 2.7). Participants were recruited from a paid research participant pool on Facebook, as well as from the Artificial Intelligence Bachelor and Master programs at the University of Groningen. We screened the participants for normal or corrected-to-normal vision prior to testing. The experiment was conducted in accordance with the Declaration of Helsinki and approved by the Ethical Committee of Psychology (ECP) at the University of Groningen (research code: pop-015-170). Written informed consent was obtained from each participant at the start of the laboratory session.

Materials

Questionnaire session

Participants were requested to fill out three online questionnaires prior to the experiment. Since we wanted to maximize the probability of observing sticky thinking, which is known to often be related to concerns and worries, we adopted the current concerns manipulation by McVay and Kane [50]. For this manipulation, individual current concerns were collected using an online version of the Personal Concerns Inventory (PCI; adapted from [51]). In this questionnaire, participants were asked to write down short statements about current goals or concerns in nine different areas: 1) home and household matters, 2) employment and finances, 3) partner, family, and relatives, 4) friends and acquaintances, 5) spiritual matters, 6) personality matters, 7) education and training, 8) health and medical matters, and 9) hobbies, leisure, and recreation. For every current goal or concern, participants were asked to rate its importance on a scale from one to ten, and to indicate a time frame in which the goal/concern was expected to be accomplished or resolved. Participants were encouraged to think about goals or concerns that were relevant in the coming year. In addition to the PCI, the Behavioral Inhibition System/Behavioral Approach System scales (BIS/BAS; [52]) and the Habit Index of Negative Thinking (HINT; [53]) were used as distractor questionnaires to make the goal of our study less obvious to our participants. The PCI and BIS/BAS questionnaires were administered in Dutch (translated from English); the HINT was administered in English (its original language), given that no validated translation was available.

Experimental session

The SART in this experiment was based on the task used by van Vugt and Broers [2] and McVay and Kane [50]. Our SART included 720 Dutch words as stimuli that were presented in black. The majority of words were lower-case go stimuli (n = 640, 89% of total set), while only a small set were upper-case no-go stimuli (n = 80; 11% of total set). Participants were instructed to press a button as fast as they could on go stimuli, but to withhold a response on the infrequent no-go stimuli. All stimuli were presented centrally against a grey background.

Similar to these earlier works, we embedded participants’ personal current concerns in the SART, along with the current concerns of another participant as a control (i.e., other concerns). We selected two personal concerns for each participant based on the PCI answers, and two ‘other’ concerns that were distinctly different from their personal concerns. Each current concern was translated into a triplet of words. For example, if a participant reported (A), this was translated into (B).

  1. “Er zijn nog wat dingen die ik moet voorbereiden voordat ik kan beginnen met een tussenjaar.”

    “There are still some things I need to arrange before I can start taking a gap year.”

  2. pauze loopbaan prepareren

    prepare break career

We selected the two personal concerns with the highest importance ratings. Whenever two concerns had the same importance rating, we selected the more distinctive concern. Concerns that were too common or general were avoided. Concern words were always go stimuli.

The stimulus words that were not part of the personal/other concern triplets were selected from the Dutch word frequency database SUBTLEX-NL [54]. This database contains word frequency values based on film and television subtitles. We selected the stimulus words based on the Lg10CD variable. This variable is a measure of the contextual diversity of a word, reflecting the number of films or television shows in which it occurred. A validation study with a lexical decision task showed that the Lg10CD variable explained the most variance in task performance (i.e., accuracy and response time; see [54]). The same study also showed that the SUBTLEX-NL database explained 10% more variance than the commonly used CELEX database [55]. Before selecting the word stimuli, we first discarded the least and most frequent words from the database. Thereafter, 312 words were selected with a Lg10CD value around the mean. We removed and replaced selected stimuli that were numbers, non-words, or high-arousal words.

We measured the occurrence of self-generated thought and the stickiness of thought by periodically including thought probes in the task. Thought probes consisted of two questions (Fig 2). The first question was adopted from Unsworth and Robison [27] and addressed the current thought content or attentional state. This question differentiated six types of attentional state: 1) on-task focus, 2) task-related interference (TRI), 3) concern related thought, 4) external distraction, 5) mind wandering, and 6) mind blanking/inattentiveness. The second question was adopted from van Vugt and Broers [2] and asked how “sticky” the current thoughts were. Stickiness was measured as thought being 1) very sticky, 2) sticky, 3) neutral, 4) non-sticky, and 5) very non-sticky. We included 48 thought probes in the experiment.

Fig 2. Thought probe questions.


An English translation of the thought probe questions used in the experiment. The first (left) question was used to measure attentional state, the second (right) question to measure stickiness of thought.

It is relevant to note that the second ‘stickiness’ question has only been used once in previous research (see [2]), while the question on attentional state (and similar counterparts) has been used more frequently. Therefore, the reliability and validity of the measure cannot be guaranteed. However, there are indications that provide confidence in the reliability and validity of the stickiness question. First, the significant differences in task performance across the different levels of stickiness reported by van Vugt and Broers do indicate that participants are able to report on the stickiness of their thought with similar accuracy to other thought responses. Furthermore, Mills et al. [11] showed that participants’ assessment of the extent to which their thoughts were constrained, a concept similar to our stickiness, correlated significantly with external reviewers’ assessments.

Apparatus and set-up

Participants completed the PCI, HINT, and BIS/BAS questionnaires online, prior to coming to the lab, using Google Forms. The SART was performed individually in the lab, which contained a desk on which a computer, monitor, eye tracker, and head-rest were located. Pupil size and gaze position of the dominant eye were recorded at a sampling rate of 250 Hz using an EyeLink 1000 eye tracker from SR Research. The experiment was programmed in PsychoPy (version 1.83.04; [56]) and run on a Mac mini running Windows 7. The stimuli were presented on a 20-inch LCD monitor with a resolution of 1600x1200 pixels (4:3 aspect ratio) and a refresh rate of 60 Hz.

Procedure

Questionnaire session

Following registration for the experiment, participants received an email with a single link to the three online questionnaires. Participants started with the HINT, followed by the BIS/BAS and, finally, the PCI. They were instructed to complete the questionnaires no later than the day before the laboratory session. After the questionnaires were filled out and before the experimental session, we selected the current concerns from the answers on the PCI, as described in the Materials: Experimental session section. The selected concerns, together with the concerns of another participant, were subsequently embedded in the stimulus set of the respective participant.

Experimental session

The experimental session started with setting up the eye tracker. Participants were seated in front of the display computer and monitor, eye tracker, and head-rest. The head-rest was adjusted to the height of the participant. We performed a nine-point calibration and separate validation using the eye tracker software. The calibration and validation procedure was performed for the dominant eye of the participant, or in some cases the other eye if that provided a better signal. Following calibration and validation, the instructions for the experiment were presented on the screen. The instructions on how to perform the SART were presented first, including one example of a go and a no-go trial. Afterwards, participants were informed that they would be periodically asked to report on their current thoughts. The questions for attentional state and stickiness of thought were presented on the screen, including the instructions on how to report their answer. The participants were not otherwise instructed or trained on how to use the thought probes but were invited to ask questions at any time. A short practice session followed the instruction phase. This practice session consisted of ten SART trials (including one no-go trial) and one thought probe. The practice session included no trials reflecting a personal or other concern. After practice, the experiment started and the eye tracker began recording.

Each trial (see Fig 3, bottom) started with an inter-trial interval (ITI) of variable duration between 1500 and 2100 ms. During the ITI, a fixation cross consisting of the ‘+’ symbol was presented centrally on the screen. The ITI was followed by the presentation of the stimulus word for 300 ms. Go stimuli were presented in lower-case, whereas no-go stimuli were presented in upper-case. Participants were instructed to only respond on go trials (as fast as possible) by pressing the ‘m’ key on the keyboard and to withhold a response on no-go trials. After stimulus presentation, a mask (‘XXXXXXXX’) was presented for 300 ms followed by a response interval of 3000 ms marked by a ‘+’ symbol. Pupil responses were recorded during the stimulus, mask, and response intervals. Once a participant responded during the mask or response interval, the experiment immediately moved on to the next ITI.

Fig 3. Task overview.


A series of trials (top) and a single trial (bottom). Whenever a concern triplet was presented (red boxes), it was followed by four go trials (blue boxes), one no-go trial (green box), and one thought probe. Each trial, go or no-go, started with a variable inter-trial interval (ITI). Thereafter, the stimulus was presented in lowercase for go trials and uppercase for no-go trials. The stimulus was followed by a mask and then a response interval, both of which were cut short by a response. Whenever the participant responded during the mask or the response interval, the experiment immediately proceeded with the next trial (i.e., the next ITI was drawn).

The 720 trials in the experiment (640 go; 80 no-go) and 48 thought probes were equally distributed across eight blocks of 90 trials (80 go; 10 no-go) and six thought probes each. All participants saw the same (no concern) stimulus words, but in a random order. Each block consisted of two similar sequences of 45 stimulus words and three thought probes. The only difference between the two sequences in a block was the concern condition: one sequence contained a personal concern triplet, whereas the other contained an other concern triplet. The order was counterbalanced across the experiment. Furthermore, each block contained only one of the two personal and other concerns; which of the two was selected alternated between blocks. When a concern triplet was presented, the order of the trial types (i.e., go–personal concern, go–other concern, go–no concern, no-go) and thought probes was fixed. This order was based on the experiment of McVay and Kane [50]. As shown in Fig 3 (top), concern triplets were always followed by four go (no concern) trials, one no-go trial, and one thought probe. The thought probe questions always immediately followed the no-go trial to ensure that the reported thought content and its stickiness could be reliably attributed to the trials before it. We are aware that a limitation of this design is that participants may confabulate their answer to the thought probe as being off-task when an error has been made on the no-go trial. Nevertheless, since this is the procedure used in many prior studies on which we based our work, we kept this design.

Data analysis

Preprocessing of eye tracking recordings

Before analysis, we first removed pupil size measurements associated with blinks and other artifacts. Blinks were detected using the eye tracker software. We removed the pupil size measurements marked as a blink, including 100 ms before and after the event. In addition, we removed sudden upward or downward jumps. Jumps were identified by first z-scoring the pupil size timeseries for each participant individually. Subsequently, we marked pupil size measurements that had an absolute difference of 0.05 (in z-units) from the previous measurement (i.e., 4 ms earlier at a 250 Hz sample rate), including 20 ms before and after the observation. We then visually inspected the marked segments of the data that would be removed with this cut-off. We concluded that this cut-off was sensitive enough to remove the jumps, but not so sensitive that it would also discard ‘normal’ increases in pupil dilation. In total, we discarded 12.2% of the pupil size measurements within SART trials, with percentages ranging from 1.7% to 29.4% across individual participants. Trials with more than 25% discarded/missing data were removed completely, resulting in the removal of 11.1% of the trials (range = 0.3%–57.3%). We downsampled the data to 50 Hz, taking the median pupil size for each time bin. We did not interpolate the data, since our analysis methods (generalized additive mixed models and linear mixed-effects models; see Statistical analysis) can deal with missing data. After downsampling, we segmented the pupil size measurements into timeseries for individual trials, ranging from 500 ms before stimulus onset to 2000 ms after onset. This time window was chosen to fully capture the pupil response to the task stimuli, while preventing overlap between the segments of neighboring trials.
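To make the preprocessing concrete, the fragment below gives a minimal R sketch of the jump-removal and downsampling steps described above. It assumes a data frame samples with columns subject, time (ms), and pupil; all names are illustrative and the sketch does not correspond line-for-line to the released analysis code (for simplicity, it ignores margins spilling across participant boundaries).

    # Minimal sketch of jump removal and downsampling (assumed column names)
    library(dplyr)

    samples <- samples %>%
      group_by(subject) %>%
      mutate(pupil_z = as.numeric(scale(pupil)),             # z-score per participant
             jump = abs(pupil_z - lag(pupil_z)) > 0.05) %>%  # flag sudden jumps
      ungroup()

    # Remove flagged samples plus a 20 ms margin (5 samples at 250 Hz) per side
    idx <- which(samples$jump)
    margin <- unique(unlist(lapply(idx, function(i) (i - 5):(i + 5))))
    samples$pupil[margin[margin >= 1 & margin <= nrow(samples)]] <- NA

    # Downsample to 50 Hz by taking the median pupil size in 20 ms bins
    samples <- samples %>%
      mutate(bin = floor(time / 20) * 20) %>%
      group_by(subject, bin) %>%
      summarise(pupil = median(pupil, na.rm = TRUE), .groups = "drop")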

Eye tracking measures

We calculated the baseline pupil size by first taking the mean of the pupil size measurements in the window of 500 ms before stimulus onset. The baseline pupil size was then determined by z-scoring these means for each participant individually. Z-scoring the baseline values sets the grand average for each participant at zero, thereby removing individual differences in pupil size. The task-evoked pupil size was obtained by subtracting the (non-transformed) baseline of each trial from the pupil size measurements in that trial.
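As an illustration, the following R sketch computes both pupil measures from segmented trial data; the data frame trials (columns subject, trial, time relative to stimulus onset in ms, pupil) and all variable names are assumptions for the example.

    # Baseline: mean pupil size in the 500 ms before stimulus onset, z-scored
    # per participant; task-evoked: raw (non-transformed) baseline subtracted
    library(dplyr)

    baselines <- trials %>%
      filter(time >= -500, time < 0) %>%
      group_by(subject, trial) %>%
      summarise(baseline = mean(pupil, na.rm = TRUE), .groups = "drop") %>%
      group_by(subject) %>%
      mutate(baseline_z = as.numeric(scale(baseline))) %>%
      ungroup()

    trials <- trials %>%
      left_join(baselines, by = c("subject", "trial")) %>%
      mutate(evoked = pupil - baseline)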

Behavioral measures

We measured task performance in the SART on accuracy, response time, and variability in response time (RTCV). Accuracy was expressed as a binomial dependent variable, coding correct responses as ‘1’ (i.e., a button press on go trials and no response on no-go trials) and incorrect responses as ‘0’ (i.e., no response on go trials and a button press on no-go trials). Response time was measured in milliseconds, but log-transformed to account for right-skewness in the distribution of these measurements. RTCV was calculated by dividing the standard deviation of the response times in the four go trials preceding a no-go trial (see Fig 3, top) by the mean response time in those trials. Similar to the response times, the RTCV values were log-transformed prior to analysis.
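In code, the RTCV for one no-go trial reduces to a one-liner; go_rts below is a hypothetical vector holding the response times (in ms) of the four preceding go trials.

    # Coefficient of variation of RT: SD divided by mean, then log-transformed
    rtcv <- sd(go_rts) / mean(go_rts)
    log_rtcv <- log(rtcv)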

Fig 4. Influence of concern manipulation on frequency of different attentional states.


Report counts were derived from the first thought probe question. This question had six answer options. From left to right on the x-axis, on-task refers to the answer option for task focus, TRI to task-related interference, SGT (i.e., self-generated thought) to answers on mind wandering and thoughts on personal concerns, and other to answers on external distraction and inalertness. Error bars reflect one standard error of the subject mean.

Alongside these task performance measures, we included variables for the current concerns condition, attentional state (i.e., on-task, self-generated thought, task-related interference, etc.), and stickiness level (i.e., very non-sticky, non-sticky, neutral, sticky, very sticky) in the analysis. Stickiness level was used both as an (ordered) categorical dependent variable and as a predictor, depending on the analysis. Current concerns condition and attentional state were only included as categorical predictors. The current concerns condition predictor indicated whether a go/no-go trial was preceded by a triplet of personal or other concerns within a window of eight trials, or by no concern-related trials at all (see Fig 3). The attentional state and stickiness level predictors indicated the answers to the first and second thought probe question, respectively.

We noticed that some answer options on both thought probe questions had very few observations (see Tables 1 and 2). To increase the number of observations per answer option, and thereby increase statistical power, we decided to combine the answer options into larger categories (a minimal sketch of this recoding follows Tables 1 and 2). For attentional state, we combined the answer option for thoughts about current concerns (option 3) with mind wandering (option 5) into the larger category of self-generated thought, justified by the idea that thoughts about concerns are a special case of mind wandering. We also combined the options for external distraction (option 4) and inalertness (option 6) into the other category. This resulted in the following levels: on-task, task-related interference, self-generated thought, and other. For stickiness level, we decided to group the first two answer options into a sticky category, and the last two into non-sticky. We refer to the third (intermediate) answer option as neutral.

Table 1. Distribution of responses to attentional state question.

Average number of responses (out of N = 48) to each answer option on the attentional state question per subject. Relative frequencies, expressed in percentages, are presented in the third column.

Answer option               Frequency (out of N = 48 per subject)   Percentage
On-task                     22.3                                     46.5
Task-related interference   10.5                                     21.8
Current concerns             4.9                                     10.1
External distraction         5.2                                     10.8
Mind wandering               3.9                                      8.0
Inalertness                  1.3                                      2.3

Table 2. Distribution of responses to stickiness question.

Average number of responses (out of N = 48) to each answer option on the stickiness question per subject. Relative frequencies, expressed in percentages, are presented in the third column.

Answer option     Frequency (out of N = 48 per subject)   Percentage
Very sticky        3.4                                      7.4
Sticky            12.4                                     25.9
Neutral           19.8                                     41.1
Non-sticky         7.9                                     16.5
Very non-sticky    4.5                                      9.4
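For illustration, a minimal R sketch of the recoding described above is given below; the data frame probes, the factor levels, and the variable names are illustrative stand-ins for the actual coding scheme.

    # Collapse six attentional states into four, and five stickiness levels
    # into three (sticky_raw: 1 = very sticky ... 5 = very non-sticky)
    library(dplyr)

    probes <- probes %>%
      mutate(state4 = recode(state,
                             "current_concerns"     = "self_generated",
                             "mind_wandering"       = "self_generated",
                             "external_distraction" = "other",
                             "inalertness"          = "other"),
             sticky3 = case_when(sticky_raw <= 2 ~ "sticky",
                                 sticky_raw == 3 ~ "neutral",
                                 TRUE            ~ "non-sticky"))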

Statistical analysis

We investigated the thought probe reports by computing a count for each answer option to both questions for each participant. The resulting answer frequencies were analyzed using generalized linear models assuming a Poisson distribution. We used linear mixed-effects modeling (LME) in the remaining analyses, except when ‘time’ (i.e., time-in-trial, time-on-task) was considered as a predictor. We assumed a Gaussian distribution for fitting response time, RTCV, and baseline pupil size. A binomial distribution was assumed for accuracy measurements. We fitted an ordered categorical LME for predicting stickiness level.
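For concreteness, the fragment below sketches one model of each family in lme4 syntax, under assumed data frame and variable names; the full model specifications are available in the online analysis code.

    # Poisson GLM for per-participant answer-option counts
    library(lme4)
    m_counts <- glm(n_reports ~ answer_option, family = poisson, data = counts)

    # Binomial LME for no-go accuracy; Gaussian LME for log-transformed RT
    m_acc <- glmer(accuracy ~ state + (1 | subject), family = binomial, data = trials)
    m_rt  <- lmer(log(rt) ~ state + (1 | subject), data = trials)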

When fitting LMEs, it is important to determine a good random effects structure. Including too few random effects makes the model potentially over-confident, resulting in more Type I errors [57]. Including too many random effects lowers statistical power [58,59]. To balance Type I error and statistical power, we determined the random effects structure of each model using a chi-square log-likelihood-based backwards model fitting procedure. With this procedure, we removed one term from the random effects structure at every step, starting from the most complex model. We kept the simpler model if the more complex model did not explain significantly more variance. Random effects that correlated strongly (r > 0.5) with one or more other random effects were always removed. Models that did not converge or produced a singular fit were not considered. In such cases, we continued the procedure of leaving one term out at every step. We considered trial number, block number, and participant number as random intercepts. Current concerns condition, attentional state, and stickiness level were considered to have random slopes whenever they were included as a fixed effect in the model.
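One step of this procedure might look as follows (variable names again illustrative): fit the model with and without a random slope using maximum likelihood, and keep the simpler model unless the likelihood-ratio test favors the more complex one.

    library(lme4)

    m_full    <- lmer(log(rt) ~ state + (1 + state | subject), data = trials, REML = FALSE)
    m_reduced <- lmer(log(rt) ~ state + (1 | subject), data = trials, REML = FALSE)
    anova(m_reduced, m_full)  # chi-square log-likelihood ratio test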

Statistical significance of individual predictors in the fitted LMEs was determined using chi-square log-likelihood ratio tests, testing the model including the predictor against an intercept-only model. Interactions were tested by comparing a model with the interaction against a model with only the main effects. Predictors in the LMEs were categorical. Consequently, the test statistics only reflect comparisons to a reference group of the categorical predictor(s). The reference group for attentional state was ‘on-task’, for stickiness of thought ‘neutral’, and for current concerns the ‘no concern’ condition. Regression estimates (i.e., intercepts and slopes) of individual LMEs were transformed back to the original scale to enhance interpretation. For Gaussian LMEs we did not determine p-values, but instead report t-statistics to indicate statistical significance (|t| ≥ 2).

We conducted timeseries analyses (e.g., for task-evoked pupil responses, and for time-on-task effects on attentional state, stickiness, and baseline pupil size) using a nonlinear regression technique called generalized additive mixed modeling (GAMM; [60,61]). Unlike related research using summarizing measures such as the mean pupil size after stimulus onset [40], the slope of pupil size [39,40,62], or the maximum pupil size in a specified time window [33], GAMM allows modeling of full time courses. The difference from linear regression is that the slope estimates are replaced by smooth functions that describe how a timeseries measure such as task-evoked pupil size changes over time. When a categorical predictor is added to the GAMM, the model will fit a different smooth function for every level of this predictor. These smooth functions can subsequently be visualized to examine the development of the statistical effects over time. GAMM also allows for including nonlinear random effects called random smooths. In essence, random smooths estimate random effects coefficients for the intercept as well as for how the slope of a timeseries changes over time. In our analyses, we used a random smooth for events, reflecting the individual time course of each trial and participant. Alongside the random smooth for events, we also included a nonlinear interaction between the x and y gaze position in each GAMM, to account for influences of gaze position on pupil size [63,64]. An issue with modeling task-evoked responses in pupil size is that the residuals of the model are not normally distributed. To account for this non-normality, we fitted all GAMMs for task-evoked pupil responses (except one) assuming a scaled-t distribution [65]. Only the GAMM estimating the influence of stickiness on go-trial evoked responses was fitted as a Gaussian model, since the model did not converge when assuming a scaled-t distribution. Another common issue is that pupil size recordings are highly correlated over time, violating the method’s assumption that residuals are independent. Violation of this assumption may cause a GAMM to underestimate the size of the standard errors. We accounted for autocorrelation by including an autoregressive AR(1) error model within each GAMM [66]. For an excellent tutorial on how to use GAMMs for pupil size analysis, we refer to van Rij and colleagues [67].
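The structure of such a GAMM can be sketched in mgcv syntax as below. This is a sketch under assumptions, not the authors’ exact specification: all variable names are illustrative, rho_est stands for an AR(1) coefficient estimated from a model first fitted without the error model, and the Gaussian variant is shown (the scaled-t family is available in mgcv as scat()).

    library(mgcv)

    m_evoked <- bam(evoked ~ state
                    + s(time, by = state)               # one smooth per attentional state
                    + s(time, event, bs = "fs", m = 1)  # random smooth per trial/participant
                    + te(gaze_x, gaze_y),               # nonlinear gaze-position interaction
                    rho = rho_est,                      # AR(1) autocorrelation parameter
                    AR.start = trials$start_event,      # TRUE at each trial's first sample
                    data = trials)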

From the fitted GAMMs, we could determine whether estimated smooth terms were statistically significant. In other words, we could determine whether there were significant (nonlinear) changes in the value of a dependent variable (such as task-evoked pupil size) along the time course of a trial for different attentional states and stickiness levels. We checked for significant differences between two timeseries by determining a difference curve based on the estimated smooth terms. Two timeseries were considered to be significantly different at some point in time when the pointwise 95% confidence interval around the estimated difference curve did not include zero (given that zero indicates the absence of a difference).
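With itsadug, such a difference curve can be obtained directly from a fitted GAMM; the comparison below (sticky versus neutral) uses the illustrative model and variable names from the sketch above.

    library(itsadug)

    # Plot the estimated difference curve with a pointwise 95% CI; the function
    # marks the time window(s) in which the CI excludes zero
    plot_diff(m_evoked, view = "time",
              comp = list(state = c("sticky", "neutral")),
              rm.ranef = TRUE)  # exclude random smooths from the prediction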

Preprocessing and data analysis were performed in R. We used the lme4 package to fit the Gaussian and binomial LMEs (version 1.1-19; [68]). GAMMs and ordered categorical LMEs were fitted using the mgcv package (version 1.8-28; [69]). Model estimates and diagnostics for GAMMs were visualized with the itsadug package (version 2.3; [70]). The data and the analysis code for preprocessing and model fitting are available online at: https://osf.io/m6ujg/.

Results

Thought reports

First, we analyzed the reports collected from the two thought probe questions. We assessed whether embedding participants’ personal concerns influenced their tendency to engage in sticky or off-task thinking. Next, we examined whether time-on-task influenced attentional state (answer to the question “what were you just thinking about?”) and stickiness of thought (answer to the question “how difficult was it to disengage from the thought?”). Finally, we investigated the relationship between attentional state and the experienced stickiness of thought.

Current concerns manipulation

Fig 4 shows the effect of the current concerns manipulation. Following no concern triplets, we observed that on-task reports were most frequent (M = 7.97 reports out of N = 48 total; SD = 3.34), followed by reports of task-related interference (M = 4.00; SD = 2.67), self-generated thought (M = 2.06; SD = 1.76), and other reports (M = 1.97; SD = 1.85). The number of self-generated thought reports increased after a personal concern triplet relative to a no concern triplet (Mdiff = + 1.59). This average increase in self-generated thought reports was significant (β = + 1.59 (in count), z = -2.12, p < .001). In addition, we found that concerns from another participant also increased the number of self-generated thought reports (Mdiff = + 0.94; β = + 0.94 (in count), z = -2.73, p = .03). However, the mean increase in the frequency of self-generated thought following such “other concerns” was smaller than that following personal concerns (Mdiff = - 0.65; β = - 0.65 (in count), z = -2.52, p = .009). Therefore, while personal concerns and other concerns were both found to increase self-generated thinking, personal concerns were more potent.

With respect to the stickiness of thought, we found that participants most frequently reported their thought as neutral following no concern triplets (M = 6.71 (out of 48 total reports); SD = 4.48), followed by sticky (M = 5.06; SD = 3.85), and non-sticky (M = 4.24; SD = 4.36). In contrast to what we found for self-generated thought, we did not find support for an increase in stickiness of thought following personal (or other) concerns (χ2(2) = 3.82, p = .15). Therefore, it is unclear whether processing current concerns in the SART could increase the stickiness of thought.

Time-on-task influence

Fig 5 shows how attentional state (e.g., on-task, self-generated thought, etc.) changed over the course of the experiment. The right panel shows the estimates from the fitted GAMM. Our results indicated that only the smooth terms for on-task and self-generated thought were significant (on-task: F = 8.00, p = .005; self-generated thought: F = 10.66, p = .001; task-related interference: F = 2.90, p = .09; other: F = 1.33, p = .31). Therefore, only for on-task and self-generated thought can we conclude that the number of reports changed over time. As shown in Fig 5, on-task thought decreased while self-generated thought increased as the task progressed.

Fig 5. Observed and estimated effect of time-on-task on attentional state.


The left plot presents the observed data and the right plot presents the estimates from the best-fitting GAMM. The GAMM explains 27% of the variance in the thought probe data, calculated as the squared correlation between the observed and the model-predicted data. Error bars in the right plot reflect estimated pointwise 95% confidence intervals.

We then asked how stickiness of thought changed over the course of the experiment. We fitted an ordered-categorical GAMM to test how time-on-task influenced the likelihood of reporting having neutral, sticky, or non-sticky thoughts. For this analysis we included the reported answer options as an ordinal dependent variable (1 being non-sticky, 2 neutral, and 3 being sticky). Block number was included as continuous predictor reflecting time-on-task. The results showed that the smooth term for block number was significant (χ2 = 12.11, p < .001), indicating that the reported level of stickiness changed over the course of the experiment. To inspect how the likelihood of reporting the different levels of stickiness changed over time, we obtained the predicted probability estimates from the model and plotted these in Fig 6 (right). With increasing time-on-task, we found that reports of neutral thought remained relatively constant. Furthermore, neutral thought was most prevalent in general. At the same time, we found that the probability of sticky thought increased with time-on-task, while it decreased for non-sticky thought. Together, this indicates that thought became more sticky as the task progressed.

Fig 6. Observed and estimated effect of time-on-task on stickiness level.


The left plot presents the observed data and the right plot the estimates from the best-fitting GAMM. The estimated probabilities in the right plot were derived by fitting an ordered-categorical GAMM. Error bars in the right plot reflect estimated pointwise 95% confidence intervals.

Relationship between attentional state and stickiness level

We then examined whether the level of stickiness depended on whether a participant was focused on the task, mind wandering, or otherwise distracted. As shown in Fig 7, the reported stickiness of on-task thought differed strongly from that of the distracted attentional states. The majority of non-sticky (M = 0.58) and neutral thought reports (M = 0.61) were associated with on-task thought. On the other hand, reports of sticky thought were relatively more frequent in the distracted states. To test whether distracted states were experienced as stickier, we fitted an ordered categorical (ordinal) LME predicting stickiness level by attentional state. The model indicated that all off-task states were reported as stickier than on-task (on-task: intercept β = -0.40 (transformed), t = -1.79; self-generated thought: β = + 1.53 (transformed), t = 9.85; task-related interference: β = + 1.54 (transformed), t = 11.07; other: β = + 1.77 (transformed), t = 10.11).

Fig 7. Relationship between stickiness of thought and attentional state.


Report counts were derived from the first (attentional state) and second thought probe question (stickiness of current attentional state). Error bars reflect one standard error of the subject mean.

Task performance

We analyzed task performance to examine how attentional state and stickiness of thought were reflected in performance on go and no-go trials. Overall, we found that participants were 56.62% accurate (SD = 49.57%) on no-go trials. The mean response time on go trials was 375.51 ms (SD = 94.14 ms), with a mean coefficient of variation (RTCV) of 0.14 (SD = 0.12). As expected, we found that all ‘distracted’ attentional states were associated with a lower accuracy on no-go trials compared to on-task (χ2(3) = 216.08, p < .001; on-task: intercept β = 0.80, z = 5.92, p < .001; self-generated thought: β = - 0.36, z = -9.47, p < .001; task-related interference: β = - 0., z = -11.54, p < .001; other: β = - 0.42, z = -10.08, p < .001). No significant influence of attentional state was found on go response time (χ2(3) = 4.88, p = .18), nor on RTCV (χ2(3) = 5.43, p = .14). For stickiness of thought (see Fig 8), the results showed neither a significant influence of stickiness on response time (χ2(2) = 2.37, p = .31), nor on RTCV (χ2(2) = 2.68, p = .26). On the other hand, we did find a significant step-wise decrease in no-go accuracy from non-sticky to sticky thought. Compared to neutral stickiness, participants were 20% more accurate when current thinking was non-sticky (β = + 0.20, z = 5.96, p < .001), and 29% less accurate when current thinking was stickier than neutral (β = - 0.29, z = -8.12, p < .001). When attentional state was added to the LME model as an additional categorical factor alongside stickiness, we found that stickiness remained a significant predictor of no-go accuracy (χ2(2) = 82.99, p < .001), but not of RT (χ2(2) = 1.14, p = .57) or RTCV (χ2(2) = 0.83, p = .66). This suggests that stickiness exerts a unique influence on no-go accuracy on top of attentional state. The model predicted that participants were 23% more accurate when self-generated thinking was non-sticky (β = + 0.23, z = 5.35, p < .001) compared to neutral (intercept β = 0.48, z = -0.25, p = .80). Participants were 17% less accurate when self-generated thought was reported as sticky (β = - 0.17, z = -4.63, p < .001).

Fig 8. Influence of stickiness on task performance.


Mean no-go accuracy (left), go response time (center), and go RTCV (right) for each level on the stickiness dimension. Error bars reflect one standard error of the subject mean.

Baseline pupil size

The behavioral results indicated that the frequency of the different attentional states and of sticky thought changed over the course of the experiment. Therefore, we needed to take time-on-task into account when assessing how attentional state and stickiness are reflected in baseline pupil size. Fig 9 (top panel) shows the baseline pupil size across blocks for each attentional state in the data (left) and as predicted by a GAMM (right). The data and the model demonstrated that baseline pupil size became smaller as the task progressed. At the same time, we failed to find consistent differences in baseline pupil size between the attentional states. Therefore, we cannot conclude that baseline pupil size was predictive of experiencing a specific attentional state. As shown in Fig 9 (bottom panel) and assessed with a GAMM, we also failed to find consistent differences in baseline pupil size between the different stickiness levels.

Fig 9. Observed and estimated baseline pupil size for different attentional states and stickiness levels.


The left plots show the observed baseline pupil size across blocks for different attentional states (top panel) and stickiness levels (bottom panel). The right plots show the estimated baseline pupil size from the best-fitting GAMM. Error bars in the right plots reflect the estimated 95% confidence intervals.

Task-evoked response in pupil size

We assessed the task-evoked response in pupil size for each attentional state and stickiness level separately for go and no-go trials. For all following analyses, we only considered correct trials. We present the grand averages of the task-evoked pupil responses along with the estimates of a fitted GAMM model in Figs 10 and 11, for go and no-go trials respectively.

Fig 10. Task-evoked response in pupil size aligned to go stimulus onset (t = 0).


The left plots show the average evoked response for each attentional state (top) and stickiness level (bottom) as observed in the data. The right plots show the estimates of the best-fitting GAMM models. We checked for significant differences between two evoked responses by determining a difference curve based on the estimated evoked responses. Two evoked responses were considered to be significantly different at a particular point in time when the pointwise 95% confidence interval around the estimated difference curve did not include zero. We indicated a significant difference between two evoked responses with a colored bar.

Fig 11. Task-evoked response in pupil size aligned to no-go stimulus onset (t = 0).


The left plots show the average evoked response for each attentional state (top) and stickiness level (bottom) as observed in the data. The right plots show the estimates of the best-fitting GAMM models. Two evoked responses were considered to be significantly different at a particular point in time when the pointwise 95% confidence interval around the estimated difference curve did not include zero. Significant differences between two curves are indicated with colored bars in the plot.

Go trials

What is noticeable from the task-evoked pupil responses on go trials is that there appear to be two peaks in the pupil response. The first peak occurs at around 700 ms, followed by a second peak at approximately 1200 ms. Although it is difficult to determine what is precisely reflected in these two peaks, it is reasonable to assume that the first peak reflects the amount of attention allocated to the (visual) processing of the stimulus, while the second peak may reflect processing related to the response and/or processing of the mask or fixation cross. Our results showed that the evoked response in pupil size was smaller at the first peak, but not at the second peak, when participants were engaged in self-generated thought (t = [434–788 ms]) or other distractions (t = [333–939 ms]) compared to being on-task. For task-related interference we found no significant difference in the task-evoked pupil response from on-task.

With respect to the stickiness of thought, we found that the task-evoked response in pupil size was smaller when participants experienced sticky thought compared to neutral thought (t = [283–1419 ms]), as well as non-sticky thought (t = [611–737; 965–1167 ms]). However, the task-evoked response during non-sticky thought was not found to differ from the response during neutral thought.

No-go trials

The evoked response in pupil size on no-go trials was likewise characterized by two peaks, occurring at around the same time points as on go trials. Self-generated thought was found to be associated with a substantially smaller response in pupil size compared to on-task for the majority of the response (t = [384–2000 ms]). For the other distracted states, we found no significant differences in pupil size compared to being on-task.

When participants reported having sticky thoughts, we found that the task-evoked response in pupil size was significantly smaller compared to neutral thought (t = [510–864; 914–1672 ms]). Also for non-sticky thoughts we found that the evoked response in pupil size was smaller compared to neutral thought, but this difference only reached significance at the second peak (t = [1192–1823 ms]). The difference in evoked response between sticky and non-sticky thoughts was not found to be significant at any timepoint.

Discussion

The goal of this research was to explore the “stickiness” dimension of ongoing thought, which reflects a participant’s experienced difficulty in disengaging from thought ([2]; see also [3]). We investigated how self-reported stickiness was associated with the participant’s attentional state, how it influenced task performance, and how it influenced pupil size. We adopted a variation of the sustained attention to response task (SART), which has been shown to be sensitive to lapses of attention [71–73]. Personal concerns of the participants were embedded in the SART to potentially increase the probability of observing sticky thought [16,50].

Correlates and insights for the stickiness dimension of thought

We found that when participants reported having sticky thoughts, they also frequently reported being disengaged from the task (see also [2]). Conversely, non-sticky thought (i.e., easy to disengage from) and neutral thought (i.e., neither hard nor easy to disengage from) were mostly associated with being focused on the task. Therefore, the results of the present experiment demonstrated that, at least in the context of sustained attention, off-task thought is frequently experienced as difficult to disengage from, whereas task-focused thought is experienced as easy to withdraw from.

On go trials, we found that reports of sticky thought could be discriminated from neutral or non-sticky thought in task-evoked pupil dilation, but not in behavioral indices. In contrast to earlier studies (see e.g., [2,31]), this research did not demonstrate faster response times (RT) and higher variance in response times (RTCV) on go trials when participants engaged in sticky, off-task thinking. The absence of this effect was not an issue of power: calculating Bayes factors separately for RT and RTCV demonstrated that the present study provides strong evidence for similar RT (BF01 = 37.3) and RTCV (BF01 = 26.2) across different degrees of sticky thinking. An explanation for the present results may lie in the relationship between RTCV and the degree to which participants were disengaged from the task (see [73]). Increases in RTCV have been associated with a state of “tuning out” (see [74]), in which attention is partially allocated away from the task while awareness of the general task context remains. The transient disengagement from the task during tuning out results in alternating slowing and speeding of response times, which could lead to higher RTCV. In this experiment, participants were likely to be more strongly disengaged from the task during sticky thoughts–a state of “zoning out” [74]. According to Cheyne et al. [73], zoning out is associated with reactive and automatic responding to the task. It could be that the response time patterns resulting from automatic responding are not (measurably) different from responding during task focus.
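The manuscript does not state how these Bayes factors were computed. One easily reproduced route–shown here purely as an illustrative sketch under assumed data, not as the authors' actual method–is the BIC approximation of the Bayes factor (Wagenmakers, 2007) applied to a pair of lme4 [68] models; the data frame probe_data and its columns are hypothetical.

```r
library(lme4)  # linear mixed-effects models [68]

# probe_data: hypothetical data frame, one row per thought probe
#   rt         = mean response time over the go trials preceding the probe
#   stickiness = factor: "neutral", "non-sticky", "sticky"
#   subject    = participant identifier (factor)
m_null <- lmer(rt ~ 1 + (1 | subject), data = probe_data, REML = FALSE)
m_alt  <- lmer(rt ~ stickiness + (1 | subject), data = probe_data, REML = FALSE)

# BIC approximation to the Bayes factor (Wagenmakers, 2007):
# values > 1 quantify evidence for the null model of no stickiness effect
bf01 <- exp((BIC(m_alt) - BIC(m_null)) / 2)
bf01
```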

While behavioral indices were similar, task-evoked pupil responses did differ. We observed a smaller task-evoked response in pupil size on go trials during episodes of sticky thought, suggesting that less attention was allocated to task processing than during episodes of neutral or non-sticky thought. Hence, sticky thought can be detected by looking for signs of perceptual decoupling in task-evoked pupil dilation, even at a time when behavior does not appear to suffer.

While behavioral indices could not distinguish between sticky and non-sticky thought on go trials, accuracy on no-go trials did discriminate between different levels of stickiness. Participants demonstrated higher no-go accuracy (i.e., more often withheld a response) when they reported having non-sticky thought compared to neutral thought, but performed severely worse when they experienced sticky thought. In addition to the performance decrement with sticky thinking, we observed that task-evoked pupil responses were smaller on correct no-go trials. Together with the smaller evoked response on go trials with sticky thought, this provides further evidence that sticky thought limits attention and the exertion of cognitive control to external task processing (even when the response ends up being correct). In contrast, we could not discriminate non-sticky from neutral thought in task-evoked pupil dilation. Therefore, it is unclear whether the cognitive processing leading to accurate performance differed between non-sticky and neutral thought, even though average accuracy did differ. We argue that this indicates that participants could not reliably classify their thought as non-sticky or neutral. Instead, non-sticky reports may have been motivated by accuracy on the preceding no-go trial, explaining the better performance with non-sticky reports. Reports of sticky thought were likely not, or at least less, affected by no-go performance, since task-evoked pupil dilation was affected even on correct trials.
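The no-go accuracy comparison discussed above is the kind of effect that can be tested with a logistic mixed-effects model. The sketch below is one plausible specification under assumed data (the data frame nogo_trials and its columns are hypothetical), not the authors' reported analysis.

```r
library(lme4)

# nogo_trials: hypothetical data frame, one row per no-go trial
#   correct    = 1 if the response was successfully withheld, 0 otherwise
#   stickiness = thought type reported at the following probe (factor)
#   subject    = participant identifier (factor)
m_nogo <- glmer(correct ~ stickiness + (1 | subject),
                data = nogo_trials, family = binomial)
summary(m_nogo)  # log-odds differences in accuracy between stickiness levels
```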

The differences in no-go accuracy between sticky and neutral/non-sticky thought may provide some insight into how cognitive processing differs between these modes of thought. According to the literature on the SART, deliberate control is beneficial for performance on no-go trials. Deliberate control can be employed to sustain attention to the task (e.g., [75]), but also to support a controlled response strategy [76–78]. Therefore, sticky thought may have been associated with lower performance than neutral/non-sticky thought because this mode of thought involved a lower level of deliberate control.

How deliberate control–as well as mechanisms outside of deliberate control–influences the stickiness of thought may be explained by the dynamic framework of spontaneous thought (see [10]). This framework posits that the flow of content and orientation of thought can be constrained either through cognitive control (referred to as ‘deliberate constraints’) or through sensory and affective salience (referred to as ‘automatic constraints’). Deliberate constraints result in a neutral/non-sticky experience, because there is volitional control over the content and orientation of thought. On the other hand, strong automatic constraints (together with weak deliberate constraints) result in high stability of thought, which may make the thoughts difficult to disengage from.

The proposed relationship between the relative contribution of automatic and deliberate constraints and the stickiness of thought is supported by existing neuroimaging studies. Individuals with depression–a disorder marked by negatively valenced sticky thought [7,8]–have been shown to have greater activation of the default network when engaged in experimental tasks compared to controls [79,80]. The default network is proposed to support spontaneous thought [81–83]. The increased default network activity is, furthermore, accompanied by greater activation of (emotional) salience networks, while areas associated with cognitive control show reduced activation [84]. These results are therefore consistent with the idea that sticky thought is mostly constrained through salience, and less so through deliberate control.

Implications for future studies

The present study may have practical implications for future studies. To our knowledge, this study presents the first evidence that the influence of the stickiness of thought on task processing can be investigated in pupillary measures shortly prior to self-report. This opens up opportunities for research on related modes of thinking such as perseverative cognition. Research on perseverative cognition has so far primarily used questionnaires to measure participants’ general tendency to engage in rumination or worry [3,7,22], and/or to retrospectively measure whether a participant engaged (at some point during the experiment) in such thought [20]. Arguably, this method is relatively imprecise. Embedding thought probes in a task, combined with continuous pupil size measurement, could allow researchers to investigate more precisely how rumination and worry influence cognitive processing in a task.

Triangulating between thought probe reports, behavior, and task-evoked pupil dilation demonstrated that episodes of sticky thought involve reduced attention towards the ongoing task. This reduced attention may point to perceptual decoupling. While the present experiment was not designed to investigate perceptual decoupling, follow-up research may further investigate its role in sticky thought with concurrent measures. For example, future research may consider repeating the present study with EEG to examine whether sticky thinking has different neural correlates from non-sticky task-unrelated thought. Research on self-generated thought has indicated that episodes of off-task thinking reduce early task-evoked EEG components associated with visual processing (i.e., P1, N1; [85,86]), as well as later components such as the P3 (e.g., [14]). When used to study the influence of sticky thinking, this may help clarify to what extent perceptual decoupling modulates sensory processing (i.e., the magnitude of the P1 and N1 components) and/or later cognitive processing (the P3 component) during episodes of sticky thought.

While the stickiness of thought was associated with the magnitude of task-evoked pupil dilation, it remains unclear how it affects baseline pupil size. In fact, the present study suggests that there may be no direct relationship between experienced stickiness and baseline pupil size at all, but rather that the apparent relationship is mediated by time-on-task influences. In line with other SART studies, we found that baseline pupil size declined over time ([27,72]; see also [87,88]). At the same time, the frequency of sticky thought increased as the task progressed. Consequently, one might easily arrive at the false conclusion that sticky thought is associated with a smaller baseline pupil size when time-on-task influences are not considered. As demonstrated in Fig 9, there were in fact no consistent differences in baseline pupil size between the types of thought that were reported once time-on-task influences were taken into account. This mediating role of time-on-task in the relationship between baseline pupil size and experienced thought has potential implications for existing research that has looked for correlates of different kinds of ongoing thought in baseline pupil size.
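Concretely, the confound can be avoided by estimating thought-type effects on baseline pupil size while modeling the time-on-task decline explicitly, for instance in a GAMM. The sketch below illustrates the idea under assumed data (the data frame baseline and its columns are hypothetical); it is not the authors' reported model.

```r
library(mgcv)

# baseline: hypothetical data frame, one row per thought probe
#   base_pupil = pre-trial baseline pupil size
#   trial      = trial index, a proxy for time-on-task
#   thought    = thought type reported at the probe (factor)
#   subject    = participant identifier (factor)
m_base <- bam(base_pupil ~ thought
              + s(trial)                              # shared time-on-task decline
              + s(trial, subject, bs = "fs", m = 1),  # per-participant trajectories
              data = baseline, discrete = TRUE)
summary(m_base)  # thought effects estimated over and above time-on-task
```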

Finally, our research demonstrated that exposure to personal concerns, indeed, increased the tendency to engage in self-generated thought. Concerns of other participants also increased the tendency for self-generated thinking, but the increase after exposure to personal concerns was significantly larger. This stands in contrast to previous research using a similar concern manipulation [2,50]. These studies did demonstrate an increase following the processing of concerns, but were not successful in finding a more potent effect of personal concerns compared to other concerns. A possible explanation is that in this study we carefully selected the personal and other concerns to make sure that they were unique; in other words, we ensured that the other concern did not overlap with any personal concerns. In the previous reports on this task, we found no mention of a similar procedure. Hence, the discrepancy in findings may potentially be explained by the degree of overlap between the self and other concern conditions.

While exposure to personal concerns was found to locally increase self-generated thinking, it did not affect the stickiness of thought. This may indicate that a sticky mode of thinking does not reliably result from processing personal concerns in a healthy population. Research has shown that people with a strong tendency to ruminate–a particular form of negative sticky thinking–have an attentional bias towards information that describes relevant, but negative, aspects of themselves (see e.g., [21,22]). Participants in our experiment potentially did not have a strong attentional bias towards their personal concerns. It may be interesting for future research to investigate whether individuals with depression and/or high trait rumination do engage more in sticky thought after exposure to their personal concerns.

Conclusions

To conclude, the present study found that sticky thinking is frequently experienced when we are (temporarily) disengaged from our ongoing task. Furthermore, sticky thinking was associated with a decreased ability to withhold a response to infrequent targets (no-go stimuli) and with smaller responses in pupil size to task events. These results demonstrate, first of all, that individuals can report on the stickiness of their thought and that the experience can be traced in task-evoked pupil dilation. Secondly, the results indicate that attention is drawn away from the task when experiencing sticky thought. The observed attentional decoupling may result from reduced deliberate constraints on thought in combination with increased automatic constraints, producing the subjective experience of sticky thinking. Future research should investigate these claims more directly.

Acknowledgments

We would like to thank Dr. Jacolien C. van Rij for her helpful suggestions and comments on our GAMM analysis.

Data Availability

The data, materials, and analysis code are publicly available online at the Open Science Framework (link to project: https://osf.io/m6ujg/).

Funding Statement

This research was supported by a grant from the European Research Council (MULTITASK - 283597; https://erc.europa.eu/) awarded to N.A. Taatgen. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.Smallwood J, Schooler JW. The Science of Mind Wandering: Empirically Navigating the Stream of Consciousness. Annu Rev Psychol. 2015;66(1):487–518. 10.1146/annurev-psych-010814-015331 [DOI] [PubMed] [Google Scholar]
  • 2.van Vugt MK, Broers N. Self-reported stickiness of mind-wandering affects task performance. Front Psychol. 2016;7(MAY):1–8. 10.3389/fpsyg.2016.00732 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Joormann J, Levens SM, Gotlib IH. Sticky thoughts: Depression and rumination are associated with difficulties manipulating emotional material in working memory. Psychol Sci. 2011;22(8):979–83. 10.1177/0956797611415539 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Nolen-Hoeksema S, Morrow J. A prospective study of depression and distress following a natural disaster: The 1989 Loma Prieta earthquake. J Pers Soc Psychol. 1991;61(1):105–21. 10.1037//0022-3514.61.1.105 [DOI] [PubMed] [Google Scholar]
  • 5.Kavanagh DJ, Andrade J, May J. Imaginary Relish and Exquisite Torture: The Elaborated Intrusion Theory of Desire. Psychol Rev. 2005;112(2):446–67. 10.1037/0033-295X.112.2.446 [DOI] [PubMed] [Google Scholar]
  • 6.Papies E, Stroebe W, Aarts H. Pleasure in the mind: Restrained eating and spontaneous hedonic thoughts about food. J Exp Soc Psychol. 2007;43(5):810–7. [Google Scholar]
  • 7.Nolen-Hoeksema S, Wisco BE, Lyubomirsky S. Rethinking Rumination. Perspect Psychol Sci [Internet]. 2008;3(5):400–24. Available from: http://pps.sagepub.com/lookup/doi/10.1111/j.1745-6924.2008.00088.x. [DOI] [PubMed] [Google Scholar]
  • 8.Nolen-Hoeksema S. The role of rumination in depressive disorders and mixed anxiety/depressive symptoms. J Abnorm Psychol. 2000;109(3):504–11. [PubMed] [Google Scholar]
  • 9.Barlow DH. Anxiety and Its Disorders: The Nature and Treatment of Anxiety and Panic. 2nd ed New York, NY: Guilford Press; 2002. [Google Scholar]
  • 10.Christoff K, Irving ZC, Fox KCR, Spreng NR, Andrews-Hanna JR. Mind-wandering as spontaneous thought: a dynamic framework. Nat Rev Neurosci [Internet]. 2016;1–44. Available from: 10.1038/nrn.2015.17 [DOI] [PubMed] [Google Scholar]
  • 11.Mills C, Raffaelli Q, Irving ZC, Stan D, Christoff K. Is an off-task mind a freely-moving mind? Examining the relationship between different dimensions of thought. Conscious Cogn. 2018;58(May 2017):20–33. 10.1016/j.concog.2017.10.003 [DOI] [PubMed] [Google Scholar]
  • 12.Mooneyham BW, Schooler JW. The costs and benefits of mind-wandering: A review. Can J Exp Psychol Can Psychol expérimentale. 2013;67(1):11–8. 10.1037/a0031569 [DOI] [PubMed] [Google Scholar]
  • 13.Marchetti I, Koster EHW, Klinger E, Alloy LB. Spontaneous thought and vulnerability to mood disorders: The dark side of the wandering mind. Clin Psychol Sci. 2016;4(5):835–57. 10.1177/2167702615622383 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Kam JWY, Dao E, Farley J, Fitzpatrick K, Smallwood J, Schooler JW, et al. Slow Fluctuations in Attentional Control of Sensory Cortex. J Cogn Neurosci. 2011;23(2):460–70. 10.1162/jocn.2010.21443 [DOI] [PubMed] [Google Scholar]
  • 15.Smallwood J, Brown KS, Tipper C, Giesbrecht B, Franklin MS, Mrazek MD, et al. Pupillometric evidence for the decoupling of attention from perceptual input during offline thought. PLoS One. 2011;6(3). 10.1371/journal.pone.0018298 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Klinger E. Goal commitments and the content of thoughts and dreams: Basic principles. Front Psychol. 2013;4(JUL):1–17. 10.3389/fpsyg.2013.00415 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Verkuil B, Brosschot J, Gebhardt W, Thayer J. When Worries Make You Sick: A Review of Perseverative Cognition, the Default Stress Response and Somatic Health. J Exp Psychopathol. 2010;1(1):87–118. [Google Scholar]
  • 18.Denson TF, Spanovic M, Miller N. Cognitive Appraisals and Emotions Predict Cortisol and Immune Responses: A Meta-Analysis of Acute Laboratory Social Stressors and Emotion Inductions. Psychol Bull. 2009;135(6):823–53. 10.1037/a0016909 [DOI] [PubMed] [Google Scholar]
  • 19.Ottaviani C, Shapiro D, Couyoumdjian A. Flexibility as the key for somatic health: From mind wandering to perseverative cognition. Biol Psychol [Internet]. 2013;94(1):38–43. Available from: 10.1016/j.biopsycho.2013.05.003 [DOI] [PubMed] [Google Scholar]
  • 20.Ottaviani C, Shapiro D, Fitzgerald L. Rumination in the laboratory: What happens when you go back to everyday life? Psychophysiology. 2011;48(4):453–61. 10.1111/j.1469-8986.2010.01122.x [DOI] [PubMed] [Google Scholar]
  • 21.Koster EHW, De Lissnyder E, De Raedt R. Rumination is characterized by valence-specific impairments in switching of attention. Acta Psychol (Amst) [Internet]. 2013;144(3):563–70. Available from: 10.1016/j.actpsy.2013.09.008 [DOI] [PubMed] [Google Scholar]
  • 22.Beckwé M, Deroost N. Attentional biases in ruminators and worriers. Psychol Res. 2016;80(6):952–62. 10.1007/s00426-015-0703-8 [DOI] [PubMed] [Google Scholar]
  • 23.Scollon CN, Kim-Prieto C, Diener E. Experience Sampling: Promises and Pitfalls, Strengths and Weaknesses. J Happiness Stud. 2003;4(1):5–34. [Google Scholar]
  • 24.Schwarz N. Retrospective and Concurrent Self-Reports: The Rationale for Real-Time Data Capture In: The science of real-time data capture: Self-reports in health research. New York, NY, US: Oxford University Press; 2007. p. 11–26. [Google Scholar]
  • 25.Kane MJ, Gross GM, Chun CA, Smeekens BA, Meier ME, Silvia PJ, et al. For Whom the Mind Wanders, and When, Varies Across Laboratory and Daily-Life Settings. Psychol Sci [Internet]. 2017;28(9):1271–89. Available from: http://journals.sagepub.com/doi/10.1177/0956797617706086. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Weinstein Y. Mind-wandering, how do I measure thee with probes? Let me count the ways. Behav Res Methods. 2018;50(2):642–61. 10.3758/s13428-017-0891-9 [DOI] [PubMed] [Google Scholar]
  • 27.Unsworth N, Robison MK. Pupillary correlates of lapses of sustained attention. Cogn Affect Behav Neurosci. 2016;16(4):601–15. 10.3758/s13415-016-0417-4 [DOI] [PubMed] [Google Scholar]
  • 28.Robertson IH, Manly T, Andrade J, Baddeley BT, Yiend J. “Oops!”: Performance correlates of everyday attentional failures in traumatic brain injured and normal subjects. Neuropsychologia. 1997;35(6):747–58. 10.1016/s0028-3932(97)00015-8 [DOI] [PubMed] [Google Scholar]
  • 29.Seli P, Cheyne JA, Smilek D. Wandering minds and wavering rhythms: Linking mind wandering and behavioral variability. J Exp Psychol Hum Percept Perform. 2013;39(1):1–5. 10.1037/a0030954 [DOI] [PubMed] [Google Scholar]
  • 30.McVay JC, Kane MJ. Drifting from slow to “D’oh!”: working memory capacity and mind wandering predict extreme reaction times and executive control errors. J Exp Psychol Learn Mem Cogn. 2012;38(3):525–49. 10.1037/a0025896 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Bastian M, Sackur J. Mind wandering at the fingertips: Automatic parsing of subjective states based on response time variability. Front Psychol. 2013;4(SEP):1–11. 10.3389/fpsyg.2013.00573 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Braver TS. The variable nature of cognitive control: A dual mechanisms framework. Trends Cogn Sci [Internet]. 2012;16(2):106–13. Available from: 10.1016/j.tics.2011.12.010 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Unsworth N, Robison MK. Pupillary correlates of lapses of sustained attention. Cogn Affect Behav Neurosci. 2016;16(4):601–15. 10.3758/s13415-016-0417-4 [DOI] [PubMed] [Google Scholar]
  • 34.Huijser S, van Vugt MK, Taatgen NA. The wandering self: Tracking distracting self-generated thought in a cognitively demanding context. Conscious Cogn. 2018;58. [DOI] [PubMed] [Google Scholar]
  • 35.Hoeks B, Levelt WJM. Pupillary dilation as a measure of attention: A quantitative system analysis. Behav Res Methods, Instruments, Comput. 1993;25(1):16–26. [Google Scholar]
  • 36.Wierda SM, Van Rijn H, Taatgen NA, Martens S. Pupil dilation deconvolution reveals the dynamics of attention at high temporal resolution. Proc Natl Acad Sci U S A. 2012;109(22):8456–60. 10.1073/pnas.1201858109 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Gilzenrat MS, Nieuwenhuis S, Jepma M, Cohen JD. Pupil diameter tracks changes in control state predicted by the adaptive gain theory. Cogn Affect Behav Neurosci. 2010;10(2):252–69. 10.3758/CABN.10.2.252 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Unsworth N, Robison MK. Individual differences in the allocation of attention to items in working memory: Evidence from pupillometry. Psychon Bull Rev. 2014;757–65. [DOI] [PubMed] [Google Scholar]
  • 39.Mittner M, Boekel W, Tucker AM, Turner BM, Heathcote A, Forstmann BU. When the Brain Takes a Break: A Model-Based Analysis of Mind Wandering. J Neurosci. 2014;34(July):16286–95. 10.1523/JNEUROSCI.2062-14.2014 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Grandchamp R, Braboszcz C, Delorme A. Oculometric variations during mind wandering. Front Psychol. 2014;5(FEB). 10.3389/fpsyg.2014.00031 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Jubera-García E, Gevers W, Opstal F Van. Influence of content and intensity of thought on behavioral and pupil changes during active mind-wandering, off-focus, and on-task states. Attention, Perception, Psychophys. 2019;1–11. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Murphy PR, O’ Connell RG, O’ Sullivan M, Robertson IH, Balsters JH. Pupil diameter covaries with BOLD activity in human locus coeruleus. Hum Brain Mapp. 2014;35(8):4140–54. 10.1002/hbm.22466 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Murphy PR, Robertson IH, Balsters JH, O’Connell RG. Pupillometry and p3 index the locus coeruleus-noradrenergic arousal function in humans. Psychophysiology. 2011;48(11):1532–43. 10.1111/j.1469-8986.2011.01226.x [DOI] [PubMed] [Google Scholar]
  • 44.Aston-Jones G, Cohen JD. An Integrative Theory of Locus Coeruleus-Norepinephrine Function: Adaptive Gain and Optimal Performance. Annu Rev Neurosci. 2005;28:403–50. 10.1146/annurev.neuro.28.061604.135709 [DOI] [PubMed] [Google Scholar]
  • 45.Siegle GJ, Steinhauer SR, Carter CS, Ramel W, Thase ME. Do the seconds turn into hours? Relationships between sustained pupil dilation in response to emotional information and self-reported rumination. Cognit Ther Res. 2003;27(3):365–82. [Google Scholar]
  • 46.Duque A, Sanchez A, Vazquez C. Gaze-fixation and pupil dilation in the processing of emotional faces: The role of rumination. Cogn Emot [Internet]. 2014;28(8):1347–66. Available from: 10.1080/02699931.2014.881327 [DOI] [PubMed] [Google Scholar]
  • 47.Konishi M, Brown K, Battaglini L, Smallwood J. When attention wanders: Pupillometric signatures of fluctuations in external attention. Cognition [Internet]. 2017;168:16–26. Available from: 10.1016/j.cognition.2017.06.006. [DOI] [PubMed] [Google Scholar]
  • 48.Harrison NA, Singer T, Rotshtein P, Dolan RJ, Critchley HD. Pupillary contagion: central mechanisms engaged in sadness processing. Soc Cogn Affect Neurosci. 2006;1(1):5–17. 10.1093/scan/nsl006 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Franklin MS, Broadway JM, Mrazek MD, Smallwood J, Schooler JW. Window to the wandering mind: pupillometry of spontaneous thought while reading. Q J Exp Psychol (Hove) [Internet]. 2013;66(May 2015):2289–94. Available from: http://www.ncbi.nlm.nih.gov/pubmed/24313285. [DOI] [PubMed] [Google Scholar]
  • 50.McVay JC, Kane MJ. Dispatching the wandering mind? Toward a laboratory method for cuing “spontaneous” off-task thought. Front Psychol. 2013;4(SEP):0–16. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Klinger E, Cox WM. Motivation and the Goal Theory of Current Concerns In: Cox WM, Klinger E, editors. Handbook of Motivational Counseling: Goal-Based Approaches to Assessment and Intervention with Addiction and Other Problems. Chichester: John Wiley & Sons; 2004. p. 3–29. [Google Scholar]
  • 52.Carver CS, White TL. Behavioral inhibition, behavioral activation, and affective responses to impending reward and punishment: The BIS/BAS Scales. J Pers Soc Psychol. 1994;67:319–33. [Google Scholar]
  • 53.Verplanken B, Friborg O, Wang CE, Trafimow D, Woolf K. Mental habits: Metacognitive reflection on negative self-thinking. J Pers Soc Psychol. 2007;92(3):526–41. 10.1037/0022-3514.92.3.526 [DOI] [PubMed] [Google Scholar]
  • 54.Keuleers E, Brysbaert M, New B. SUBTLEX-NL: A new measure for Dutch word frequency based on film subtitles. Behav Res Methods. 2010;42(3):643–50. 10.3758/BRM.42.3.643 [DOI] [PubMed] [Google Scholar]
  • 55.Baayen RH, Piepenbrock R, van Rijn H. The CELEX Lexical Database [CD-ROM]. Philadelphia, PA: Linguistic Data Consortium, University of Pennsylvania; 1993. [Google Scholar]
  • 56.Peirce J, Gray JR, Simpson S, MacAskill M, Höchenberger R, Sogo H, et al. PsychoPy2: Experiments in behavior made easy. Behav Res Methods. 2019;51(1):195–203. 10.3758/s13428-018-01193-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Barr DJ, Levy R, Scheepers C, Tily HJ. Random effects structure for confirmatory hypothesis testing: Keep it maximal. J Mem Lang [Internet]. 2013;68(3):255–78. Available from: 10.1016/j.jml.2012.11.001 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Bates D, Kliegl R, Vasishth S, Baayen H. Parsimonious mixed models. arXiv Prepr 150604967. 2015. [Google Scholar]
  • 59.Matuschek H, Kliegl R, Vasishth S, Baayen H, Bates D. Balancing Type I error and power in linear mixed effects models. J Mem Lang. 2017;64:305–15. [Google Scholar]
  • 60.Wood SN. Generalized additive models: An introduction with R. 2nd ed Boca Raton, FL: Chapman and Hall/CRC; 2017. [Google Scholar]
  • 61.Wood SN. Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. J R Stat Soc. 2011;73(1):3–36. [Google Scholar]
  • 62.Mittner M, Hawkins GE, Boekel W, Forstmann BU. A Neural Model of Mind Wandering. Trends Cogn Sci [Internet]. 2016;20(8):570–8. Available from: 10.1016/j.tics.2016.06.004 [DOI] [PubMed] [Google Scholar]
  • 63.Brisson J, Mainville M, Mailloux D, Beaulieu C, Serres J, Sirois S. Pupil diameter measurement errors as a function of gaze direction in corneal reflection eyetrackers. Behav Res Methods [Internet]. 2013;45(4):1322–31. Available from: http://www.ncbi.nlm.nih.gov/pubmed/23468182. 10.3758/s13428-013-0327-0 [DOI] [PubMed] [Google Scholar]
  • 64.Gagl B, Hawelka S, Hutzler F. Systematic influence of gaze position on pupil size measurement: Analysis and correction. Behav Res Methods. 2011;43(4):1171–81. 10.3758/s13428-011-0109-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Wood SN, Pya N, Saefken B. Smoothing parameter and model selection for general smooth models. J Am Stat Assoc. 2016;111:1548–75. [Google Scholar]
  • 66.Wood SN, Goude Y, Shaw S. Generalized additive models for large data sets. J R Stat Soc Ser C Appl Stat. 2015;64(1):139–55. [Google Scholar]
  • 67.van Rij J, Hendriks P, van Rijn H, Baayen RH, Wood SN. Analyzing the Time Course of Pupillometric Data. Trends Hear. 2019;23:1–22. 10.1177/2331216519832483 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Bates D, Maechler M, Bolker B, Walker S. lme4: Linear mixed-effects models using Eigen and S4. R package version 1.1–19. 2018. [Google Scholar]
  • 69.Wood SN. mgcv. Mixed gam computation vehicle with automatic smoothness estimation [Internet]. Comprehensive R Archive Network, CRAN; 2017. Available from: https://cran.r-project.org/web/packages/mgcv. [Google Scholar]
  • 70.van Rij J, Wieling M, Baayen RH, Van Rijn H. itsadug: Interpreting Time Series and Autocorrelated Data using GAMMs [Internet]. Comprehensive R Archive Network, CRAN; 2017. Available from: https://cran.r-project.org/web/packages/itsadug. [Google Scholar]
  • 71.Smallwood J, Davies JB, Heim D, Finnigan F, Sudberry M, O’Connor R, et al. Subjective experience and the attentional lapse: Task engagement and disengagement during sustained attention. Conscious Cogn. 2004;13(4):657–90. 10.1016/j.concog.2004.06.003 [DOI] [PubMed] [Google Scholar]
  • 72.Van Den Brink RL, Murphy PR, Nieuwenhuis S. Pupil diameter tracks lapses of attention. PLoS One. 2016;11(10):1–16. 10.1371/journal.pone.0165274 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Cheyne AJ, Solman GJF, Carriere JSA, Smilek D. Anatomy of an error: A bidirectional state model of task engagement/disengagement and attention-related errors. Cognition. 2009;111(1):98–113. 10.1016/j.cognition.2008.12.009 [DOI] [PubMed] [Google Scholar]
  • 74.Smallwood J, Schooler JW. The restless mind. Psychol Bull. 2006;132(6):946–58. 10.1037/0033-2909.132.6.946 [DOI] [PubMed] [Google Scholar]
  • 75.Manly T, Robertson IH, Galloway M, Hawkins K. The absent mind: Further investigations of sustained attention to response. Neuropsychologia. 1999;37(6):661–70. 10.1016/s0028-3932(98)00127-4 [DOI] [PubMed] [Google Scholar]
  • 76.Dang JS, Figueroa IJ, Helton WS. You are measuring the decision to be fast, not inattention: the Sustained Attention to Response Task does not measure sustained attention. Exp Brain Res [Internet]. 2018;236(8):2255–62. Available from: 10.1007/s00221-018-5291-6 [DOI] [PubMed] [Google Scholar]
  • 77.Finkbeiner KM, Wilson KM, Russell PN, Helton WS. The effects of warning cues and attention-capturing stimuli on the sustained attention to response task. Exp Brain Res. 2015;233(4):1061–8. 10.1007/s00221-014-4179-3 [DOI] [PubMed] [Google Scholar]
  • 78.Hiatt LM, Trafton JG. A Computational Model of Mind Wandering. 2013;914–9. [Google Scholar]
  • 79.Whitfield-Gabrieli S, Ford JM. Default Mode Network Activity and Connectivity in Psychopathology. Annu Rev Clin Psychol. 2012;8(1):49–76. 10.1146/annurev-clinpsy-032511-143049 [DOI] [PubMed] [Google Scholar]
  • 80.Anticevic A, Cole MW, Murray JD, Corlett PR, Wang X-J, Krystal JH. The Role of Default Network Deactivation in Cognition and Disease. Trends Cogn Sci. 2013;16(12):584–92. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Ellamil M, Fox KCR, Dixon ML, Pritchard S, Todd RM, Thompson E, et al. Dynamics of neural recruitment surrounding the spontaneous arising of thoughts in experienced mindfulness practitioners. Neuroimage [Internet]. 2016;136:186–96. Available from: 10.1016/j.neuroimage.2016.04.034 [DOI] [PubMed] [Google Scholar]
  • 82.Christoff K. Undirected thought: Neural determinants and correlates. Brain Res. 2012;1428:51–9. 10.1016/j.brainres.2011.09.060 [DOI] [PubMed] [Google Scholar]
  • 83.Fox KCR, Nathan Spreng R, Ellamil M, Andrews-Hanna JR, Christoff K. The wandering brain: Meta-analysis of functional neuroimaging studies of mind-wandering and related spontaneous thought processes. Neuroimage. 2015;111:611–21. 10.1016/j.neuroimage.2015.02.039 [DOI] [PubMed] [Google Scholar]
  • 84.Hamilton JP, Etkin A, Furman DJ, Lemus MG, Johnson RF, Gotlib IH. Functional Neuroimaging of Major Depressive Disorder: A Meta-Analysis and New Integration of Baseline Activation and Neural Response Data. Am J Psychiatry. 2012;169(7):693–703. 10.1176/appi.ajp.2012.11071105 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Kam JWY, Handy TC. The neurocognitive consequences of the wandering mind: a mechanistic account of sensory-motor decoupling. Front Psychol [Internet]. 2013;4(October):725 Available from: http://www.ncbi.nlm.nih.gov/pubmed/24133472. 10.3389/fpsyg.2013.00725 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86.Jin CY, Borst JP, van Vugt MK. Predicting task-general mind-wandering with EEG. Cogn Affect Behav Neurosci. 2019;(March):1059–73. 10.3758/s13415-019-00707-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.Morad Y, Lemberg H, Yofe N, Dagan Y. Pupillography as an objective indicator of fatigue. Curr Eye Res. 2000;21(1):535–42. [PubMed] [Google Scholar]
  • 88.Wilhelm B, Giedke H, Lüdtke H, Bittner E, Hofmann A, Wilhelm H. Daytime variations in central nervous system activation measured by a pupillographic sleepiness test. J Sleep Res. 2001;10(1):1–7. 10.1046/j.1365-2869.2001.00239.x [DOI] [PubMed] [Google Scholar]

Decision Letter 0

Myrthe Faber

23 Jan 2020

PONE-D-19-32012

Captivated by thought: "sticky" thinking leaves traces of perceptual decoupling in task-evoked pupil size

PLOS ONE

Dear Mr. Huijser,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

As you can see below, I have received two expert reviews. Each reviewer raises a number of important points that need to be addressed. In particular, both reviewers indicate that the interpretations and conclusions with regard to the processes at work here might not be appropriately based on the data presented (see Reviewer 1, main point 2, and Reviewer 2, main point 2). This is an important issue, as PLOS ONE specifically requires that “the data presented in the manuscript must support the conclusions drawn. Submissions will be rejected if the interpretation of results is unjustified or inappropriate, so authors should avoid overstating their conclusions. Authors may discuss possible implications for their results as long as these are clearly identified as hypotheses instead of conclusions.” It is therefore essential to address these issues in a revision.

Furthermore, both reviewers have questions with regard to the sample size, the validity of the sticky thought measure, and analytical choices concerning this measure and other measures. PLOS ONE requires that sample sizes must be large enough to produce robust results, so it is necessary to address this. It is also necessary to address the validity of the measure (also see Reviewer 1, main point 1 and below), the number of comparisons, and other analytical choices, and it is necessary to report model data for all comparisons and descriptive statistics for all variables.

Both reviewers express concerns with regard to the theoretical validity of the concept of sticky thoughts. They point out potential overlap with other types of thoughts defined elsewhere in the literature, and give excellent suggestions for further improving the theoretical clarity of your manuscript. Finally, both reviewers provide an extensive list of minor issues (clarifications, typos, missing items/legends in figures, etc.) that need to be addressed.

I would like to invite a revision that addresses these issues.

We would appreciate receiving your revised manuscript by Mar 08 2020 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Myrthe Faber

Academic Editor

PLOS ONE

Journal Requirements:

1. When submitting your revision, we need you to address these additional requirements.

Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

http://www.journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and http://www.journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The manuscript describes a study investigating the pupillometric and behavioral performance correlates of a category of thought dubbed "sticky thinking", a form of perseverative thought that is hard to disengage from. I think that the study is interesting and there is value in studying this particular type of thinking; I also liked the transparency with which the methods and analyses are described, and I think that GAMMs are very promising to describe this type of data. I also have some major and minor comments that, from my point of view, could help improve the paper.

My main comments are about the two central topics in the manuscript: sticky thought, and perceptual decoupling. Regarding sticky thought, I see the authors introducing this concept as a (somehow) new category of thought. However it is not very clear (especially in the Introduction) how it differs from other types of mind-wandering or self-generated thought, and how it relates to them. The whole mind-wandering literature is plagued with a problem of semantics: researchers in the field often use different words to refer to the same thing, or the same words for different things. I think that in this context, conceptual clarity is even more important. For how it is described in the Introduction, sticky thought seems greatly conflated with general mind-wandering: e.g., the descriptions in line 39 to 43 could easily be used for mind-wandering. Sticky thought also seems conflated with rumination, a negative form of perseverative thought common in psychopathologies like major depressive disorder or anxiety disorders. I think that, if such a new concept is to be introduced in the literature, it should be clearly defined relative to existing concepts: for example, the authors say that it is a rigid and narrow-focused thought process that is hard to disengage from, but how is it different from rumination? I imagine that the authors implicitly assume that the stickiness of thought is a category of thought that is orthogonal to the valence of the thought (so that it can also be not negative), but in this case "neutral" and "positive" sticky thought should exist. Is this the case? Is there research on this, or could the authors provide examples? As the debate on the categories of thought is an ongoing one (see Seli et al., 2018, TiCS; Christoff et al., 2018, TiCS), I think it would be helpful to better define the concept of sticky thought relative to other types of thought. Moreover, for example the evidence provided in lines 46-54, or lines 65-67, is all related to rumination/worry, but the authors’ conclusions generalise to sticky thoughts: this does not seem warranted. While the idea of a “sticky” factor of thought is interesting, as it is now, I was not 100% convinced that it exists outside of rumination-like thoughts.

My other main comment is on the concept of perceptual decoupling. This refers to a hypothesized process by the brain, aimed at insulating the mental stream of thought from external (perceptual) distractions. It is an interesting hypothesis, but it is very much open to debate whether such a process exists in the brain, and evidence is still limited. One previous study (Smallwood et al., 2011, Plos One) somewhat linked this hypothesized process to a reduction in pupil sizes during on-task and off-task periods of a sustained attention task; however, this is far from proving that small pupils are an index of perceptual decoupling. While reading the manuscript, I feel that the authors imply that finding smaller task-evoked pupil responses automatically signals a perceptual decoupling process in action (e.g. lines 145,146, or 627-632). This seems like a form of reverse inference to me. The fact that perceptual decoupling (if it exists) might be linked to smaller pupils does not necessarily mean that finding small pupils means that the brain is decoupling from the external environment. If this hypothesized process is really at work, we should observe other concurrent measures of reduced processing of the external environment: I don't think that the current paradigm in the study allows one to do that, and additionally, there appears to be evidence for no differences in behavioral indexes of RT and RT variability between sticky and non-sticky thinking (e.g., line 623). While the stickiness factor did discriminate in no-go accuracy, this is a finding also common for mind-wandering thoughts in general, and is not intrinsically linked to perceptual decoupling. It is also not clear what the mechanism that would link smaller pupil task-evoked responses to a perceptual decoupling process would be. All in all, I do not think that the current study provides enough evidence to warrant strong affirmations such as those in lines 627-632, or lines 686-688. I don't know if the authors agree, but my suggestion would be to use a much more cautious language throughout the manuscript.

Some other comments that I have that I hope the authors will find useful:

Lines 58-84: This is true but obviously not limited to sticky thought, but is one of the main obstacles in the field of mind-wandering and of consciousness research in general. I feel like the whole paragraph could be shortened if the authors have an interest in doing so, given that thought probes are pretty much the standard way in the field to study these types of ongoing thoughts.

Lines 93-95: I have read the cited study and I couldn’t find some of the results here described (e.g. off-task thought, but not stickiness specifically, seemed to be related to task accuracy). Could the authors double-check? My apologies in advance if I missed or misread something.

Line 97: One key reference that could be added here is Seli, Cheyne & Smilek (2013), JEP:HPP.

Lines 141-142: This is slightly misleading as sticky thought is a novel concept. I do believe that there is some pupillometric research on rumination (e.g. Siegle, Steinhauser, Carter, Ramel & Thase, 2003), which could be added here, if the authors think it would be interesting for their argument.

Lines 147-149: I was confused by this sentence, as it is possible to dissociate, and measure separately, baseline to task-evoked pupil responses. Could the authors clarify?

Lines 149-151: Another study that could be discussed here and potentially in other parts of the manuscript is Konishi, Brown, Battaglini & Smallwood (2017), in which the authors find smaller baseline pupils for off-task thought, and specifically for negatively valenced and intrusive thoughts. As an author of that study I have a conflict of interest in pointing to it, but it seems to have some obvious links to the present manuscript (e.g. the concepts of intrusive and negative thoughts seems very close to those of rumination/worry).

Lines 154-158: Indeed both smaller and larger baseline pupil sizes have been found in the literature (e.g. Gilzenrat, Niewenhuis, Jepma, Cohen, 2010; Smallwood et al., 2011, 2012, Van den Brink, Murphy & Niewenhuis, 2016; Van Orden, Jung & Makeig, 2000; Konishi et al., 2017; Unsworth & Robison, 2016), and it is still not 100% clear what factors account for these differences.

Lines 163-165: This seems a little bit like a filler sentence. Maybe some concrete examples could be provided.

Line 166: a “which” seems to be missing in the phrase “in we embedded”.

Lines 170-174: To me it appears that these sentences construct a false dichotomy between sticky thoughts and general off-task thought/mind-wandering. For example, all the predictions described here can relate to off-task thought too.

Lines 182-183: How was the number of participants decided?

Lines 231-232: I think I missed how the experiment included 16 personal concern triplets, if the authors selected the 2 main concerns for each participant, and then translated each into a triplet of words. Sorry in advance if I misunderstood.

Lines 249-251: Was this second question always presented, even if for example, participants reported to be on-task or externally distracted in the first question? Were the authors still analysing such cases?

Figure 2: There’s a small typo in the third question (maters instead of matters).

Line 263: A reference could be added for PsychoPy (the most recent study is Peirce, Gray, Simpson, et al., 2019, Behavior Research Methods).

Line 304-305: It seems like the fixed order would cue participants to know in advance when a thought-probe would be presented. This could be seen as an issue, do the authors have any opinions on this?

Lines 322-324: How were these cut-offs decided?

Lines 364-365: This decision seems a bit strange in the context of pupil size and arousal, as external distraction is likely arousing, the opposite of inalertness.

Lines 367-368: What was the reason? What was the point of including a 6 point scale if it’s not used in the analyses?

Lines 481-483: Apologies in advance as I’m not sure if I’ve missed it, but was this result also taking into account the overall proportion of mind-wandering reports? If not, this result would be confounded by that, as they also increase over time.

Figures 5 and 6: The legend for the plot on the right could be a bit clearer (or is missing).

Lines 511-512: There is a serious and common problem in these designs, in which a thought-probe is presented always after a target stimulus. Following a mistake, participants might confabulate and report they were not on-task. This is a hard problem to overcome, but I think that this possibility should be discussed, given that the whole field relies on self-reports.

Lines 516-519: Again, I am not 100% sure if I missed this, but was this analysis also taking into account the underlying proportion of mind-wandering thoughts?

Lines 534-535: Did the authors analyse baseline pupil sizes across attentional states with simpler models such as LMMs? Given the amount of previous research on this, I’m just wondering if the null result depends somewhat on the GAMMs.

Lines 565-567: It seems a bit strange that the difference is smaller between sticky and non-sticky than between sticky and neutral. If these are opposite ends of a continuous state, one would expect the difference to be bigger between sticky and non-sticky. Do the authors have any comments on this?

Figure 10: The legend here could be clearer. Maybe it’s better to use colored lines instead of different patterns. It could also be that it wasn’t very clear because the image included in the review was lo-res.

Line 654: Minor typo, missing a “to” in “compared neutral/non-sticky”.

General: Would it be possible to have a table or a description of the best fitting models for each analysis conducted?

General: Did the authors measure and check the response times to the thought-probes? It sometimes happens that some participants will start to respond very quickly to those probes (because they want to finish the experiment faster). Such fast responses should be discarded, as it is debatable if the participants can introspect and report on their previous mental states so quickly.

Reviewer #2: Thank you for inviting me to review this paper. I thought it was generally well-written and easily understandable. The concept of sticky thought also seems important and relevant to the field of thought in general. For the most part, I think the methods, procedures, and results were described well. At the same time, I have some significant concerns that I think should be addressed before publication. Many of the central issues (detailed below) deal with conceptual framing, theoretical importance, and analytical choices, and can potentially be addressed with a substantial revision. However, at least one of the main issues pertains to sample size (here only 34 participants were used); I believe the authors may want to consider a replication or additional data collection.

One of the main concerns I have is the limited literature covered in the Introduction. Although I think the authors have written concisely, key literature is missing. For example, an expanded discussion on the relationship between sticky thought, possibly negative valence, and arousal might help; it is only briefly mentioned now. Part of this relates to the authors' choice not to make a hypothesis about the direction of sticky thought with baseline activity. This choice is completely fine, but the review of relevant concepts needed to make this case is still missing. There are many studies regarding rumination, etc. More directly related may be Christoff et al.'s (2016) paper which explicitly discusses concepts related to sticky thought, and some of Smallwood's multi-dimensional experience sampling studies may also prove useful.

I also found the discussion section to be somewhat speculative without many clear theoretical implications. The authors attempt to explain some of their results with relatively shallow explanations – e.g., why they did not replicate past studies and why sticky thought was not influenced by personal concerns.

Sample size is a major concern given that no a priori power analysis is mentioned. The authors cite a previous study on sticky thought which may have been used to estimate an effect size. This field typically sees small (to medium) effect sizes, and these are not necessarily addressed by using lmer. The authors do mention computing BF in the discussion for one of the results that did not replicate, but I do not think this necessarily justifies the sample size for all the various relationships tested, given the relatively low effect sizes seen for mind wandering (and other related dimensions) and performance in the literature.

One of the other main concerns is the validity of measuring sticky thought. Figure 2 and the in-text wording do not match. How were participants trained/instructed about this question? The authors also note in the discussion that people may not be able to discern different levels of sticky thought using their method. While I appreciate the honesty, it seems like it could be a problem with the study design/materials used, since the research question was not to test a measure of sticky thought but rather the findings/interpretation depend on a reliable measure.

I was surprised to see that the authors chose to treat the sticky thoughts as categorical from the outset and also their choice to bin into three categories. There also did not appear to be a consideration for the responses in other categories as part of their models. Moreover, although the authors did not dichotomize, Seli, Beaty, et al. recently made a case to avoid binning thought dimensions because it can artificially inflate rates.

Related, this also increases the number of comparisons made when binning. In general, there were a large number of tests computed here. Was the number of tests considered when calculating significance?

During the statistical analyses section and at the outset of the results section, the authors mention time/timeseries as an important factor to consider. Based on how this shaped their entire analytical approach, I think it might be factored into the Introduction and theoretical motivation a bit earlier. The authors also bring up their goal to assess whether they could induce sticky thought, as if this was one of their main questions. However, the paper was not framed to address this as a main question from the Introduction, so I suggest making this more explicit from the outset.

Please report all descriptive statistics for all variables.

Please also provide model data on all those constructed in a Table and mention them in the text. For example, the model comparison for sticky vs non sticky in terms of task evoked pupil size is not mentioned.

Why were go and no go trials analyzed separately?

The authors also do not mention other related metrics assessing dynamic measures of thought such as freely-moving thought. The authors may also consider making a point about the fact that sticky thought appears to be different from task unrelated thought, making it an important dimension to study.

Figure 1’s caption mentions performance but it is not in the figure.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Mahiko Konishi

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

Attachment

Submitted filename: Review_ Captivated by thought.pdf

PLoS One. 2020 Dec 9;15(12):e0243532. doi: 10.1371/journal.pone.0243532.r002

Author response to Decision Letter 0


21 Jul 2020

Reviewer 1

My main comments are about the two central topics in the manuscript: sticky thought, and perceptual decoupling.

• Regarding sticky thought, I see the authors introducing this concept as a (somehow) new category of thought. However, it is not very clear (especially in the Introduction) how it differs from other types of mind-wandering or self-generated thought, and how it relates to them. The whole mind-wandering literature is plagued with a problem of semantics: researchers in the field often use different words to refer to the same thing, or the same words for different things. I think that in this context, conceptual clarity is even more important. From how it is described in the Introduction, sticky thought seems greatly conflated with general mind-wandering: e.g., the descriptions in lines 39 to 43 could easily be used for mind-wandering. Sticky thought also seems conflated with rumination, a negative form of perseverative thought common in psychopathologies like major depressive disorder or anxiety disorders. I think that, if such a new concept is to be introduced in the literature, it should be clearly defined relative to existing concepts: for example, the authors say that it is a rigid, narrow-focused thought process that is hard to disengage from, but how is it different from rumination? I imagine that the authors implicitly assume that the stickiness of thought is a category of thought that is orthogonal to the valence of the thought (so that it can also be not negative), but in this case "neutral" and "positive" sticky thought should exist. Is this the case? Is there research on this, or could the authors provide examples? As the debate on the categories of thought is an ongoing one (see Seli et al., 2018, TiCS; Christoff et al., 2018, TiCS), I think it would be helpful to better define the concept of sticky thought relative to other types of thought. Moreover, the evidence provided in lines 46-54, or lines 65-67, is all related to rumination/worry, but the authors’ conclusions generalise to sticky thoughts: this does not seem warranted. While the idea of a “sticky” factor of thought is interesting, as it is now, I was not 100% convinced that it exists outside of rumination-like thoughts.

We completely agree that the literature is plagued by too many different kinds of thought. We have clarified the concept of “sticky thought” on p.3 (also copied below). In general, we conceive of sticky thought as being similar to rumination, but not necessarily restricted to clinical contexts. Hence, we have decided to not refer to it as “rumination” because that concept tends to be restricted to clinical contexts, and here we do not make any diagnoses nor do we work with clinical samples.

New text: “An extreme form of such sticky thought is rumination, a rigid and narrow-focused thought process that is hard to disengage from and often negative in valence and self-related (4). In general, rumination causes individuals to be unable to concentrate and devote their attention to tasks at hand because attention is focused internally instead (2). However, in contrast to rumination, sticky thoughts could also have a positive valence, for example when we are caught up in a pleasant fantasy that we do not want to let go of, or thoughts with desire for a delicious cookie keep recurring in our minds (5,6). Another related term for sticky thought is perseverative cognition. Perseverative cognition has been associated with activation of the physiological stress system, and has been proposed to play a key role in the onset and maintenance of depression (7) and anxiety (8,9). Finally, sticky thought is closely related to the concept of constrained mind-wandering (10,11). Constrained mind-wandering is a form of mind-wandering in which thoughts cannot move freely but instead are restricted to cycling through the same narrow sets of thoughts again and again. It is different in the question that is posed to the participant—while sticky refers to the experience of the participant that it is difficult to drop the thought, constrained refers to participants’ experience of thinking the same thought again and again. “

• My other main comment is on the concept of perceptual decoupling. This refers to a hypothesized process by the brain, aimed at insulating the mental stream of thought from external (perceptual) distractions. It is an interesting hypothesis, but it is very much open to debate whether such a process exists in the brain, and evidence is still limited. One previous study (Smallwood et al., 2011, Plos One) somewhat linked this hypothesized process to a reduction in pupil sizes during on-task and off-task periods of a sustained attention task; however, this is far from proving that small pupils are an index of perceptual decoupling. While reading the manuscript, I feel that the authors imply that finding smaller task-evoked pupil responses automatically signals a perceptual decoupling process in action (e.g. lines 145,146, or 627-632). This seems like a form of reverse inference to me. The fact that perceptual decoupling (if it exists) might be linked to smaller pupils does not necessarily mean that finding small pupils means that the brain is decoupling from the external environment. If this hypothesized process is really in action, we should observe other concurrent measures of reduced processing of the external environment: I don't think that the current paradigm in the study allows for that, and additionally, there appears to be evidence for no differences in behavioral indexes of RT and RT variability between sticky and non-sticky thinking (e.g., line 623). While the stickiness factor did discriminate in no-go accuracy, this is a finding also common for mind-wandering thoughts in general, and is not intrinsically linked to perceptual decoupling. It is also not clear what the mechanism that would link smaller pupil task-evoked responses to a perceptual decoupling process would be. All in all, I do not think that the current study provides enough evidence to warrant strong affirmations such as those in lines 627-632, or lines 686-688. I don't know if the authors agree, but my suggestion would be to use a much more cautious language throughout the manuscript.

We made the link between task-evoked pupil size and perceptual decoupling because the magnitude of task-evoked responses in pupil size (TERPs) has been strongly linked with the allocation of attention. Hence, smaller TERPs would indicate reduced allocation of attention to external stimuli and, therefore, a decoupling of attention. However, after reading the reviewer's comment, we agree that more cautious language is warranted. The hypothesized process of perceptual decoupling involves more than just allocating less attention to the surroundings; it also assumes inhibitory mechanisms that protect the internal state. This, as far as we know, cannot be measured with TERPs. We changed our wording in the referenced sentences in the Introduction and Discussion section (see p. 28, 30).

New text:

While behavioral indices were similar, task-evoked responses did differ. We observed a smaller task-evoked response in pupil size for go trials during episodes of sticky thought, suggesting that less attention is allocated to task processing than during episodes of neutral or non-sticky thought. Hence, sticky thought can be detected by looking for signs of reduced task processing in task-evoked pupil dilation, even while behavior appears similar. (p. 28)

Triangulating between thought probe reports, behavior, and task-evoked pupil dilation demonstrated that episodes of sticky thought involve reduced attention towards the ongoing task. The reduced attention to the task may point to a process called perceptual decoupling. Perceptual decoupling is a hypothesized process that functions to insulate internal thought from external (perceptual) distractions, possibly through inhibitory mechanisms (15,85,86). While the present experiment was not designed to investigate this process, follow-up research may further investigate the role of perceptual decoupling in sticky thought with concurrent measures. (p. 30)

• Lines 58-84: This is true but obviously not limited to sticky thought, but is one of the main obstacles in the field of mind-wandering and of consciousness research in general. I feel like the whole paragraph could be shortened if the authors have an interest in doing so, given that thought probes are pretty much the standard way in the field to study these types of ongoing thoughts.

We shortened the paragraph by removing lines 75 – 84 (see p. 5).

• Lines 93-95: I have read the cited study and I couldn’t find some of the results here described (e.g. off-task thought, but not stickiness specifically, seemed to be related to task accuracy). Could the authors double-check? My apologies in advance if I missed or misread something.

Thank you very much for noticing this error. Indeed, in this paper we only show relations between stickiness and response time variability, not accuracy. We have corrected this in the manuscript.

• Line 97: One key reference that could be added here is Seli, Cheyne & Smilek (2013), JEP:HPP.

We added the reference.

• Lines 141-142: This is slightly misleading as sticky thought is a novel concept. I do believe that there is some pupillometric research on rumination (e.g. Siegle, Steinhauser, Carter, Ramel & Thase, 2003), which could be added here, if the authors think it would be interesting for their argument.

In response to this comment we reformulated the referenced sentence and added the suggested article to the discussion (p. 8).

New text:

Since stickiness (i.e., the difficulty in disengaging from thought) is a novel topic, no studies have directly investigated how it is reflected in baseline and task-evoked pupil size. Nonetheless, predictions can be made based on related research and the adaptive gain curve. Given the disruptiveness of sticky thought to ongoing activities, we may expect that sticky thought, similar to self-generated thinking, is associated with smaller task-evoked responses in pupil size. As predicted by adaptive gain (see Fig 1 above), a smaller task-evoked response in pupil size with episodes of sticky thought would imply that the thought process is associated with either smaller or larger than average baseline pupil size. However, which one is open to debate. In clinical samples, Siegle et al. (45) found that rumination was associated with larger baseline pupil sizes. The researchers hypothesized that this larger baseline pupil size reflected sustained emotional processing (46). In contrast, Konishi et al. (47) found that in non-clinical samples, negative and intrusive thoughts were associated with smaller baseline pupil size (48). (p. 8)

• Lines 147-149: I was confused by this sentence, as it is possible to dissociate, and measure separately, baseline to task-evoked pupil responses. Could the authors clarify?

It is indeed true that both can be measured separately, however, they are not completely unrelated. As explained in the aforementioned paragraph, the relationship between baseline (tonic) and task-evoked pupil size (phasic) is suggested to follow an adaptive gain curve. To clarify this, we added ‘As predicted by adaptive gain’ in front of the referenced sentence (p. 7).

• Lines 149-151: Another study that could be discussed here and potentially in other parts of the manuscript is Konishi, Brown, Battaglini & Smallwood (2017), in which the authors find smaller baseline pupils for off-task thought, and specifically for negatively valenced and intrusive thoughts. As an author of that study I have a conflict of interest in pointing to it, but it seems to have some obvious links to the present manuscript (e.g. the concepts of intrusive and negative thoughts seems very close to those of rumination/worry).

We thank you for the suggested article and have included a reference to it in the discussion (p. 7-8).

• Lines 154-158: Indeed both smaller and larger baseline pupil sizes have been found in the literature (e.g. Gilzenrat, Niewenhuis, Jepma, Cohen, 2010; Smallwood et al., 2011, 2012, Van den Brink, Murphy & Niewenhuis, 2016; Van Orden, Jung & Makeig, 2000; Konishi et al., 2017; Unsworth & Robison, 2016), and it is still not 100% clear what factors account for these differences.

We agree.

• Lines 163-165: This seems a little bit like a filler sentence. Maybe some concrete examples could be provided

We decided to remove the sentence.

• Line 166: a “which” seems to be missing in the phrase “in we embedded”.

We added “which”, so that the phrase now reads “in which we embedded”.

• Lines 170-174: To me it appears that these sentences construct a false dichotomy between sticky thoughts and general off-task thought/mind-wandering. For example, all the predictions described here can relate to off-task thought too.

This may indeed appear so; however, we do not believe there is a false dichotomy here. The main reason is that stickiness is a dimension that deals with the dynamics of thought, whereas the on-task/off-task dimension deals with attentional state/thought content. The predictions are similar because sticky thoughts are likely also off-task thoughts. However, that does not imply that the two dimensions are the same.

• Lines 182-183: How was the number of participants decided?

No formal method was used, but we aimed for at least 30 participants to have sufficient numbers for the computation of reliable means and to reliably estimate the random effects.

• Lines 231-232: I think I missed how the experiment included 16 personal concern triplets, if the authors selected the 2 main concerns for each participant, and then translated each into a triplet of words. Sorry in advance if I misunderstood.

There were 16 concern triplets because each of the eight blocks contained two concern triplets. We understand the confusion, because the number of blocks and triplets per block had not been discussed yet in that section. We decided to remove the sentence in question.

• Lines 249-251: Was this second question always presented, even if for example, participants reported to be on-task or externally distracted in the first question? Were the authors still analysing such cases?

The question was always presented, irrespective of the reported thought content. All reports were analyzed. Although reporting the stickiness of on-task focus or external distraction may be less intuitive for a participant, we think it is still possible to judge how sticky the focus on the task is, that is, how difficult it is to disengage from. Analyzing mind-wandering reports separately would have been an interesting analysis, but the low number of observations within and between participants precludes that.

• Figure 2: There’s a small typo in the third question (maters instead of matters).

Thank you. We corrected the typo.

• Line 263: A reference could be added for PsychoPy (the most recent study is Peirce, Gray, Simpson, et al., 2019, Behavior Research Methods).

Good suggestion. We added the reference to the text.

• Line 304-305: It seems like the fixed order would cue participants to know in advance when a thought-probe would be presented. This could be seen as an issue, do the authors have any opinions on this?

We think it is unlikely that participants could have predicted the thought probes. Although thought probes always followed five trials after a concern triplet, there were also two thought probes that were randomly inserted in the stimulus sequences. The position of the concern triplet, and consequently, thought probes in the stimulus sequences differed for each block. Furthermore, the stimulus words were different and the inserted concerns alternated between blocks. Because of all these differences between blocks, we do not think that participants recognized the pattern.

• Lines 322-324: How were these cut-offs decided?

We first set the cut-off to a reasonable value, which was 0.05 in this case. Subsequently, we visually inspected the segments of the data that would be removed with this cut-off. We concluded that this cut-off was sensitive enough to remove the jumps, but not so sensitive that it would also discard ‘normal’ increases in pupil dilation.

We added this explanation to the text in the manuscript (p. 16).
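For concreteness, a minimal sketch of such a jump check in R; the 0.05 cut-off is the value described above, while the vector name pupil_trace is purely illustrative:

```r
# Flag samples where the absolute sample-to-sample change in pupil size
# exceeds the cut-off, then plot the trace with the flagged samples
# highlighted for visual inspection.
cutoff <- 0.05
jumps  <- which(abs(diff(pupil_trace)) > cutoff) + 1  # +1: diff() shortens by one

plot(pupil_trace, type = "l", xlab = "Sample", ylab = "Pupil size")
points(jumps, pupil_trace[jumps], col = "red", pch = 16)
```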

• Lines 364-365: This decision seems a bit strange in the context of pupil size and arousal, as external distraction is likely arousing, the opposite of inalertness.

True, the predicted arousal for these attentional states is different. However, these states were not so relevant for our research question and, moreover, comprised such a small subset of events that individual analysis was not feasible. Hence, we grouped them into ‘Other’.

• Lines 367-368: What was the reason? What was the point of including a 6 point scale if it’s not used in the analyses?

We decided to use the six-point scale, because this scale had already been used by Unsworth and Robison (2016) to study the relationship between attentional states and pupil size. Using the same scale would allow us to compare our results to theirs.

• Lines 481-483: Apologies in advance as I’m not sure if I’ve missed it, but was this result also taking into account the overall proportion of mind-wandering reports? If not, this result would be confounded by that, as they also increase over time.

No, we did not. However, you are right that these results can be explained by the differences in the proportions of thought content. That said, we did not take the proportion of mind wandering into account there, because at that point in the analysis, it was our aim to investigate whether sticky thinking in general became more prevalent. Later in the paragraph ‘Relationship between attentional state and stickiness level’ and Fig. 7 we discussed the relationship between stickiness and thought content.

• Figures 5 and 6: The legend for the plot on the right could be a bit clearer (or is missing).

The legend on the left also applies to the plot on the right. However, we understand that this may not be obvious. We added a separate legend for the right plots in Fig 5 and 6.

• Lines 511-512: There is a serious and common problem in these designs, in which a thought-probe is presented always after a target stimulus. Following a mistake, participants might confabulate and report they were not on-task. This is a hard problem to overcome, but I think that this possibility should be discussed, given that the whole field relies on self-reports.

We agree. Thank you for reminding us to mention it in the manuscript. We discussed the possibility of confabulated responses in the materials section of the Methods section (p. 15).

New text:

As shown in Fig 3 (top), concern triplets were always followed by four go (no concern) trials, one no-go trial, and one thought-probe. The thought probe questions always immediately followed the no-go trial to ensure that the reported thought content and its stickiness could be reliably attributed to the trials before it. We are aware that a limitation of this design is that participants may confabulate their answer to the thought probe as being off-task when an error has been made on the no-go trial. Nevertheless, since this is the procedure used in many prior studies on which we based our work, we kept this design.

• Lines 516-519: Again, I am not 100% sure if I missed this, but was this analysis also taking into account the underlying proportion of mind-wandering thoughts?

We did not. Similar to our analysis described on lines 481-483 (previous comment), we did not intend to. However, after reading your comment we agree that it makes a lot of sense to do it here. Therefore, we re-analyzed the influence of stickiness on task performance (ACC, RT, RTCV) by fitting an LME model with both stickiness and attentional state as predictors (reference levels Neutral and SGT). We added the following text to the manuscript on p. 24:

“When attentional state was added to the LME model as an additional categorical factor alongside stickiness, we found that stickiness remained a significant predictor of no-go accuracy (χ²(2) = 82.99, p < .001), but not RT (χ²(2) = 1.14, p = .57) or RTCV (χ²(2) = 0.83, p = .66). This suggests that stickiness exerts a unique influence on no-go accuracy on top of attentional state. The model predicted that participants were 23% more accurate when self-generated thinking was non-sticky (β = +0.23, z = 5.35, p < .001) compared to neutral (intercept β = 0.48, z = -0.25, p = .80). Participants were 17% less accurate when self-generated thought was reported as sticky (β = -0.17, z = -4.63, p < .001).”
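To make this kind of model concrete, a minimal sketch in R with lme4; this is not our exact analysis code, and the data frame nogo and its column names are illustrative assumptions:

```r
library(lme4)

# Logistic mixed model for no-go accuracy, with stickiness and attentional
# state as categorical predictors and a by-subject random intercept.
m_full <- glmer(acc ~ stickiness + state + (1 | subject),
                data = nogo, family = binomial)

# Chi-square log-likelihood ratio test for the unique contribution of
# stickiness on top of attentional state (df = 2 for a 3-level factor).
m_state <- glmer(acc ~ state + (1 | subject), data = nogo, family = binomial)
anova(m_state, m_full)
```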

• Lines 534-535: Did the authors analyse baseline pupil sizes across attentional states with simpler models such as LMMs? Given the amount of previous research on this, I’m just wondering if the null result depends somewhat on the GAMMs.

Yes, we did, and the conclusions from the LMMs and GAMMs are the same. Specifically, we fitted LMMs that controlled for time-on-task effects and models that did not. The LMMs that did not account for time-on-task effects demonstrated a significantly smaller baseline pupil size for self-generated thoughts and sticky thoughts. However, when accounting for time-on-task effects these results disappeared. Model comparisons with likelihood ratio tests were in favor of the LMMs with time-on-task included. Since time-on-task turned out to be relevant, we decided to report the results of the GAMMs in the manuscript. GAMMs do not assume that the relationship between baseline pupil size and time-on-task is linear.
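A hedged sketch of this comparison with mgcv; variable names are illustrative, and our actual models used richer random-effects structures:

```r
library(mgcv)

# Model without a time-on-task term (subject coded as a factor; by-subject
# random intercept via a random-effect smooth).
m_no_time <- gam(baseline ~ state + s(subject, bs = "re"),
                 data = probes, method = "ML")

# Model with a (potentially nonlinear) smooth of time-on-task.
m_time <- gam(baseline ~ state + s(trial) + s(subject, bs = "re"),
              data = probes, method = "ML")

# Likelihood-based comparison of the two fits.
anova(m_no_time, m_time, test = "Chisq")
```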

• Lines 565-567: It seems a bit strange that the difference is smaller between sticky and non-sticky than between sticky and neutral. If these are opposite of a continuous state, one would expect the difference to be bigger between sticky and non-sticky. Do the authors have any comments on this?

As mentioned in the Discussion, we think that participants could not reliably classify their thought as either non-sticky or neutral. Instead, the classification was made based on the accuracy of the no-go trial prior to the thought probe.

• Figure 10: The legend here could be clearer. Maybe it’s better to use colored lines instead of different patterns. It could also be that it wasn’t very clear because the image included in the review was lo-res.

The plots were saved at 300 dpi, so we do not expect this to be an issue for the final article.

• Line 654: Minor typo, missing a “to” in “compared neutral/non-sticky”.

Thank you. We corrected the typo.

• General: Would it be possible to have a table or a description of the best fitting models for each analysis conducted?

We decided not to include the tables or descriptions in the manuscript; however, all the requested information can be found in the R markdown file on OSF. In addition, we uploaded an html version of the markdown file, which allows the reader to inspect all the code used for the analysis plus its output without the need to run the code. Link: https://osf.io/m6ujg/

• General: Did the authors measure and check the response times to the thought-probes? It sometimes happens that some participants will start to respond very quickly to those probes (because they want to finish the experiment faster). Such fast responses should be discarded, as it is debatable if the participants can introspect and report on their previous mental states so quickly.

Yes, we did. However, we did not check it prior to the analysis. Our thanks for pointing this out. In response to this comment, we checked the response times to the first (thought content) and second question (stickiness rating). We found that 55 responses were shorter than 1 second (M = 636 ms, min = 385 ms), accounting for 3% of all responses (N = 1631). Notably, almost all of these responses were ‘on-task’ and ‘neutral’ (53 out of 55). We decided not to remove these observations for two reasons. 1) We expect that participants can quickly report whether they were focused on the task. 2) It is only a very small number of observations; hence, it is unlikely that they would influence the results.
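For illustration, such a check might look as follows in R; the data frame probe_resp and its columns are assumptions, while the 1-second threshold is the one described above:

```r
# Responses to the thought probe questions faster than 1 second.
fast <- subset(probe_resp, rt_ms < 1000)

nrow(fast)                            # count of fast responses
mean(fast$rt_ms); min(fast$rt_ms)     # their mean and minimum RT
table(fast$content, fast$stickiness)  # which answer options they correspond to
```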

Reviewer 2

• One of the main concerns I have is the limited literature covered in the Introduction. Although I think the authors have written concisely, key literature is missing. For example, an expanded discussion on the relationship between sticky thought, possibly negative valence, and arousal might help; it is only briefly mentioned now. Part of this relates to the authors’ choice not to make a hypothesis about the direction of sticky thought with baseline activity. This choice is completely fine, but the review of relevant concepts is still missing to make this case. There are many studies regarding rumination, etc. More directly related may be Christoff et al.’s (2016) paper, which explicitly discusses concepts related to sticky thought, and some of Smallwood’s multi-dimensional experience sampling studies may also prove useful.

We reviewed more literature in the Introduction addressing the similarities and differences between the concept of sticky thought and rumination, constrained mind wandering etc (p. 3-5). In addition, we expanded the discussion on the relationship between sticky thought and pupil size by reviewing a study on rumination and a study on intrusive and negative thought (p.7-8).

• I also found the discussion section to be somewhat speculative without many clear theoretical implications. The authors attempt to explain some of their results with relatively shallow explanations – e.g., why they did not replicate past studies and why sticky thought was not influenced by personal concerns.

In response to this comment, we expanded the discussion on the absence of an effect on RTCV/RT (p.28) and why stickiness was not influenced by personal concerns (p. 32-33).

New text:

(p. 28): The absence of this effect was not an issue of power. Calculating Bayes factors separately for RT and RTCV demonstrated that the present study provides strong evidence for similar RT (BF01 = 37.3) and RTCV (BF01 = 26.2) across different degrees of sticky thinking. An explanation for the present results may lie in the relationship between RTCV and the degree to which participants were disengaged from the task (see (73)). Increases in RTCV have been associated with a state of “tuning out” (see (74)), where attention is partially allocated away from the task while awareness of the general task context remains. The transient disengagement from the task during tuning out results in slowing and speeding of response times, which could lead to higher RTCV. In this experiment, participants were likely to be more strongly disengaged from the task during sticky thoughts – a state of “zoning out” (74). According to Cheyne et al. (73), zoning out is associated with reactive and automatic responding to the task. It could be that the response time patterns resulting from automatic responding are not (measurably) different from responding during task focus.

(p. 32-33): Potentially, this may highlight that a sticky mode of thinking does not reliably result from processing personal concerns in a healthy population. Research has shown that people with a strong tendency to ruminate – a particular form of negative sticky thinking – have an attentional bias towards information that describes relevant, but negative, aspects of themselves (see e.g., (21,22)). Participants in our experiment potentially did not have a strong attentional bias towards their personal concerns. For future research, it may be interesting to investigate whether individuals with depression and/or individuals with high trait rumination engage more in sticky thought after exposure to their personal concerns.
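Regarding the Bayes factors mentioned in the p. 28 passage above, the BayesFactor package offers one way to obtain a BF01 quantifying evidence for the null; this is only a hedged sketch of one such computation, not necessarily the method we used, and the variable names are illustrative:

```r
library(BayesFactor)

# Paired Bayesian t-test on (hypothetical) per-subject mean RTs during sticky
# vs. non-sticky episodes; BF01 = 1 / BF10 expresses evidence for the null.
bf10 <- ttestBF(x = rt_sticky, y = rt_nonsticky, paired = TRUE)
bf01 <- 1 / extractBF(bf10)$bf
bf01
```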

• Sample size is a major concern given there is no a priori power analysis mentioned. The authors cite a previous study on sticky thought which may have been used to estimate an effect size. This field typically sees small (to medium) effect sizes, and these are not necessarily addressed by using lmer. The authors do mention computing BF in the discussion for one of the results that did not replicate, but I do not think this necessarily justifies the sample size for all the various relationships tested given the relatively low effect sizes seen for mind wandering (and other related dimensions) and performance in the literature.

We agree that small sample sizes can be an issue when effect sizes are small (such as here); however, we do think that our results are reliable. One of the key reasons we decided to use GAMMs was not only the possibility of modeling time series, but also that these models have frequently been used to model small effects. For example, the technique is popular in linguistics (e.g., Lõo, van Rij, Järvikivi & Baayen, 2016; Vogelzang, Hendriks, & van Rijn, 2016; Wieling, 2018), a field in which effect sizes are usually very small. To ensure that the GAMMs could detect small effects while not overfitting, we used models with complex random effects and an autoregressive model to combat autocorrelation (as suggested by van Rij, 2019). This makes the models very conservative, giving us confidence that the results are reliable and replicable.

Lõo, K., van Rij, J., Järvikivi, J., & Baayen, R. H. (2016). Individual differences in pupil dilation during naming task. In A. Papafragou, D. Grodner, D. Mirman, & J. Trueswell (Eds.), Proceedings of the 38th Annual Conference of the Cognitive Science Society (pp. 550–555). Austin, TX: Cognitive Science Society.

van Rij, J., Hendriks, P., van Rijn, H., Baayen, R. H., & Wood, S. N. (2019). Analyzing the time course of pupillometric data. Trends in Hearing, 23, 2331216519832483.

Vogelzang, M., Hendriks, P., & van Rijn, H. (2016). Pupillary responses reflect ambiguity resolution in pronoun processing. Language, Cognition and Neuroscience, 31(7), 876–885. doi: 10.1080/23273798.2016.1155718

Wieling, M. (2018). Analyzing dynamic phonetic data using generalized additive mixed modeling: A tutorial focusing on articulatory differences between L1 and L2 speakers of English. Journal of Phonetics, 70, 86–116. doi: 10.1016/j.wocn.2018.03.002
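A minimal sketch of such a conservative GAMM with an AR(1) error model, in the spirit of van Rij et al. (2019); the data frame terps and its columns are assumptions, not our actual code:

```r
library(mgcv)

# First estimate the residual lag-1 autocorrelation from a model without
# AR(1); stickiness is a factor, with per-condition smooths over trial time
# and factor smooths per subject as a (simplified) random-effects structure.
m0  <- bam(pupil ~ stickiness + s(time, by = stickiness) +
             s(time, subject, bs = "fs", m = 1), data = terps)
rho <- acf(resid(m0), plot = FALSE)$acf[2]

# Refit with the AR(1) model; start_event marks the first sample of each
# trial's time series.
m1 <- bam(pupil ~ stickiness + s(time, by = stickiness) +
            s(time, subject, bs = "fs", m = 1),
          data = terps, rho = rho, AR.start = terps$start_event)
summary(m1)
```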

• One of the other main concerns is the validity of measuring sticky thought. Figure 2 and the in-text wording do not match. How were participants trained/instructed about this question? The authors also note in the discussion that people may not be able to discern different levels of sticky thought using their method. While I appreciate the honesty, it seems like it could be a problem with the study design/materials used, since the research question was not to test a measure of sticky thought, yet the findings/interpretation depend on a reliable measure.

We think that participants are able to report on the stickiness of their thoughts with comparable accuracy to other thought probe responses. If this weren’t true, we would not be able to find significant relationships between task performance and sticky thought responses. We also rely on Christoff et al (2018), who showed that participants’ assessment of the extent to which their thoughts were constrained, a concept similar to our stickiness, correlated significantly with external reviewers’ assessments.

• I was surprised to see that the authors chose to treat the sticky thoughts as categorical from the outset and also their choice to bin into three categories. There also did not appear to be a consideration for the responses in other categories as part of their models. Moreover, although the authors did not dichotomize, Seli, Beaty, et al. recently made a case to avoid binning thought dimensions because it can artificially inflate rates.

We think it is questionable whether people can reliably rate the stickiness of their thought on a continuous scale. Therefore, measuring stickiness with a few categories is a good alternative. From the outset, we did not plan to bin the categories (for stickiness or for thought content). However, since some levels contained few observations, we needed to bin some categories to obtain reliable estimates from the GAMM analysis.

• Related, this also increases the number of comparisons made when binning. In general, a large number of tests were computed here. Was the number of tests considered in calculating significance?

No; however, the GAMMs and LMEs were fitted with maximal random-effects structures (supported by the data). This minimizes Type 1 error.

• During the statistical analyses section and at the outset of the results section, the authors mention time/timeseries as an important factor to consider. Based on how this shaped their entire analytical approach, I think it might be factored into the Introduction and theoretical motivation a bit earlier. The authors also bring up their goal to assess whether they could induce sticky thought, as if this was one of their main questions. However, the paper was not framed to address this as a main question from the Introduction, so I suggest making this more explicit from the outset.

In response to this comment, we added more information about the importance of time-on-task influences in the Introduction by discussing the results of Unsworth and Robison (2016) (see p. 5).

New text at p. 5: Thought probes are short self-report questionnaires that are embedded in a task to measure the content and dynamics of current thought (25,26). They have the advantage that experiences can be caught close to when they arise. Furthermore, they allow for repeated measures of experienced thought, making it possible to investigate changes in thought content over the course of the experiment. For example, Unsworth and Robison (27) used thought probes to investigate how different attentional states, such as mind-wandering and external distraction, correlated with task performance and pupil size measures in a sustained attention task. The researchers observed that task performance decreased and pupil size became smaller with time-on-task. Also, they found that reports of mind-wandering were more frequent as the experiment progressed. This demonstrates that time-on-task influences are important to consider when studying self-generated thinking.

Increasing the tendency to engage in sticky thought was not a research question in itself; however, it was of course the reason why we used the concern manipulation. To prevent confusion about this, we changed the wording of a specific line in the Results section (p. 20) to

“We first assessed whether embedding participant’s personal concerns influenced the tendency to engage in sticky, off-task, thinking.”

• Please report all descriptive statistics for all variables.

We added descriptive statistics (mean and standard deviation) for thought report frequencies (p. 20-21) and task performance (p.23) to the manuscript.

• Please also provide model data for all models constructed, in a table, and mention them in the text. For example, the model comparison for sticky vs non-sticky in terms of task-evoked pupil size is not mentioned.

To ensure the readability of the Results section, we decided against providing the model data in the text. However, we do provide a markdown document of the Results section on OSF. There, all the model data and output can be found along with the analysis code.

• Why were go and no go trials analyzed separately?

For two reasons. First, it is common to analyze the go and no-go trials separately in SART tasks. Therefore, doing it here as well would make it easier to compare the results with other studies. Secondly, go and no-go trials (may) require different cognitive processes to respond correctly. Specifically, no-go trials require inhibiting a habitual response.

• The authors also do not mention other related metrics assessing dynamic measures of thought such as freely-moving thought. The authors may also consider making a point about the fact that sticky thought appears to be different from task unrelated thought, making it an important dimension to study.

On p. 5, we added a more elaborate discussion on what sticky thought is and how it relates to related dynamic measures of thought (such as constrained mind-wandering).

• Figure 1’s caption mentions performance but it is not in the figure.

We removed the sentences in question from the caption.

Attachment

Submitted filename: Responses to ReviewersMvV.docx

Decision Letter 1

Myrthe Faber

8 Oct 2020

PONE-D-19-32012R1

Captivated by thought: "sticky" thinking leaves traces of perceptual decoupling in task-evoked pupil size

PLOS ONE

Dear Dr. Huijser,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

As you can see below, both reviewers positively evaluated your revised manuscript. Reviewer 2 still has a couple of important clarification questions about conceptual, methodological, and analytical details. I would like to invite a revision that addresses these points. If you address these points comprehensively in your revision, the manuscript should be acceptable for publication. Note, however, that the final decision of course depends on the quality and clarity of the invited revision.

Please submit your revised manuscript by Nov 22 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Myrthe Faber

Academic Editor

PLOS ONE


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: All my previous comments have been addressed in a satisfactory manner by the authors.

Congratulations for an interesting paper!

Reviewer #2: Thank you for the opportunity to review the revised manuscript. In general, I am supportive of this paper and think the authors have made substantial progress. At the same time, there are still unresolved issues that need more attention and clarification before recommending publication.

More details are still needed on the choice to group sticky thought into three categories. 1) What were the distributions that made the authors choose to group this way? 2) Can the authors confirm the model treated these as categorical rather than continuous? 3) The way the models are reported currently makes additional aspects unclear. For example, if they were categorical, a single chi-sq/p-value may not capture the full set of comparisons, because R defaults to a reference group for categorical variables, based on my understanding. Was the reference group sticky? More detailed model descriptions and results could help this confusion, as mentioned in my previous review.

The revision has improved the paper, but there are still some unresolved issues with terminology. I suggest the authors avoid using the term “constrained mind wandering” which may cause further confusion with respect to terminology. The papers cited for this term also do not use this term, and they may actually argue against the use of this term given their proposed definition (i.e. Christoff et al 2016). Rather, constrained “thinking” might be more appropriate given these recent, and currently unresolved debates. Moreover, the authors actually do assess what they call mind wandering in other places in a separate question, leading to further confusion.

The last sentence of the first paragraph is confusing – i.e. the one that references the differences in their methods and theory of constraints.

I still think it is relevant to mention in the methods how participants were trained/instructed to use the scale. This is critical information. Perhaps the answer is none, which would still be important to know.

I also do not think including important model comparisons on OSF is entirely sufficient. For example, p values and effect sizes are missing throughout some of the analyses. Some additional models may also be critically important to the main results, such as sticky vs non-sticky thought for evoked pupil size. However, I do appreciate the inclusion of open materials.

Why were only correct trials analyzed?

The authors have added some descriptive statistics. I suggest a table with the M and SD for all key variables, which are still missing.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Mahiko Konishi

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Dec 9;15(12):e0243532. doi: 10.1371/journal.pone.0243532.r004

Author response to Decision Letter 1


21 Nov 2020

More details are still needed on the choice to group sticky thought into three categories. 1) what were the distributions that made the authors choose to group this way;

Answer: We chose to reduce the number of categories from five to three because there was only a low number of observations for the extreme categories per subject. Grouping these sparse categories made our estimates more reliable. To make this more transparent, we added two tables to the manuscript: Table 1 reports the frequency of responses for each category of the attentional state question, and Table 2 reports the frequency of each response for the stickiness question (p. 18).

Added Tables:

Table 1. Distribution of responses to attentional state question. Average number of responses (out of N = 48) to each answer option on the attentional state question per subject (second column). Relative frequencies, expressed in percentages, are presented in the third column.

Answer option               Frequency (out of N = 48 per subject)   Percentage
On-task                     22.3                                     46.5
Task-related interference   10.5                                     21.8
Current concerns             4.9                                     10.1
External distraction         5.2                                     10.8
Mind wandering               3.9                                      8.0
Inalertness                  1.3                                      2.3

Table 2. Distribution of responses to stickiness question. Average number of responses (out of N = 48) to each answer option on the stickiness question per subject (second column). Relative frequencies, expressed in percentages, are presented in the third column.

Answer option     Frequency (out of N = 48 per subject)   Percentage
Very sticky        3.4                                      7.4
Sticky            12.4                                     25.9
Neutral           19.8                                     41.1
Non-sticky         7.9                                     16.5
Very non-sticky    4.5                                      9.4

2) can the authors confirm the model treated these as categorical rather than continuous?;

Answer: Stickiness level was used in the LME analyses as a categorical independent variable/predictor in all but one of the analyses. In one analysis, stickiness was treated as a continuous (dependent) variable, to assess the influence of attentional state on stickiness (see p. 34 for the latter). Inspired by your comment, we revisited this part of the analysis. Instead of a Gaussian LME, we fitted an ordered categorical LME. In this analysis, stickiness level is now treated as an (ordered) categorical dependent variable. The conclusions remained unaltered.
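As a hedged illustration, an ordered categorical mixed model of this kind can be fitted with, for example, the ordinal package; this is not our actual code, and the data frame and column names are assumptions:

```r
library(ordinal)

# Stickiness as an ordered factor (non-sticky < neutral < sticky), predicted
# by attentional state with a by-subject random intercept.
probes$stick_ord <- factor(probes$stickiness,
                           levels = c("non-sticky", "neutral", "sticky"),
                           ordered = TRUE)
m_ord <- clmm(stick_ord ~ state + (1 | subject), data = probes)
summary(m_ord)
```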

3) The way the models are reported currently makes additional aspects unclear. For example, if they were categorical, a single chi-sq/p-value may not capture the full set of comparisons, because R defaults to a reference group for categorical variables, based on my understanding. Was the reference group sticky? More detailed model descriptions and results could help this confusion, as mentioned in my previous review.

Answer: Although we see the added value of describing each fitted model in the Results section, the number of fitted models would make it unreadable. That is why we tried to carefully explain the entire data analysis procedure in the Methods section and include all the performed analyses in the associated R markdown notebook.

Indeed, we had not yet mentioned what is reflected by the chi-square/p-values for the analyses with categorical predictors. In response to your comment, we added a clarification to our explanation of the statistical analysis in the Methods section (p. 19):

“Statistical significance of individual predictors in the fitted LMEs was determined using chi-square log-likelihood ratio tests, testing the model including the predictor against an intercept-only model. Interactions were tested by comparing a model with the interaction against a model with only the main effects. Predictors in the LMEs were categorical. Consequently, the test statistics only reflect comparisons to a reference group of the categorical predictor(s). The reference group for attentional state was ‘on-task’, for stickiness of thought ‘neutral’, and for the current concerns the ‘no concern’ condition. Regression estimates (i.e., intercept and slopes) of individual LMEs were transformed back to the original scale to enhance interpretation. For Gaussian LMEs we did not determine p-values, but we report t-statistics to indicate statistical significance (|t| ≥ 2).” (p. 19)
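A small sketch of this reference-group coding in R, assuming hypothetical factor columns in a data frame probes:

```r
# Set the reference levels against which the other categories are tested;
# R's default treatment coding then contrasts each remaining level with these.
probes$state      <- relevel(factor(probes$state), ref = "on-task")
probes$stickiness <- relevel(factor(probes$stickiness), ref = "neutral")
probes$condition  <- relevel(factor(probes$condition), ref = "no concern")
```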

The revision has improved the paper, but there are still some unresolved issues with terminology. I suggest the authors avoid using the term “constrained mind wandering” which may cause further confusion with respect to terminology. The papers cited for this term also do not use this term, and they may actually argue against the use of this term given their proposed definition (i.e. Christoff et al 2016). Rather, constrained “thinking” might be more appropriate given these recent, and currently unresolved debates. Moreover, the authors actually do assess what they call mind wandering in other places in a separate question, leading to further confusion.

Answer: Thank you for the critical analysis of the terminology. We read the cited papers again and agree with you that it is more appropriate to speak about constrained thinking instead of constrained mind wandering. In our manuscript, we considered stickiness as an independent dimension of thought alongside attentional state. Hence, sticky thought does not necessarily imply mind wandering, but could be any kind of thought. Similarly, Mills et al. (2018) concluded from their study that freedom of movement is independent of attentional state. Claiming that sticky thought is related to constrained mind wandering would create unnecessary confusion and conflict with the cited work.

New text: Finally, sticky thought is closely related to the concept of constrained thinking (10,11). Constrained thinking refers to an experience in which thoughts do not move freely but instead are focused on a narrow set of content. (p. 3)

The last sentence of the first paragraph is confusing – i.e. the one that references the differences in their methods and theory of constraints.

Answer: What we try to explain there is that our concept of stickiness refers to the experienced difficulty of dropping a current stream of thought, whereas constrained refers to the experience of having a stream of thought that is – deliberately or not – focused on a narrow set of content. We revised the sentence to make this clearer.

New text (p. 3): It is different from our concept of sticky thinking in the question that is posed to the participant—while sticky refers to the experience of the participant that it is difficult to drop the current stream of thought, constrained refers to participants’ experience of having a stream of thought that is – deliberately or not – restricted to a narrow set of content.

I still think it is relevant to mention in the methods how participants were trained/instructed to use the scale. This is critical information. Perhaps the answer is none, which would still be important to know.

Answer: The participants were not trained on how to use the thought probes, but they were shown the thought probe questions prior to the experiment, including instructions on how to report an answer. We revised the text on p. 14 to include this information.

New text: Following calibration and validation, the instructions for the experiment were presented on the screen. The instructions on how to perform the SART were presented first, including one example of a go and a no-go trial. Afterwards, participants were informed that they would be periodically asked to report their current thoughts. The questions for attentional state and stickiness of thought were presented on the screen, including the instructions on how to report their answer. The participants were not otherwise instructed or trained on how to use the thought probes. A short practice session followed the instructions, consisting of ten SART trials (including one no-go trial) and one thought probe. Participants were encouraged to ask questions when something was unclear.

I also do not think including important model comparisons on OSF is entirely sufficient. For example, p values and effect sizes are missing throughout some of the analyses. Some additional models may also be critically important to the main results, such as sticky vs non-sticky thought for evoked pupil size. However, I do appreciate the inclusion of open materials.

Answer: P-values are indeed not reported for some of the analyses. In most cases that is because the statistical technique used does not provide them (i.e., for the Gaussian LMEs and the time-series analyses with GAMMs). How we determined statistical significance for these techniques is explained in the Methods section. In other cases, we did not report individual p-values (and/or regression estimates) to prevent visual clutter, for example when all contrasts were significant, or when none were. However, to accommodate your concerns, we added the individual p-values (or t-statistics for Gaussian LMEs) for these latter cases.

New text (p. 23): Our results indicated that only the smooth terms for on-task and self-generated thought were significant (on-task: F = 8.00, p = .005; self-generated thought: F = 10.66, p = .001; task-related interference: F = 2.90, p = .09; other: F = 1.33, p = .31). Therefore, we can (only) conclude for on-task and self-generated thought that the number of reports of this type of thinking changed over time.

New text (p. 23): We then asked how stickiness of thought changed over the course of the experiment. We fitted an ordered-categorical GAMM to test how time-on-task influenced the likelihood of reporting having neutral, sticky, or non-sticky thoughts. For this analysis we included the reported answer options as an ordinal dependent variable (1 being non-sticky, 2 neutral, and 3 being sticky). Block number was included as a continuous predictor reflecting time-on-task. The results showed that the smooth term for block number was significant (χ² = 12.11, p < .001), indicating that the reported level of stickiness changed over the course of the experiment. To inspect how the likelihood of reporting the different levels of stickiness changed over time, we obtained the predicted probability estimates from the model and plotted these in Fig 6 (right).
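As a hedged sketch, such an ordered-categorical GAMM can be fitted in mgcv with the ocat family; the column names here are assumptions, not our actual code:

```r
library(mgcv)

# stick_level coded 1 (non-sticky), 2 (neutral), 3 (sticky); a smooth of
# block number (8 blocks, hence k = 8) captures time-on-task, plus a
# by-subject random-effect smooth.
m_oc <- gam(stick_level ~ s(block, k = 8) + s(subject, bs = "re"),
            data = probes, family = ocat(R = 3))
summary(m_oc)

# Predicted per-category probabilities across blocks, excluding the
# subject random effect.
newd <- data.frame(block = 1:8, subject = probes$subject[1])
predict(m_oc, newd, type = "response", exclude = "s(subject)")
```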

New text (p. 24): To test whether distracted states were experienced as stickier, we fitted an ordered categorical (ordinal) LME predicting stickiness level by attentional state. The model indicated that all attentional states were reported as stickier than on-task (on-task: intercept β = -0.40 (transformed), t = -1.79; self-generated thought: β = +1.53 (transformed), t = 9.85; task-related interference: β = +1.54 (transformed), t = 11.07; other: β = +1.77 (transformed), t = 10.11).

New text (p. 25): As expected, we found that all ‘distracted’ attentional states were associated with a lower accuracy on no-go trials compared to on-task (χ²(3) = 216.08, p < .001; on-task: intercept β = 0.80, z = 5.92, p < .001; self-generated thought: β = -0.36, z = -9.47, p < .001; task-related interference: β = -0.42, z = -11.54, p < .001; other: β = -0.42, z = -10.08, p < .001).
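
For illustration, this accuracy model is a binomial (logistic) mixed model; a minimal lme4 sketch with hypothetical variable names is shown below, where the chi-square statistic comes from a likelihood-ratio comparison against a null model.

    library(lme4)

    # nogo: hypothetical data frame of no-go trials with binary 'correct'
    # (1 = response withheld, 0 = commission error)
    m_acc  <- glmer(correct ~ state + (1 | subject),
                    data = nogo, family = binomial)
    m_null <- glmer(correct ~ 1 + (1 | subject),
                    data = nogo, family = binomial)

    anova(m_null, m_acc)  # likelihood-ratio chi-square(3) for attentional state
    summary(m_acc)        # per-state log-odds estimates with z and p values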

We also added the results for the comparison between non-sticky and sticky thoughts for the evoked pupil response.

New text (p. 28): The difference in the evoked response between sticky and non-sticky thoughts was not significant at any timepoint.
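
One common way to test such a timepoint-wise difference in mgcv is a binary difference smooth; the sketch below illustrates the general pattern under assumed column names and is not necessarily the exact specification used in our analysis scripts.

    library(mgcv)

    # pupil: hypothetical long-format data frame with 'time' (ms from
    # stimulus onset), 'size' (baseline-corrected pupil), 'condition'
    # ("sticky" / "non-sticky"), and 'subject' (a factor)
    pupil$is_sticky <- as.ordered(pupil$condition == "sticky")

    m_diff <- bam(size ~ is_sticky +
                    s(time) +                  # reference curve (non-sticky)
                    s(time, by = is_sticky) +  # difference curve
                    s(time, subject, bs = "fs", m = 1),  # per-subject smooths
                  data = pupil)

    summary(m_diff)                         # tests the difference smooth
    plot(m_diff, select = 2, shade = TRUE)  # difference curve with 95% CI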

Why were only correct trials analyzed?

Answer: We analyzed only correct trials because we were interested specifically in how being on-task, or being engaged in self-generated or sticky thought, influenced task processing. Incorrect responses showed a very different task-evoked pupil response. As shown in the Figure below*, the peak in pupil size is much higher on incorrect trials than on correct trials. We are not sure what causes this higher peak, but it could reflect an error-monitoring process. Incorrect responses were relatively infrequent and therefore could not be reliably incorporated in the analyses. (p. 14)

*The R script to generate the Figure has been added to the OSF repository in the folder Analysis as 'TERP_incorrect_correct_nogo.Rmd'. Link to OSF repository: https://osf.io/m6ujg/.
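
For orientation, the figure essentially plots the mean baseline-corrected evoked response separately for correct and incorrect no-go trials. A minimal sketch of this kind of computation (with hypothetical column names; the actual script is the Rmd file referenced above) is:

    library(dplyr)
    library(ggplot2)

    # samples: hypothetical sample-level data frame with 'trial', 'time'
    # (ms relative to stimulus onset), 'pupil', and 'accuracy'
    evoked <- samples %>%
      group_by(trial) %>%
      mutate(pupil_bc = pupil - mean(pupil[time < 0])) %>%  # per-trial baseline
      ungroup() %>%
      group_by(accuracy, time) %>%
      summarise(mean_pupil = mean(pupil_bc), .groups = "drop")

    ggplot(evoked, aes(time, mean_pupil, colour = accuracy)) +
      geom_line() +
      labs(x = "Time from stimulus onset (ms)",
           y = "Baseline-corrected pupil size")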

The authors have added some descriptive statistics. I suggest a table with the M and SD for all key variables, which are still missing.

Answer: For the previous revision, we added the descriptive statistics for all key variables in the text, including the frequency of the different attentional states (p. 22), the frequency of the different stickiness levels (p. 22), no-go ACC (p. 25), no-go RT (p. 25), and no-go RTCV (p. 25). We believe that a table alongside the text would be redundant.

Decision Letter 2

Myrthe Faber

24 Nov 2020

Captivated by thought: "sticky" thinking leaves traces of perceptual decoupling in task-evoked pupil size

PONE-D-19-32012R2

Dear Dr. Huijser,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Myrthe Faber

Academic Editor

PLOS ONE


Acceptance letter

Myrthe Faber

27 Nov 2020

PONE-D-19-32012R2

Captivated by thought: “sticky” thinking leaves traces of perceptual decoupling in task-evoked pupil size

Dear Dr. Huijser:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Myrthe Faber

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    Attachment

    Submitted filename: Review_ Captivated by thought.pdf

    Attachment

    Submitted filename: Responses to ReviewersMvV.docx

    Data Availability Statement

    The data, materials, and analysis code are publicly available online at the Open Science Framework (link to project: https://osf.io/m6ujg/).

