Abstract
Research on human reasoning has both popularized and struggled with the idea that intuitive and deliberate thoughts stem from two different systems, raising the question of how people switch between them. Inspired by research on cognitive control and conflict monitoring, we argue that detecting the need for further thought relies on an intuitive, context-sensitive process that is itself learned.
Research on reasoning about moral dilemmas or logical problems has traditionally dissociated fast, intuitive modes of responding from slow, deliberate response strategies, often referred to as System 1 versus System 2. For example, when deciding whether to take the plane or the train, our System 1 might push us toward the plane because of its speed, while our System 2 could weigh its environmental impact and settle on the train. De Neys (in press) proposes a new working model wherein both intuitive and deliberate reasoning originate from initial “System 1” intuitions whose activations build up over time and can generate an uncertainty signal. Once this uncertainty signal reaches a certain threshold, deliberate thought, or “System 2,” is called upon to further resolve the reasoning problem. Here, we question the need for assuming a separate, deliberate system that is activated only upon uncertainty detection. While we are sympathetic to the idea that uncertainty is monitored and can trigger changes in the thought process, we believe these changes may result from adaptations in decision boundaries (i.e., deciding when to decide) or other control parameters, rather than from the recruitment of qualitatively different thought strategies.
Research on cognitive control often focuses on how goal-directed control processes help us correct, inhibit, or switch away from interfering action tendencies, such as those originating from overtrained associations (Diamond, 2013; Miller & Cohen, 2001). For example, when deciding between the train and the plane, our prior habit of flying might trigger the same decision at first, while our current goal to be more environmentally friendly should lead us to the train. Importantly, recent theories of cognitive control have emphasized that these goal representations and control processes should not be considered separate “higher-order” processes to be studied in isolation; rather, they are deeply embedded in the same associative network that hosts habits and overtrained responses. That is, goals and control functions can be learned, triggered, and regulated by the same learning principles that govern other forms of behavior (Abrahamse et al., 2016; Braem & Egner, 2018; Doebel, 2020; Lieder et al., 2018; Logan, 1988). For example, much like the value of simple actions, the value of control functions can be learned (Braem, 2017; Bustamante et al., 2021; Grahek et al., 2022; Otto et al., 2022; Shenhav et al., 2013; Yang et al., 2022). In this way, similar to De Neys’ suggestion that we can learn intuitions for the alleged System 1 and System 2 responses (or habitual versus goal-directed responses), we argue that people also learn intuitions for different control functions or parameters (see below).
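To make this idea concrete, the minimal sketch below (our construction, not a model from any of the cited papers) shows how the value of a control setting could be learned with the same delta rule used for simple actions: an agent learns, separately for two hypothetical contexts, whether a “fast” or a “careful” setting pays off. All payoffs, contexts, and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sketch: learning the value of control settings ("fast" vs.
# "careful") per context with a simple delta rule, exactly as one would
# learn the value of simple actions. All quantities are made up for the demo.
contexts = ["easy", "hard"]
settings = ["fast", "careful"]
values = {c: {s: 0.0 for s in settings} for c in contexts}
alpha = 0.1  # learning rate

def reward(context, setting, rng):
    """Toy payoff: being careful pays off in hard contexts, speed in easy ones."""
    p_correct = {("easy", "fast"): 0.90, ("easy", "careful"): 0.95,
                 ("hard", "fast"): 0.55, ("hard", "careful"): 0.85}[(context, setting)]
    time_cost = 0.1 if setting == "fast" else 0.3
    return (1.0 if rng.random() < p_correct else 0.0) - time_cost

for trial in range(5000):
    context = rng.choice(contexts)
    # softmax choice over control settings, based on their learned values
    q = np.array([values[context][s] for s in settings])
    p = np.exp(3 * q) / np.exp(3 * q).sum()
    setting = rng.choice(settings, p=p)
    r = reward(context, setting, rng)
    values[context][setting] += alpha * (r - values[context][setting])

print(values)  # "careful" ends up valued in hard contexts, "fast" in easy ones
```

After a few thousand trials, such an agent comes to favor the careful setting in hard contexts and the fast setting in easy ones, without ever recruiting a second system.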
One popular way to study the dynamic interaction between goal-directed and more automatic, habitual response strategies is through evidence accumulation models. In these models, decisions are thought to be the product of a noisy evidence accumulation process that triggers a response once a predetermined decision boundary is reached (Bogacz et al., 2006; Ratcliff et al., 2016; Shadlen & Shohamy, 2016). However, this accumulation of evidence does not qualitatively distinguish between the activation of intuitions and goal-directed or “controlled” deliberation. Instead, both processes start accumulating evidence at the same time, although potentially from different starting points (e.g., biased towards previous choices or goals) or at different rates (e.g., Ulrich et al., 2015). Depending on how high a decision maker sets their decision boundary, that is, how cautious versus impulsive they are, the goal-directed process will sometimes be too slow to shape the decision, or will merely slow it down. These models have been successfully applied to social decision-making problems (e.g., Hutcherson, Bushong, & Rangel, 2015; Son, Bhandari, & FeldmanHall, 2019).
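The following sketch illustrates this point with a single accumulator fed by a fast “automatic” drift and a slower-onset “goal-directed” drift, in the spirit of the superimposed diffusion processes of Ulrich et al. (2015). All parameter values are assumptions chosen for demonstration, not fitted estimates.

```python
import numpy as np

def simulate_trial(boundary, rng, dt=0.001, sigma=1.0,
                   v_auto=2.0, v_ctrl=-3.0, ctrl_onset=0.15, max_t=3.0):
    """One trial of a toy diffusion process in which an automatic component
    (drift v_auto, e.g. a habit favoring the plane, coded +) and a slower
    goal-directed component (drift v_ctrl, switching on after ctrl_onset
    seconds and favoring the train, coded -) feed the SAME accumulator.
    All numeric settings are illustrative assumptions."""
    x, t = 0.0, 0.0
    while abs(x) < boundary and t < max_t:
        drift = v_auto + (v_ctrl if t >= ctrl_onset else 0.0)
        x += drift * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        t += dt
    return ("habitual" if x > 0 else "goal-directed"), t

rng = np.random.default_rng(1)
for boundary in (0.5, 2.0):
    choices = [simulate_trial(boundary, rng)[0] for _ in range(2000)]
    p_goal = np.mean([c == "goal-directed" for c in choices])
    print(f"boundary={boundary}: P(goal-directed choice) = {p_goal:.2f}")
```

With a low boundary, the early automatic component often determines the choice before the goal-directed component has begun to contribute; with a higher boundary, the very same two processes yield mostly goal-directed choices. No second system is needed to produce the shift.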
In line with the proposal by De Neys (in press), we agree that competing evidence accumulation processes can generate an uncertainty signal (e.g., from directional deviations in drift rate) which, once it reaches a certain threshold, calls for adaptation, much as formalized in the seminal conflict monitoring theory (Botvinick et al., 2001), itself inspired by Berlyne (1960). However, in our view, resolving this signal does not require the activation of an independent system; rather, it induces controlled changes in parameter settings. Thus, unlike activating a System 2 that provides answers by using a different strategy, cognitive control changes the parameters of the ongoing decision process (for a similar argument, see Shenhav, 2017). For example, it could evoke a simple increase in the decision boundary, allowing the evidence accumulation process to take more time before a decision is made (e.g., Cavanagh et al., 2011; Frömer & Shenhav, 2022; Ratcliff & Frank, 2012). The second-order parameters that determine these adaptive control processes (e.g., how high one’s uncertainty threshold should be before calling for adaptations, or how much one should increase one’s boundary) do not need to be set in the moment, but can be learned (e.g., Abrahamse et al., 2016).
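A minimal sketch of this mechanism is given below: a Botvinick-style co-activation measure (the product of two accumulators’ activations, as in Botvinick et al., 2001) is read out partway through a race, and if it exceeds a threshold, the decision boundary is raised; nothing else about the process changes. The monitoring time, thresholds, and drift rates are illustrative assumptions, not claims about their true values.

```python
import numpy as np

def trial_with_monitoring(rng, base_boundary=0.8, boost=1.5,
                          conflict_threshold=0.2, dt=0.001, sigma=1.0,
                          drift_a=1.5, drift_b=1.2, monitor_at=0.2, max_t=3.0):
    """Toy sketch: two response accumulators race; a conflict signal (their
    co-activation product) is read out once, at `monitor_at` seconds, and if
    it exceeds a threshold the decision boundary is raised -- i.e., the system
    decides when to decide, rather than recruiting a separate System 2."""
    a = b = 0.0
    boundary = base_boundary
    t = 0.0
    while max(a, b) < boundary and t < max_t:
        a += drift_a * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        b += drift_b * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        a, b = max(a, 0.0), max(b, 0.0)            # activations stay non-negative
        if abs(t - monitor_at) < dt / 2:           # one-shot conflict check
            conflict = a * b                       # co-activation = conflict
            if conflict > conflict_threshold:
                boundary = base_boundary * boost   # become more cautious
        t += dt
    return ("A" if a >= b else "B"), t, boundary

rng = np.random.default_rng(2)
results = [trial_with_monitoring(rng) for _ in range(2000)]
raised = np.mean([r[2] > 0.8 for r in results])
rt_raised = np.mean([r[1] for r in results if r[2] > 0.8])
rt_base = np.mean([r[1] for r in results if r[2] <= 0.8])
print(f"boundary raised on {raised:.0%} of trials; "
      f"mean RT {rt_raised:.2f}s vs {rt_base:.2f}s without a raise")
```

Note that the boost factor and the conflict threshold are exactly the kind of second-order parameters that, as argued above, can themselves be learned from experience.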
Although we focused on the decision boundary because it maps most closely onto fast versus slow processing, we believe other process parameters can be altered too. For example, the response to uncertainty may require, or could be aided by, directed attention (Callaway, Rangel, & Griffiths, 2021; Jang, Sharma, & Drugowitsch, 2021; Smith & Krajbich, 2019), the memory of previous computations (Dasgupta & Gershman, 2021), learned higher-order strategies (Griffiths et al., 2019; Wang, 2021), or the parsing of a problem into different (evidence accumulation) subprocesses (Hunt et al., 2021). Moreover, a decision maker might even mentally simulate several similar decisions to evaluate their (un)certainty before making a response (e.g., by covertly solving the same problem multiple times; Gershman, 2021; see the sketch below). In sum, we argue that both intuitive and deliberative reasoning result from similar evidence accumulation processes whose parameter adjustments rely on immanent conflict monitoring and learning from previous experiences.
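As one possible illustration of the mental-simulation idea above (our construction, not a model from Gershman, 2021), a decision maker could covertly “re-solve” the same problem a handful of times and answer immediately only when the simulated outcomes are sufficiently consistent. The sample size and consistency criterion below are arbitrary assumptions.

```python
import numpy as np

def covert_samples(p_intuition, n_sims, rng):
    """Mentally 'replay' the same problem n_sims times; each covert run
    returns the intuitive answer with probability p_intuition, which stands
    in for how noisy the underlying evidence is on a given problem."""
    return rng.random(n_sims) < p_intuition

def respond(p_intuition, rng, n_sims=7, consistency_needed=0.85):
    sims = covert_samples(p_intuition, n_sims, rng)
    consistency = max(sims.mean(), 1 - sims.mean())
    if consistency >= consistency_needed:
        return "answer now", consistency
    return "keep deliberating", consistency  # e.g., raise the boundary, sample more

rng = np.random.default_rng(3)
for p in (0.95, 0.6):
    decisions = [respond(p, rng)[0] for _ in range(1000)]
    share = np.mean([d == "answer now" for d in decisions])
    print(f"p_intuition={p}: answered immediately on {share:.0%} of runs")
```

On easy problems (consistent covert runs) the intuitive answer is released immediately; on harder ones the same monitoring process calls for further sampling, again without invoking a qualitatively different system.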
Funding statement
This work was supported by an ERC Starting grant awarded to S.B. (European Union's Horizon 2020 research and innovation program, Grant agreement 852570), and grant R01MH124849 from the National Institute of Mental Health awarded to A.S.
Competing interest statement
The authors declare no competing interests.
References
- Abrahamse E, Braem S, Notebaert W, & Verguts T (2016). Grounding cognitive control in associative learning. Psychological Bulletin, 142, 693–728.
- Berlyne D (1960). Conflict, arousal, and curiosity. New York: McGraw-Hill.
- Bogacz R, Brown E, Moehlis J, Holmes P, & Cohen JD (2006). The physics of optimal decision making: A formal analysis of models of performance in two-alternative forced-choice tasks. Psychological Review, 113(4), 700.
- Botvinick MM, Braver TS, Barch DM, Carter CS, & Cohen JD (2001). Conflict monitoring and cognitive control. Psychological Review, 108(3), 624.
- Braem S (2017). Conditioning task switching behavior. Cognition, 166, 272–276.
- Braem S, & Egner T (2018). Getting a grip on cognitive flexibility. Current Directions in Psychological Science, 27(6), 470–476.
- Bustamante L, Lieder F, Musslick S, Shenhav A, & Cohen J (2021). Learning to overexert cognitive control in a Stroop task. Cognitive, Affective, & Behavioral Neuroscience, 21(3), 453–471.
- Callaway F, Rangel A, & Griffiths TL (2021). Fixation patterns in simple choice reflect optimal information sampling. PLoS Computational Biology, 17(3), e1008863.
- Cavanagh JF, Wiecki TV, Cohen MX, Figueroa CM, Samanta J, Sherman SJ, & Frank MJ (2011). Subthalamic nucleus stimulation reverses mediofrontal influence over decision threshold. Nature Neuroscience, 14(11), 1462–1467.
- Dasgupta I, & Gershman SJ (2021). Memory as a computational resource. Trends in Cognitive Sciences, 25(3), 240–251.
- Diamond A (2013). Executive functions. Annual Review of Psychology, 64, 135.
- Doebel S (2020). Rethinking executive function and its development. Perspectives on Psychological Science, 15(4), 942–956.
- Fine JM, & Hayden BY (2022). The whole prefrontal cortex is premotor cortex. Philosophical Transactions of the Royal Society B, 377(1844), 20200524.
- Frömer R, & Shenhav A (2022). Filling the gaps: Cognitive control as a critical lens for understanding mechanisms of value-based decision-making. Neuroscience & Biobehavioral Reviews.
- Gershman S (2021). What makes us smart: The computational logic of human cognition. Princeton University Press.
- Grahek I, Frömer R, Fahey MP, & Shenhav A (2022). Learning when effort matters: Neural dynamics underlying updating and adaptation to changes in performance efficacy. Cerebral Cortex.
- Griffiths TL, Callaway F, Chang MB, Grant E, Krueger PM, & Lieder F (2019). Doing more with less: Meta-reasoning and meta-learning in humans and machines. Current Opinion in Behavioral Sciences, 29, 24–30.
- Hunt LT, Daw ND, Kaanders P, MacIver MA, Mugan U, Procyk E, … & Kolling N (2021). Formalizing planning and information search in naturalistic decision-making. Nature Neuroscience, 24(8), 1051–1064.
- Hutcherson CA, Bushong B, & Rangel A (2015). A neurocomputational model of altruistic choice and its implications. Neuron, 87(2), 451–462.
- Jang AI, Sharma R, & Drugowitsch J (2021). Optimal policy for attention-modulated decisions explains human fixation behavior. eLife, 10, e63436.
- Lieder F, Shenhav A, Musslick S, & Griffiths TL (2018). Rational metareasoning and the plasticity of cognitive control. PLoS Computational Biology, 14(4), e1006043.
- Logan GD (1988). Toward an instance theory of automatization. Psychological Review, 95(4), 492.
- Miller EK, & Cohen JD (2001). An integrative theory of prefrontal cortex function. Annual Review of Neuroscience, 24(1), 167–202.
- Otto AR, Braem S, Silvetti M, & Vassena E (2022). Is the juice worth the squeeze? Learning the marginal value of mental effort over time. Journal of Experimental Psychology: General.
- Ratcliff R, & Frank MJ (2012). Reinforcement-based decision making in corticostriatal circuits: Mutual constraints by neurocomputational and diffusion models. Neural Computation, 24(5), 1186–1229.
- Ratcliff R, Smith PL, Brown SD, & McKoon G (2016). Diffusion decision model: Current issues and history. Trends in Cognitive Sciences, 20(4), 260–281.
- Shadlen MN, & Shohamy D (2016). Decision making and sequential sampling from memory. Neuron, 90(5), 927–939.
- Shenhav A (2017). The perils of losing control: Why self-control is not just another value-based decision. Psychological Inquiry, 28(2-3), 148–152.
- Shenhav A, Botvinick MM, & Cohen JD (2013). The expected value of control: An integrative theory of anterior cingulate cortex function. Neuron, 79(2), 217–240.
- Smith SM, & Krajbich I (2019). Gaze amplifies value in decision making. Psychological Science, 30(1), 116–128.
- Son JY, Bhandari A, & FeldmanHall O (2019). Crowdsourcing punishment: Individuals reference group preferences to inform their own punitive decisions. Scientific Reports, 9(1), 1–15.
- Ulrich R, Schröter H, Leuthold H, & Birngruber T (2015). Automatic and controlled stimulus processing in conflict tasks: Superimposed diffusion processes and delta functions. Cognitive Psychology, 78, 148–174.
- Wang JX (2021). Meta-learning in natural and artificial intelligence. Current Opinion in Behavioral Sciences, 38, 90–95.
- Yang Q, Xing J, Braem S, & Pourtois G (2022). The selective use of punishments on congruent versus incongruent trials in the Stroop task. Neurobiology of Learning and Memory, 193, 107654.