Probabilistic learning
Probabilistic learning paradigms commonly consist of two stimuli or actions to choose from, and depending on the underlying probabilities or contingencies, the choice leads to either a positive (e.g., reward, absence of punishment) or a negative outcome (e.g., punishment, absence of reward) with some variance. Thus, even after the associations are learned, the same (rewarding) outcome is not experienced on every trial because of the noise, or stochasticity, in the environment. For example, in such a paradigm, choosing one option could lead to a reward 80% of the time, whereas choosing the other leads to a reward only 20% of the time. The outcomes for the two options can be either perfectly anticorrelated or independent.

Reversal learning
Reversal learning paradigms are generally used to study cognitive flexibility and appear similar to probabilistic learning paradigms. However, they require participants to detect when the contingencies for the different options reverse after every few trials (e.g., a previously more rewarding option becomes less rewarding and vice versa). There are versions of reversal paradigms with deterministic and probabilistic outcomes. In deterministic reversal learning paradigms, the better option leads to reward 100% of the time when chosen, and surprising outcomes signal a reversal. In probabilistic reversal learning paradigms, a surprising outcome may indicate that the stimulus or action associated with reward most of the time has changed, or it may simply reflect stochasticity. The frequent contingency reversals increase the volatility of these task environments. Some of these paradigms introduce reversals only after certain criteria are met (e.g., choosing the more rewarding option at least three times in a row; Weiss et al., 2021).

Predictive inference
Instead of probabilities, a task environment might involve more continuous outcomes, such as points gained or the location of a hidden stimulus. The outcomes of actions vary around a mean value, which produces stochasticity. While estimating the underlying mean, learners should not update their expectations too much in response to these random fluctuations (e.g., the estimation and choice tasks in Jepma et al., 2020). However, there might also be changes in the mean value that are not due to stochasticity, in which case learners should update their predictions more quickly. In a task with continuous outcomes, volatility is reflected in the rate of change of the generative mean (e.g., the changepoint task; Nassar et al., 2016).

Risky decision-making
Finally, there are alternative paradigms that include an element of ambiguity (unknown probabilities) or risk (known probabilities) during learning. These experience-based decision-making tasks require participants to choose between risky and safe(r) options (Christakou et al., 2013; Jepma et al., 2022; Nussenbaum et al., 2022; Rodriguez Buritica et al., 2019). Risky options are generally operationalized as those with greater outcome variability, and consistently choosing such risky options can be either beneficial or detrimental in the long run depending on their average value (Jepma et al., 2022; Nussenbaum et al., 2022). Thus, both the average expected outcome values and the variability (akin to stochasticity) of the different options must be learned or estimated over time.
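
To make the distinction between stochasticity and volatility concrete, the following is a minimal Python sketch of a probabilistic reversal learning environment paired with a simple delta-rule learner. The 80/20 reward probabilities, the reversal every 20 trials, and the learning-rate and temperature parameters are illustrative assumptions rather than values taken from any of the cited studies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) parameters: 80/20 reward probabilities,
# a reversal every 20 trials, and a fixed learning rate.
REWARD_PROBS = np.array([0.8, 0.2])   # anticorrelated reward probabilities
TRIALS_PER_BLOCK = 20                 # volatility: how often contingencies reverse
N_TRIALS = 100
ALPHA = 0.3                           # delta-rule learning rate
BETA = 5.0                            # softmax inverse temperature

values = np.zeros(2)                  # learned value estimates for the two options
probs = REWARD_PROBS.copy()

for t in range(N_TRIALS):
    if t > 0 and t % TRIALS_PER_BLOCK == 0:
        probs = probs[::-1]           # reversal: the better option becomes the worse one

    # softmax choice between the two options based on current value estimates
    p_choose = np.exp(BETA * values) / np.exp(BETA * values).sum()
    choice = rng.choice(2, p=p_choose)

    # stochastic outcome: reward delivered with the chosen option's current probability
    reward = float(rng.random() < probs[choice])

    # delta-rule (Rescorla-Wagner) update of the chosen option's value
    values[choice] += ALPHA * (reward - values[choice])
```

Here the trial-to-trial noise in feedback reflects stochasticity, whereas the block-wise flipping of the reward probabilities reflects volatility.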
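
Similarly, the generative structure of a predictive inference (changepoint) task can be sketched as noisy outcomes drawn around a hidden mean that occasionally jumps. The noise standard deviation, hazard rate, and outcome range below are assumptions chosen for illustration, not the parameters of the task used by Nassar et al. (2016).

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative (assumed) parameters; not taken from the cited changepoint studies.
NOISE_SD = 10.0       # stochasticity: spread of outcomes around the generative mean
HAZARD = 0.1          # volatility: per-trial probability that the mean jumps
N_TRIALS = 200
ALPHA = 0.3           # fixed learning rate for a simple delta-rule predictor

mean = rng.uniform(0, 300)        # hidden generative mean (e.g., location on a line)
prediction = 150.0                # learner's current estimate of the mean

for t in range(N_TRIALS):
    if rng.random() < HAZARD:
        mean = rng.uniform(0, 300)        # changepoint: a new generative mean is drawn

    outcome = rng.normal(mean, NOISE_SD)  # noisy outcome around the current mean

    # delta-rule update: prediction error scaled by a fixed learning rate.
    # An adaptive learner would raise its learning rate after a likely changepoint
    # and lower it when deviations look like ordinary noise.
    prediction += ALPHA * (outcome - prediction)
```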