PLOS Biology. 2024 Jun 20;22(6):e3002686. doi: 10.1371/journal.pbio.3002686

Humans can infer social preferences from decision speed alone

Sophie Bavard 1,*, Erik Stuchlý 1, Arkady Konovalov 2, Sebastian Gluth 1
Editor: Thorsten Kahnt
PMCID: PMC11189591  PMID: 38900903

Abstract

Humans are known to be capable of inferring hidden preferences and beliefs of their conspecifics when observing their decisions. While observational learning based on choices has been explored extensively, the question of how response times (RT) impact our learning of others’ social preferences has received little attention. Yet, while choices alone can inform us about the direction of a preference, they reveal little about its strength. In contrast, RT provides a continuous measure of preference strength, with faster responses indicating stronger preferences and slower responses signaling hesitation or uncertainty. Here, we outline a preregistered orthogonal design to investigate the involvement of both choices and RT in learning and inferring others’ social preferences. Participants observed other people’s behavior in a social preferences task (Dictator Game), seeing either their choices, RT, both, or no information. By coupling behavioral analyses with computational modeling, we show that RT is predictive of social preferences and that observers were able to infer those preferences even when receiving only RT information. Based on these findings, we propose a novel observational reinforcement learning model that closely matches participants’ inferences in all relevant conditions. In contrast to previous literature suggesting that, from a Bayesian perspective, people should be able to learn equally well from choices and RT, we show that observers’ behavior substantially deviates from this prediction. Our study elucidates a hitherto unknown sophistication in human observational learning but also identifies important limitations to this ability.


When observing other people’s behavior, humans can learn about the observed person’s preferences by analyzing their choices. This study shows that observing someone else’s response time alone in a choice paradigm is sufficient to infer their preferences.

Introduction

Each person’s unique set of preferences shapes the decisions they make: by closely observing these decisions, one can gain valuable insights into their likes, dislikes, and priorities. Whether and how one can learn and understand the preferences of others from observing their choices has been well documented in the social and reinforcement learning literatures [1–7]. Yet, focusing solely on choices is often not sufficient to determine the strength of a person’s preference (i.e., the confidence with which the person has made their choice or how likely they are to make the same choice again). That is, a person would choose option A if they found it twice as valuable as option B, just as they would if they found option A 10 times as valuable. From the observer’s perspective, this leads to a many-to-one relationship between strengths of preference and choice, making it impossible to narrow down the strength of preference from choices alone (unless the observer can extrapolate from multiple observed choices). Fortunately, the decision-making process offers more than just choices as an output. It also generates response times (RTs), which have been found to decrease as the strength of preference increases. In other words, when faced with equally liked options, individuals tend to take more time to make their decisions. This negative relationship between RT and utility difference has been established in many value-based domains, including decisions under risk [8–14], intertemporal choices [13,15–17], food choices [18–27], happiness measurements [28], and social decision-making [13,16,29–32].

Despite the critical information on strength of preference provided by RT, their impact on learning about others’ preferences has received limited attention compared to the extensive study of learning from choices. It has recently been proposed that taking RT into account can improve predictions of future, unseen decisions, even when choices alone would fail to make correct out-of-sample predictions [14,22,29,33–36]. Most of these studies, however, do not use RT as information from which humans make inferences about someone else’s decision-making process, but rather as a tool to improve model fitting or model simulation in predicting future choices. On the other hand, recent literature suggests that human adults [24,31,37–39] and children [40] do take RT into account when estimating someone else’s hidden preference or competence, in paradigms where observers were informed about both the decision-maker’s choices and RT. Importantly though, all these studies use RT only as a measure supplementary to choices, not as the sole piece of information available to the observer.

On theoretical grounds, this notion has been taken even further: it has been proposed that RT alone (i.e., without observed choices) could be used to infer preferences and predict future choices. For example, Chabris and colleagues argued that RTs reveal key attributes of the cognitive processes that implement preferences in an intertemporal choice setting [15]. Konovalov and Krajbich used RT to infer an indifference point in risky, social, and intertemporal decision-making settings [13]. Schotter and Trevino showed that the most informative trial-based RT has out-of-sample predictive power for determining someone’s decision threshold in a social decision-making setting [29]. So, in principle, it should be possible to infer latent information or processes from RT alone, including preferences in value-based decisions. Yet, none of these studies has tested empirically whether individuals are capable of using RT information this effectively, that is, of learning someone else’s preference by observing their RT alone.

To answer this question, we propose a preregistered orthogonal design to investigate the role of both choices and RT in learning and inferring others’ social preferences. In our lab study, participants (n = 46, here referred to as observers) observed other people’s decision process in a Dictator Game [41,42], where the decision makers (N = 16, here referred to as dictators) were asked to choose between different monetary allocations between themselves and another person. Based on their behavior in the Dictator Game, decision makers can be ranked on a scale from selfish (choosing the allocation with the higher number of points for themselves) to prosocial (choosing the allocation with the lower number of points for themselves). Consequently, we assume that the dictators’ position on this scale reflects their preferred allocation: their ideal ratio of points for themselves versus the other person. Therefore, a decision problem with 2 options equally distant from the preferred allocation represents a choice between 2 equally liked allocations, resulting in high decision difficulty and the expectation that RT should be very long. Conversely, if the options’ distances to the preferred allocation are unequal (in other words, one option is much closer to the preference), this results in low decision difficulty and the expectation that RT should be very short. In this framework, we varied the amount of information provided to the observers: choice and RT information was either hidden or revealed to observers in a 2-by-2 within-subject design. Behavioral analyses confirmed our hypothesis, as observers were able to learn the dictators’ social preferences not only when they could observe their choices, but also when they could only observe their RT. To gain mechanistic insights into these observational learning processes, we developed a reinforcement learning (RL) model that takes both choices and RT into account to infer the dictator’s social preference. This model closely captured the performance and learning curves of observers in the different conditions. At the same time, recent studies have proposed (inverted) Bayesian inference as the optimal framework underlying the cognitive process of social learning [24,43–47], and (quasi-)optimal Bayesian learning has been reported in various fields such as reward-based learning [48,49] or multisensory integration [50]. Motivated by this work, we designed a benchmark Bayes-optimal (BO) model in which the observer’s belief about the dictator’s social preferences and choice processes is updated using Bayes’ rule on prior and current observations. By comparing this BO model to the RL model, we show that, while observers’ learning is close to optimal when they can observe choices, they substantially deviate from optimality when they can only observe RT, suggesting that the underlying mechanisms are better captured by our approximate reinforcement learning model. Overall, our study proposes an innovative approach to investigating the role of RT in learning and inferring preferences, identifies a new sophistication in human social inferences, and highlights the importance of considering a greater extent of the decision process when investigating observational learning.

Results

Experimental protocol

To test whether people learn someone else’s social preference when observing only their RTs, we designed a two-task experiment involving a variant of the Dictator Game [41,42]. In this variant, participants were asked to choose between 2 two-color circles, each representing a proportion of points allocated to themselves (“self”) and to another person (“other,” Fig 1A). For the “Dictator task,” we recruited a sample of 16 participants, who will be referred to as dictators, and we recorded both their choices and RT. For the “Observer task,” we recruited a sample of 46 participants. These participants, who will be referred to as observers, were asked to first complete a shortened version of the Dictator task, before observing the (previously recorded) decisions of all 16 dictators. For the observation phase, we used a 2 × 2 within-subject orthogonal design, manipulating the amount of information provided to the observers: the dictator’s choices were either revealed or hidden, and their RT either revealed or hidden (Fig 1B). Before observing the decisions of a dictator, observers were informed that they were about to observe a new person’s decisions, and whether they would see their choices, RT, both, or no information. They were asked to estimate the social preference of this person (their most preferred allocation), once before observing any decision and then after every 4 trials, for a total of 4 estimations over the 12 observed trials per dictator (estimation trials, Fig 1C). After observing the 12 trials, observers were asked to predict what this person would choose in 4 previously unseen decision problems (prediction trials, Fig 1C). After these 4 prediction trials, the instruction screen for a new dictator was presented. Crucially, all observed and predicted trials were decision problems that observers had completed for themselves in the Dictator Game task before observing the dictators.

Fig 1. Task design and validation of the social preference measure.


(A) Trial sequence of the Dictator Game part. (B) Orthogonal design of the observation part. Observers were presented with 4 successive conditions varying in the visibility of the choice information (with or without, represented with a black square around the chosen option) and the RT information (with or without, represented with a time interval between allocations onset and choice onset). Both allocations were displayed in all conditions. (C) Task design of the observation part. Observers were explicitly informed that they were about to observe a new dictator’s decisions, and in which condition. After observing all trials of a said dictator, they were asked to predict what this person would choose in previously unseen decision problems. (D) Regression coefficients with RT per trial as the dependent variable and trial difficulty as the independent variable, for the complete Dictator Game task performed by the dictators. The difficulty was estimated as the difference in subjective values between both allocations (Eq 2), using the preference fitted as a free parameter from RTs only (left), choices only (middle), or both (right). Int.: intercept. Diff.: difficulty. Points indicate individual average, shaded areas indicate probability density function, 95% confidence interval, and SEM. N = 16. (E) Dictators’ social preference fitted from the RTs alone as a function of their preference fitted from the choices alone, and their preference extracted from behavioral data in the Dictator Game task (Eq 2). ρ: Spearman’s coefficient. N = 16. ***p < 0.001. Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

Dictator Game results

We first ascertained that the social preference could, in principle, be learned in all conditions, i.e., that the dictator’s choices and RT would be good predictors of their social preference. In our task, their social preference refers to the same construct as their preferred allocation, defined earlier as their ideal ratio of points for themselves versus the other person. On each trial, we calculated subjective values s(∙) for each option (left and right), using the social preference estimated as a free parameter (see Materials and methods):

$s(\mathrm{left}) = 1 - (\mathrm{left} - \mathrm{Pref})^2$ (2)

where left (resp. right) is the objective value of the left (resp. right) option (i.e., the number of points allocated to “self”), Pref is the fitted social preference, and s(right) is defined analogously (see S1 Text for more details). Therefore, an option close to the social preference has a higher subjective value. We regressed RT onto the difference in subjective values between both options and found a negative effect of this difference in all conditions for all 16 dictators, suggesting that decision problems with options of similar subjective values produce longer RT (Fig 1D). In addition, we found a significant positive correlation between the preference estimated from choices only, by fitting a softmax rule, and the preference estimated from RT only, by fitting a drift-diffusion model (DDM) (Spearman’s ρ(14) = 0.83, p < 0.0001, Fig 1E), replicating previous results [13]. This suggests that both information types are not only sufficient on their own to make inferences about someone else’s social preference, but also lead to inferring the same preference. Together, these results show that, in our Dictator task, RT is a good predictor of social preference as captured by the subjective values (Eq 2).
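For illustration, the following R sketch shows this analysis pipeline: computing subjective values under a candidate preference and regressing RT onto the resulting difficulty measure. It is a minimal sketch, not the authors’ analysis script (the actual scripts are available in the GitHub repository); the data frame `d` and its column names are assumptions.

```r
# A minimal sketch, assuming a data frame `d` with one row per trial and
# columns `left`, `right` (objective "self" proportions) and `rt` (seconds),
# and a fitted preference `pref` in [0, 1].
subjective_value <- function(x, pref) 1 - (x - pref)^2   # Eq 2

fit_rt_on_difficulty <- function(d, pref) {
  # Difficulty proxy: absolute difference in subjective values
  d$diff_sv <- abs(subjective_value(d$left, pref) - subjective_value(d$right, pref))
  lm(rt ~ diff_sv, data = d)   # a negative slope = longer RTs on harder trials
}
```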

Observational learning results

After showing that social preference was a good indicator of how long it takes one to make a decision in our task, we turned to the main experiment. The main goal of this study is to investigate whether observers can effectively learn someone else’s social preference by observing their decisions, and more specifically either their RT alone, choices alone, or both (Fig 1D and 1E). To this end, we selected 12 trials per dictator to be observed by the observers (see Materials and methods and S2 Fig for more details on the trial selection). To assess learning during the task, observers were asked to estimate the dictator’s preference on several occasions: once before any observation, then after every 4 trials. First, in accordance with our preregistered analyses, we found significant correlations between the observers’ own preference and (1) their first estimation (before any observation; Spearman’s ρ(44) = 0.38, p = 0.0099, S2A Fig), as well as (2) their average estimation, depending on the amount of information provided to them (average estimation per condition; none: Spearman’s ρ(44) = 0.56, p < 0.0001; RT only: Spearman’s ρ(44) = 0.48, p = 0.0018; choice only: Spearman’s ρ(44) = 0.31, p = 0.038; both: Spearman’s ρ(44) = 0.27, p = 0.073, S2B Fig). Then, according to our main preregistered hypothesis, we analyzed observers’ accuracy in estimating the dictators’ preference (note that for readability, statistical tests of this paragraph are summarized in Table 1 rather than reported in the text). On average, observers were able to learn above the empirical chance level (see Materials and methods, t(45) = 22.59, p < 0.0001, d = 3.33), even in the “RT only” condition (Fig 2A). Surprisingly, observers’ accuracy was above the empirical chance level in the “none” condition as well (Fig 2A). However, the correlation between the dictators’ true preference and the observers’ last estimation was not significant in this condition (Spearman’s ρ(182) = 0.13, p = 0.071), whereas it was significant in all conditions where some information was provided (RT only: Spearman’s ρ(182) = 0.41, p < 0.0001; choice only: Spearman’s ρ(182) = 0.84, p < 0.0001; both: Spearman’s ρ(182) = 0.84, p < 0.0001). This suggests that observers learned to distinguish more prosocial from more selfish dictators in conditions with information but not in the “none” condition, where they mostly relied on their own preference (Figs 2B and S3). Furthermore, while accuracy was higher in the “both” condition than in the “RT only” condition, observers seemed to learn equally well in the “choice only” and “both” conditions (Fig 2A). The latter result contrasted with our predictions. All statistical analyses across conditions are reported in Table 1. Finally, to get a more fine-grained understanding of learning dynamics, we amended the preregistered analysis and performed a 4 × 4 ANOVA with the factors condition (“none,” “RT only,” “choice only,” “both”) × estimation number (1st, 2nd, 3rd, 4th). Consistent with our results so far, we found significant main effects of both condition (F(3,135) = 36.84, p < 0.0001, η² = 0.45, Huynh–Feldt corrected) and estimation number (F(3,135) = 80.71, p < 0.0001, η² = 0.64, Huynh–Feldt corrected) and, more interestingly, a significant interaction (F(9,405) = 13.58, p < 0.0001, η² = 0.23, Huynh–Feldt corrected), suggesting that observers learned faster in the “choice only” and “both” conditions (Fig 2A).
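A repeated-measures ANOVA of this kind could be run in R, for instance with the afex package, as in the minimal sketch below; the data frame `acc` and its column names are hypothetical, and this is not the authors’ analysis script.

```r
# A minimal sketch, assuming long-format data `acc` with one row per
# observer x condition x estimation number.
library(afex)

fit <- aov_ez(
  id     = "observer",    # participant identifier
  dv     = "accuracy",    # estimation accuracy per cell
  data   = acc,
  within = c("condition", "estimation")   # 4 x 4 within-subject design
)
summary(fit)   # afex reports sphericity corrections (e.g., Huynh-Feldt)
```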

Table 1. Pairwise comparisons of average accuracy per condition.

Emp: empirical, **p < 0.01, ***p < 0.001, Bonferroni-corrected for between-conditions pairwise comparisons. Data and analysis scripts underlying this table are available at https://github.com/sophiebavard/beyond-choices.

| Comparison | t-value | p-value | Effect size |
|---|---|---|---|
| none vs. emp. chance level | 6.89 | <0.0001*** | 1.02 |
| RT only vs. emp. chance level | 14.89 | <0.0001*** | 2.20 |
| choice only vs. emp. chance level | 20.04 | <0.0001*** | 2.95 |
| both vs. emp. chance level | 23.80 | <0.0001*** | 3.51 |
| RT only vs. none | 3.45 | 0.0073** | 0.64 |
| choice only vs. none | 8.37 | <0.0001*** | 1.37 |
| both vs. none | 8.71 | <0.0001*** | 1.51 |
| choice only vs. RT only | 4.54 | 0.00025** | 0.90 |
| both vs. RT only | 5.88 | <0.0001*** | 1.06 |
| both vs. choice only | 0.53 | 1.0 | 0.09 |

Fig 2. Behavioral results in the estimation phase.


(A) Observers’ accuracy for each estimation as a function of the condition (choice and RT visibility). Left: learning curves; right: average across all trials. Points indicate individual average, shaded areas indicate probability density function, 95% confidence interval, and SEM. N = 46. (B) Reported fourth and last estimation per observer per observed dictator, as a function of the true preference of each dictator, for each condition. N = 184. ρ: Spearman’s coefficient. In all panels, ns: p > 0.05, **p < 0.01, ***p < 0.001, Bonferroni-corrected for pairwise comparisons. Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

Extrapolation to unseen decisions

After having observed all 12 trials of a dictator, we asked the observers to predict what the dictator’s choices would be in a series of 4 previously unseen trials (see Materials and methods for more details on the trial selection). From here on, in contrast with the “choice only” and “RT only” conditions, we define “choice visibility” and “RT visibility” as orthogonal factors in our design, representing whether or not the choice (resp. RT) information was displayed in each condition; for example, choice visibility is set to 1 in the “choice only” and “both” conditions, and to 0 in the “RT only” and “none” conditions. We first looked at the accuracy, i.e., whether the observer chose the same option as the dictator. In line with the estimation phase results, we found a main effect of choice visibility on prediction accuracy (F(1,45) = 91.52, p < 0.0001, ηp² = 0.67), but no effect of RT visibility (F(1,45) = 2.07, p = 0.16, ηp² = 0.04) and no interaction (F(1,45) = 2.20, p = 0.15, ηp² = 0.05, Fig 3A). We then looked at the consistency, i.e., whether the observer’s choice was consistent with their last preference estimation of the dictator. We found a small main effect of choice visibility (F(1,45) = 5.61, p = 0.022, ηp² = 0.11), but no effect of RT visibility (F(1,45) = 0.041, p = 0.84, ηp² = 0.00) and no interaction (F(1,45) = 0.12, p = 0.73, ηp² = 0.00, Fig 3B). Overall, both the average accuracy and consistency were higher than the chance level of 0.5 (accuracy: t(45) = 23.11, p < 0.0001, d = 3.41; consistency: t(45) = 36.01, p < 0.0001, d = 5.31), suggesting that observers were able to extrapolate their learning of the dictators’ social preference to previously unseen decision problems, and that they did so in accordance with their last estimation.

Fig 3. Behavioral results in the prediction phase.


(A) Observers’ accuracy (correct choice rate, i.e., whether they chose the same allocation as the dictator) as a function of the condition (choice and RT visibility). (B) Observers’ consistency (choice rate, i.e., whether or not the chosen allocation is consistent with their last estimation for each particular dictator) as a function of the condition (choice and RT visibility). (C) Observers’ RT when predicting the dictators’ decision, as a function of the dictator’s RT for each condition. Top: average for all 16 dictators; middle: average for 8 similar dictators only; bottom: average for 8 dissimilar dictators only. (D) Observers’ RT when choosing for themselves, only in trials corresponding to the decisions they had to predict. Top: average over the corresponding trials of all 16 dictators; middle: average over the corresponding trials of the 8 similar dictators; bottom: average over the corresponding trials of the 8 dissimilar dictators. In all panels, points indicate individual average, shaded areas indicate probability density function, 95% confidence interval, and SEM. N = 46. ns: p > 0.05, ***p < 0.001, Bonferroni-corrected for pairwise comparisons. Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

Observers’ prediction speed mimics dictators’ decision speed

The analyses of observers’ accuracy when predicting decisions confirm that they efficiently learned the dictators’ social preferences and were able to use this information to infer which future decisions might be made in previously unseen contexts. Yet, the inspection of their choices alone does not provide much information about the underlying mechanisms and dynamics of how observers predict others’ decisions. To dig deeper into these mechanisms, we analyzed observers’ RT during the prediction phase. Unbeknownst to the observers, they always predicted 2 easy decisions (where the dictator’s RT was fast) and 2 hard decisions (where the dictator’s RT was slow, S1 Fig). We ran a generalized linear mixed model (GLMM), regressing the observers’ RT onto the independent variables: choice visibility (in the estimation phase), RT visibility (in the estimation phase), and trial duration (i.e., whether the dictator’s RT was short or long). We found a significant main effect of choice visibility (estimate = −0.26, SE = 0.086, t = −3.04, p = 0.0024, Table 2), suggesting that observers made overall faster predictions when the choice information had been available in the estimation phase. The main effect of RT visibility was also significant (estimate = 0.19, SE = 0.091, t = 2.05, p = 0.041, Table 2), suggesting observers were overall slower to predict when the RT information had been available in the estimation phase. For interaction effects, please refer to Table 2.

Table 2. Results from GLMM fitted on observers’ RT in the prediction phase.

The GLMM (generalized linear mixed model with Gamma distribution and identity link function) was fitted on the observers’ RT, with choice visibility in the estimation phase, RT visibility in the estimation phase, and trial duration (i.e., whether the dictator’s RT was short or long), as independent variables. Denotation: Du = duration (fast or slow), Ch = choice visibility (displayed or not), RT = RT visibility (displayed or not), ***p < 0.001, **p < 0.01, *p < 0.05. Data and analysis scripts underlying this table are available at https://github.com/sophiebavard/beyond-choices.

Predicting others’ decision:

| Effect | Estimate | Std. Error | t-value | p-value |
|---|---|---|---|---|
| Intercept | 1.54 | 0.11 | 14.54 | <0.0001*** |
| Duration | 0.32 | 0.086 | 3.71 | 0.00021*** |
| Choice visibility | −0.26 | 0.086 | −3.04 | 0.0024** |
| RT visibility | 0.19 | 0.091 | 2.05 | 0.041* |
| Du x Ch | 0.27 | 0.082 | 3.25 | 0.0012** |
| Du x RT | 0.37 | 0.090 | 4.01 | <0.0001*** |
| Ch x RT | −0.038 | 0.076 | −0.50 | 0.62 |
| Du x Ch x RT | −0.28 | 0.12 | −2.24 | 0.025* |

Similar dictators:

| Effect | Estimate | Std. Error | t-value | p-value |
|---|---|---|---|---|
| Intercept | 1.49 | 0.12 | 12.30 | <0.0001*** |
| Duration | 0.47 | 0.12 | 4.09 | <0.0001*** |
| Choice visibility | −0.14 | 0.10 | −1.33 | 0.18 |
| RT visibility | 0.28 | 0.12 | 2.32 | 0.020* |
| Du x Ch | 0.16 | 0.11 | 1.42 | 0.16 |
| Du x RT | 0.21 | 0.13 | 1.71 | 0.088 |
| Ch x RT | −0.055 | 0.10 | −0.52 | 0.60 |
| Du x Ch x RT | −0.21 | 0.17 | −1.25 | 0.21 |

Dissimilar dictators:

| Effect | Estimate | Std. Error | t-value | p-value |
|---|---|---|---|---|
| Intercept | 1.61 | 0.12 | 13.54 | <0.0001*** |
| Duration | 0.15 | 0.099 | 1.54 | 0.12 |
| Choice visibility | −0.38 | 0.12 | −3.16 | 0.0016** |
| RT visibility | 0.098 | 0.10 | 0.93 | 0.35 |
| Du x Ch | 0.35 | 0.11 | 3.16 | 0.0016** |
| Du x RT | 0.54 | 0.12 | 4.46 | <0.0001*** |
| Ch x RT | −0.046 | 0.10 | −0.45 | 0.65 |
| Du x Ch x RT | −0.30 | 0.17 | −1.80 | 0.073 |

Critically, we also found a main effect of trial duration (estimate = 0.32, SE = 0.086, t = 3.71, p = 0.00021, Table 2), suggesting that a choice set that elicited a long RT for the dictator also elicited a long RT for the observer (Fig 3C, top). This main effect of trial duration is particularly interesting as it suggests that observers put themselves in the shoes of the dictator and predicted the decision in line with the dictator’s perceived difficulty. Under this assumption, one would expect observers to show a long RT when predicting decisions that were hard for the dictator, even if the observer themselves found the decision to be easy (and vice versa). Importantly, however, this pattern should only emerge in the 3 conditions in which observers could learn inter-individual differences in social preferences (“both,” “choice only,” and “RT only”), but not in the “none” condition.

To test this hypothesis, we leveraged the fact that option sets in the “prediction” stage were a subset of option sets in the “self” stage. We then categorized all dictators based on how similar their social preferences were to each observer’s and performed the same regression as reported above on “prediction” RTs for the similar and dissimilar groups of dictators. As expected, observers’ prediction RTs were longer for trials on which similar as well as dissimilar dictators had responded slowly, in the “both,” “choice only,” and “RT only” conditions (Fig 3C, middle and bottom). In the “none” condition, however, this effect was only seen for similar dictators. In line with these patterns, the regression analyses revealed a significant main effect of duration for similar dictators (as the effect was present in all 4 conditions) but significant interactions of duration with both choice and RT visibility for dissimilar dictators (as the effect was not present in the “none” condition) (Table 2). These results are consistent with the notion that observers put themselves in the shoes of the dictator whenever they could learn the dictator’s individual social preferences. In the “none” condition, however, observers most likely relied on their own preferences to make predictions (in line with the findings of the estimation phase; S2B Fig). Thus, because easy versus difficult choice sets matched closely for similar but not dissimilar dictators, the duration effect on prediction RT was seen in the former but not the latter case.

To further substantiate this interpretation, we also applied the regression model to the “self” RT on the trials shared with both types of dictators. Here, we would expect the duration effect (short versus long) to be present for trials shared with the similar dictator in all conditions, but to be entirely absent for trials shared with the dissimilar dictator. Indeed, the duration effect was significant for the trials shared with similar dictators (estimate = 0.40, SE = 0.079, t = 5.01, p < 0.0001, Table 3 and Fig 3D, middle) but not for those shared with the dissimilar dictators (estimate = 0.066, SE = 0.037, t = 1.79, p = 0.074, Table 3 and Fig 3D, bottom).

Table 3. Results from GLMM fitted on observers’ RT in the Dictator Game task.

The GLMM (generalized linear mixed model with Gamma distribution and identity link function) was fitted on the observers’ RT when choosing for themselves in their Dictator Game task. Denotation: ***p < 0.001. Data and analysis scripts underlying this table are available at https://github.com/sophiebavard/beyond-choices.

Choosing for self:

| Effect | Estimate | Std. Error | t-value | p-value |
|---|---|---|---|---|
| Intercept | 1.19 | 0.090 | 13.23 | <0.0001*** |
| Duration | 0.23 | 0.043 | 5.23 | <0.0001*** |

Similar dictators:

| Effect | Estimate | Std. Error | t-value | p-value |
|---|---|---|---|---|
| Intercept | 1.19 | 0.089 | 13.35 | <0.0001*** |
| Duration | 0.40 | 0.079 | 5.01 | <0.0001*** |

Dissimilar dictators:

| Effect | Estimate | Std. Error | t-value | p-value |
|---|---|---|---|---|
| Intercept | 1.22 | 0.095 | 12.92 | <0.0001*** |
| Duration | 0.066 | 0.037 | 1.79 | 0.074 |

Together, these results all converge to suggest that (1) observers were able to extrapolate the learned social preference to predict decisions for someone else, even if this person had dissimilar social preferences; (2) if a trial was difficult for the dictator, it was also difficult to predict for the observer; (3) whether or not the decision problem was difficult for the observer themselves did not impact how difficult it was for them to predict the dictator. To conclude, in our task, observers were not only able to learn other people’s social preference, but they also applied this information to make decisions for this individual that matched the person’s preferences and decision dynamics, even though they would behave differently when choosing for themselves.

Computational formalization of the behavioral results

Behavioral analyses confirmed our hypothesis: trial-by-trial, observers were able to learn the dictators’ social preferences when they could observe their choices, but also when they could only observe their RT. To gain a more thorough understanding of the mechanisms underlying social preference learning on the basis of observing different features of the decision process, we developed a modified version of a well-established reinforcement learning model [51,52]. To infer the dictator’s social preference, the model takes both choice and RT information (if available) into account, as well as features of the choice options. At each trial t, the estimated preference P is updated with a delta rule:

$P_t = P_{t-1} + \alpha \cdot \delta_t$

where $\alpha$ is the learning rate and $\delta_t$ is a prediction error term, calculated as the difference between the outcome $O_t$ (defined below) and the current estimation:

$\delta_t = O_t - P_{t-1}$

The outcome $O_t$ depends on the type and the amount of information provided to the observer (RL model, see Materials and methods). Intuitively, when only the choice information is available, the outcome is computed as whether or not the chosen option was the more selfish one. When the RT information is available, it is used to categorize the decision as fast or slow. In case of observing a slow decision, the outcome is always computed as the midpoint between the objective values of both options (see S1 Text for more details). In case of observing a fast decision, the outcome depends on whether or not the choice information was displayed. If it was, the outcome is computed as whether or not the chosen option was the more selfish one. If not, the observer is assumed to believe that the option with the higher subjective value was chosen. Finally, when no information was displayed, the outcome was computed as the midpoint between the objective values of both options, implying that, in case of receiving no information, the observer is assumed to believe that the dictator was asked to make very difficult decisions (and thus decisions that would be diagnostic of their social preference).
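The following R sketch illustrates one way this update rule could be implemented. The outcome coding (selfish end of the preference scale coded as 1) and the handling of the inferred choice in the fast, choice-hidden case follow our reading of the description above, not the authors’ code; see Materials and methods and S1 Text for the exact specification.

```r
# A minimal sketch of the observational RL update described in the text.
rl_update <- function(P, alpha, left, right, chosen = NA, rt_fast = NA) {
  # P: current preference estimate in [0, 1]; left/right: objective "self"
  # proportions; chosen: objective value of the chosen option (NA if hidden);
  # rt_fast: TRUE (fast) or FALSE (slow) if RT is shown, NA if hidden.
  midpoint <- (left + right) / 2
  sv <- function(x) 1 - (x - P)^2                  # subjective value under current estimate

  if (is.na(chosen) && is.na(rt_fast)) {           # "none": assume a very hard decision
    O <- midpoint
  } else if (isFALSE(rt_fast)) {                   # slow decision: always the midpoint
    O <- midpoint
  } else if (!is.na(chosen)) {                     # choice visible (fast, or RT hidden)
    O <- as.numeric(chosen == max(left, right))    # 1 if the more selfish option was chosen
  } else {                                         # fast decision, choice hidden:
    inferred <- if (sv(left) >= sv(right)) left else right  # believe the higher-valued option was chosen
    O <- as.numeric(inferred == max(left, right))
  }
  P + alpha * (O - P)                              # delta rule (Eqs above)
}
```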

The model closely captures several key aspects of observers’ behavior. In particular, it matches observers’ accuracy in all conditions (Fig 4A and 4B), as well as their last estimation per dictator (S4C Fig). Besides matching accuracy, our model was able to reproduce the difficulty patterns (our best proxy for RT, which is not simulated by our model; Figs 4C and S4A). In addition, the RL model also captured observers’ choices in the prediction phase (S4B Fig). To compare the model and the empirical data of our study to an optimal benchmark, we designed a Bayes-optimal inference model (BO model) that learns the social preference by updating the posterior probability of the model’s parameters, given the available information (see Materials and methods). We found that, while observers’ learning is close to optimal when they can observe choices, they substantially deviate from optimality when they can only observe RT (BO model predictions versus behavioral last estimation: Spearman’s ρ(44) = 0.10, p = 0.52; RL model predictions versus behavioral last estimation: Spearman’s ρ(44) = 0.47, p = 0.0011; Fisher’s z = 1.89, p = 0.029; Fig 4D–4F). Indeed, while the BO model was able to match observers’ behavior when they predicted dictators’ decisions, it was unable to match observers’ accuracy in all conditions, in contrast to the RL model (S5 Fig). Together, these modeling results suggest that the computational mechanisms underlying RT-based observational learning are better captured by our approximate RL model.
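To convey the flavor of the BO benchmark, here is a deliberately simplified grid-approximation sketch in R for learning from choices alone. The softmax likelihood, the sensitivity parameter `beta`, and the preference grid are illustrative stand-ins; the actual BO model updates the posterior over the parameters of a DDM-based generative model (see Materials and methods).

```r
# A minimal sketch: grid-based Bayesian updating of a preference belief
# from a single observed choice.
grid  <- seq(0, 1, by = 0.01)                     # candidate preferences
prior <- rep(1 / length(grid), length(grid))      # uniform prior

bayes_step <- function(posterior, left, right, chose_left, beta = 10) {
  sv_left  <- 1 - (left  - grid)^2                # subjective values on the grid (Eq 2)
  sv_right <- 1 - (right - grid)^2
  p_left <- 1 / (1 + exp(-beta * (sv_left - sv_right)))  # softmax choice probability
  lik  <- if (chose_left) p_left else 1 - p_left
  post <- posterior * lik
  post / sum(post)                                # normalized posterior over preferences
}
# Example: posterior <- bayes_step(prior, left = 0.8, right = 0.4, chose_left = TRUE)
```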

Fig 4. Qualitative model comparison.


(A, D) Simulated data (colored dots) superimposed on behavioral data (colored curves), representing the accuracy in the estimation phase for the RL model (A) and the BO model (D) in each condition. Shaded areas represent SEM. N = 46. (B, E) RL model (B) and BO model (E) accuracy predictions as a function of behavioral accuracy in the estimation phase, for the last estimation of each participant in each condition. Dashed line represents identity. N = 184. (C, F) Estimated difficulty extracted from RL model (C) and BO model (F) predictions as a function of the estimated difficulty from behavioral data from observers and dictators, after the estimation phase, for trials from the prediction phase. Each point represents 1 average trial difficulty for each duration (fast/slow), for each condition, for each observer. N = 368. ***p < 0.001. Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

Discussion

Humans and other animals are known to learn not only by experiencing rewards and punishments themselves, but also by observing others’ actions and outcomes. On the one hand, this allows learning from punishments and losses without incurring these negative outcomes directly, which comes with obvious evolutionary benefits [7]. On the other hand, observing others can reveal information about their beliefs and preferences, which may be critical for future interactions [53]. So far, research on observational learning has focused on testing whether and how people learn from others’ choices but has largely ignored other sources of information. Here, we set out to fill this gap by studying the computational mechanisms of learning from observing (only) the speed with which decisions are made. We find that people are, indeed, capable of learning from observing RTs only, but that—contrary to previous assertions [24,31,37]—this ability falls short of an optimal Bayesian learner and is instead better described by an RL model.

In the Dictator Game, where one participant has the power to allocate money to another participant, the RT of the dictator can provide insights into their underlying social preferences. When individuals have a clear and strong preference for a particular allocation, they tend to respond quickly and assertively. However, when faced with a decision where their preferences are less well-defined, or when considering 2 options with similar appeal, individuals often exhibit longer RT, indicating hesitation or conflict in their decision-making process. This illustrates how RT can serve as a window into the underlying social preferences of individuals: RTs can be used as cues to infer other people’s social preferences. Their relevance extends beyond the choices individuals make, as RTs are intimately related to the cognitive decision-making process and reflect the complex interplay between preferences, beliefs, and social context. Building on previous studies, which either suggested RT to be an important source of information theoretically [13,15,29] or showed that humans do use RT to improve their predictions [24,31,37], we designed a task where participants observed someone else’s decisions and had to estimate their underlying preference and predict their future decisions. Combining a factorial design that systematically varied the available sources of information with repeated observation, estimation, and prediction of individual dictators allowed us to go beyond existing work in characterizing, in great detail, the computational mechanisms of observational learning from different decision processes.

First, we showed that the dictators’ RT negatively correlated with the difficulty of the trial, i.e., the subjective value difference between the 2 options. In other words, difficult decisions tend to take more time in the social decision domain as well.

Second, we found that observers were able to learn the dictators’ preference in all conditions where they had relevant information, even when they could only observe the dictators’ RT. Interestingly, compared to RT only, participants learned faster and better when they could only observe the dictators’ choices or when they could observe both choices and RT, but their accuracy did not differ between the latter 2 conditions. These results suggest that, in our task, participants used the RT information when no other piece of information was available, but disregarded RT when choice information was available. We cannot rule out a ceiling effect due to aleatoric uncertainty [54,55], already reached in the choice only condition: natural constraints, such as noise in the dictator’s responses or representational noise on the observer’s end, may have prevented RT information from improving performance beyond this limit when added on top of choice information. In any case, since this latter result is not in line with recent literature, which suggests that people sometimes use RT on top of choice-only information to improve their inferences and predictions [24,31,37], further research is needed to dig deeper into these mechanisms. For example, contrasting choice and RT as conflicting pieces of information would be more informative for answering this specific question, which was not the main goal of the current study.

Third, we found that participants were able to predict the dictators’ future decisions after having learned their preferences, reaching a prediction accuracy higher than chance level. However, the arguably most interesting finding with respect to these predictions was that participants’ RT patterns when predicting someone else’s decisions matched the other person’s more than their own (Fig 3C versus Fig 3D). This result strongly suggests that people are able to put themselves into someone else’s shoes when predicting their decisions.

Another interesting finding is that participants showed improvement in their social preference estimation when no information (neither choice nor RT) was displayed, apart from the 2 options available to the other person. We believe that this unanticipated behavioral pattern might reflect a form of higher-order inference, where participants were able to extract information from observing the given options alone. Therefore, when no choice or RT information was given, we assume that the participant believes that the other person was asked to make very difficult decisions (and thus decisions that would be diagnostic of their social preference). Although we implemented this idea of higher-order inference in our specification of the RL model for the “none” condition and obtained support for it in our modeling results (Fig 4A), future research should investigate this further.

Over the past decades, many cognitive neuroscience studies in the field of learning and decision-making have used computational modeling to shed light on how people learn and make decisions in social contexts. Current theories suggest that 3 strategies are at play in this process [56]: vicarious RL, action imitation, and inference about others’ beliefs and intentions (see [57] for a review). Of note, this distinction has been extensively discussed in developmental and comparative psychology, where it is also referred to as the “imitation versus emulation” distinction (see [1] for a review). In contrast to vicarious RL, where observers learn from others’ experienced outcomes, or action imitation, where observers learn from others’ actions, our task involves a more complex inference process about someone else’s hidden preferences. This framework usually assumes that observers update their beliefs about others’ goals and intentions in a Bayesian manner [24,43,44,47,58,59], combining their prior beliefs with evidence they get from observing others’ actions, both choices [45,46,57] and RTs [24,31,37]. To gain mechanistic insights into these observational learning processes, we compared such a Bayesian inference model against an RL model that takes both choices and RT into account to infer the dictator’s social preference. Instead of learning the value of options or actions, as in more conventional learning scenarios, the RL model seeks to learn the social preferences of others, in our case the preferred allocation of money in the Dictator Game. When only choices are available, this allocation is updated in accordance with the choice (selfish versus prosocial). When only RTs are available, the updating rule depends on the speed of the decision. In case of slow decisions, the midpoint of the 2 options is used for updating. In the case of fast decisions, the observed agent is assumed to have chosen the higher-valued option, which strengthens any existing belief about the agent’s preferred allocation. In our view, this implementation offers a cognitively plausible approximation that allows inferring social preferences through repeated observations of choices or RT. Accordingly, our model closely captured the performance and learning curves of observers in all the different conditions.

When comparing the RL model to a Bayesian inference model adapted to learn social preferences by updating the posterior distribution, qualitative model comparison suggests that, while our participants’ learning is close to optimal when they can observe choices, they substantially deviate from optimality when they can only observe RT. A potential reason why humans fall short of learning from RT in a Bayes-optimal way is the high computational complexity of doing so. The complete Bayesian solution requires one to possess an accurate generative model of the decisions and decision speed, such as a DDM that takes the preferred allocation, as well as the choice options, into account to inform the drift rate. Furthermore, the belief distributions of a total of 5 parameters from this generative model must be updated after each observation in an accurate manner. It is conceivable that humans simplify the learning process (akin to our proposed RL model) to reduce the computational complexity and avoid getting lost in a curse of (parameter) dimensionality. A second potential reason for suboptimal performance in the RT only condition is the need to perceive the decision speed accurately in order to classify an observed decision as either fast or slow. Incorrect classifications, or uncertainty in this regard, will slow down learning substantially, as they are likely to produce many erroneous inferences.

Taken together, our work deviates from previous literature by challenging the expectation that, from a Bayesian perspective, people should be able to learn equally well from choices and RTs. While our empirical results are in line with the Bayesian prediction on a qualitative level, they diverge substantially from it on a quantitative level.

Our present work builds on a growing literature suggesting that RT alone should be sufficient to produce an accurate estimation of someone’s preference [13,15,29]. Konovalov and Krajbich recently used a DDM without the choice data to estimate individual preferences using subjective value functions in 3 different settings: risky choice, intertemporal choice, and social preferences. Our study replicates their findings, as we were able to accurately estimate the DDM-based preference parameter from RT alone in the first sample of participants. We then took this idea a step further and showed that a second sample of participants were able to provide an accurate estimation of others’ social preference when they observed their RT alone. To the best of our knowledge, this is the first time that this has been empirically tested and validated. Notably, other studies have attempted to increase out-of-sample predictive power with other indices of information processing, such as eye movements [18,60–63] or computer mouse movements [64–68]. In the neuroimaging literature, attempts have been made to move beyond brain–behavior correlations and to predict behavior from brain activity without choices [69–76] (see [77] for a review). Nevertheless, unlike eye movements or neural data, RT are easy to collect from the experimenter’s point of view, and have the benefit of being directly accessible to the actual observer, making them a stronger candidate than many of the other variables mentioned above. Altogether, these and our findings point toward the richness of process data in helping to better understand and predict behavior.

An open question for future research is to elucidate the neural mechanisms that underlie the remarkable ability to learn from observing decision speed and to use this information for making predictions. Historically, brain activity tracking social inference computations was found in regions that are known to be part of the Theory of Mind network, such as dorsomedial prefrontal cortex, temporoparietal junction, and posterior superior temporal sulcus [6,45,78–80]. Nonetheless, as stated above, taking decision speed into account requires an accurate estimation of time passage, suggesting that brain regions related to time perception, such as the pre-supplementary motor area and the intraparietal sulcus [81], should play a critical role. Furthermore, our modeling indicates that a prediction error signal, which quantifies the degree of mismatch (i.e., surprise) between the anticipated and observed decision speed, should play a critical role in the RT-based updating process. Interestingly, a recent EEG study has identified such a surprise signal when participants categorized stimulus durations as being either fast or slow, and modeled this EEG signal as reflecting the distance of a diffusion particle from the anticipated threshold in a DDM-like model [82]. It is tempting to speculate that people compare the observed decision speed with their own expectations in a similar way and that the ensuing (neural) surprise signal drives the social observational learning process. Future research will need to test these predictions to further promote our understanding of how people make sense of other people’s behavior.

To conclude, by investigating the relationship between RT and social preferences in the Dictator Game, we aim to contribute to the existing literature on decision-making, social cognition, and economic behavior. Our findings shed light on the intricate interplay between RT, learning and social preferences, expanding our understanding of the mechanisms underlying human decision-making in social contexts.

Materials and methods

Ethics statement

The research was approved by the Ethics Committee of the Faculty of Psychology and Human Movement Sciences of the University of Hamburg (approval number 2022_019) and carried out following the principles and guidelines for experiments including human participants provided in the Declaration of Helsinki (1964, revised in 2013).

Preregistration

Our recruitment methods, task design, and procedures were preregistered on the Open Science Framework (https://osf.io/tz4dq) prior to the completion of data collection. The preregistration protocol included a within-subjects design with 2 factors (provided information: RTs and choices) and 2 levels (with/without; Fig 1B). A power analysis computed in G*Power [83] revealed that, in order to identify an effect of size 0.2 (small-to-medium Cohen’s f; ANOVA with repeated measures, within factors, 1 group, 4 measurements) with a power of 0.9, 46 participants should be recruited for this experiment, which uses a pure within-subject design. Preregistered hypotheses included (1) the average accuracy to be higher than chance in 3 out of 4 conditions; (2) the accuracy to increase with the amount of information provided to the participants; and (3) the accuracy to be positively correlated with their performance in the time perception task, especially in conditions where response times are displayed. For clarity of focus in our report, preregistered tests involving time perception have been described separately in S4 Text.

Participants

Dictator Game experiment

We recruited 16 participants from a student population at the University of Basel, Switzerland, via 2 internal participant recruitment platforms (one for psychology students who received course credits for participation, one for students of any field who received a monetary show-up fee of CHF 20 per hour). All participants gave written informed consent, and the study was approved by the local ethics committee (Ethikkommission Nordwest und Zentralschweiz).

Main experiment

We recruited 46 participants (38 females, 8 males, 0 N/A; aged 21.89 ± 4.05 years) from the pool of psychology students at the University of Hamburg via an internal participant recruitment platform. Five additional participants were excluded because they did not understand the task (e.g., they never changed the preference estimation over the whole task). The Ethics Committee of the Faculty of Psychology and Human Movement Sciences of the University of Hamburg approved the study and participants provided written informed consent prior to their inclusion. To sustain motivation throughout the experiment, participants were given a monetary bonus, whose value was determined by randomly selected choices that the participants made throughout the different phases of the task.

Behavioral tasks

Dictator Game experiment

After reading and signing the consent form, participants received written instructions explaining how the task worked and that their final payoff would be affected by their choices in the task. The instructions were then followed by a short training session of 4 trials, aiming at familiarizing the participants with the response modalities. In our task, point allocations were indicated by colored circles divided into a blue and a red segment. Participants were informed that the blue (resp. red) segment represented their own points (ranging from 100% to 0% of the circle) and the red (resp. blue) segment represented the points allocated to another anonymous person (ranging from 0% to 100% of the circle). Blue and red segments summed to 100%, and the color allocated to self/other was counterbalanced across participants (Fig 1). On each trial, 2 cues were presented on different sides (left/right) of the screen. The position of a given cue was randomized, such that a given cue was presented an equal number of times on the left and on the right. The point allocations were determined by increasing the “self” proportion in 10% steps. The 11 generated allocations were presented in all possible binary combinations (55 in total, not including pairs formed by the same allocation). Each pair of cues was presented 3 times, leading to a total of 165 trials. On each trial, small random noise was added to each allocation (drawn from a truncated normal distribution with fixed mean μ = 0 and σ = 0.02, bounded between −0.05 and 0.05). Participants were required to choose between the allocations by pressing one of 2 keys on a standard computer keyboard. The choice window was self-paced. After the key press, the cues disappeared and were replaced by an inter-trial fixation screen, whose duration randomly varied from 1 to 4 s. At the end of the experiment, a trial was randomly selected and participants received the “self” points corresponding to the allocation they chose on this trial. In addition, the “other” points from this trial were given to another participant of the same experiment. Hence, participants received 2 bonuses: the proportion of points they chose for themselves converted to money, and the points they received from another participant.
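The stimulus construction can be summarized in a few lines of R, as in the sketch below. Interpreting σ = 0.02 as the scale of the pre-truncation normal is our assumption, and the rejection-sampling helper is illustrative rather than taken from the task code.

```r
# A minimal sketch of the allocation pairs and per-trial noise.
allocs <- seq(0, 1, by = 0.1)                             # 11 "self" proportions in 10% steps
pairs  <- t(combn(allocs, 2))                             # all 55 unordered pairs
trials <- pairs[rep(seq_len(nrow(pairs)), times = 3), ]   # each pair 3 times: 165 trials

# Truncated normal noise (mean 0, sigma 0.02, bounded at +/- 0.05) by rejection sampling
rtrunc_norm <- function(n, m = 0, s = 0.02, lo = -0.05, hi = 0.05) {
  x <- rnorm(n, m, s)
  while (any(bad <- (x < lo | x > hi))) x[bad] <- rnorm(sum(bad), m, s)
  x
}
trials <- trials + matrix(rtrunc_norm(length(trials)), nrow = nrow(trials))
```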

Main experiment

The main experiment was divided into 2 parts: playing the Dictator Game and observing the Dictator Game. After reading and signing the consent form, participants received written instructions explaining how the task worked and that their final payoff would be affected by their choices in the task. The first part of the task was similar to the one described in the previous paragraph, with the exceptions that (1) the combinations of allocations were presented only once, leading to a total of 55 trials; and (2) only the bonus corresponding to their own points was paid out at the end of the experiment. In the second part of the task, participants were instructed that they would observe the decisions of other people who had performed the same (Dictator Game) task. After an instruction screen specifying which condition they were in (i.e., whether they would observe both choices and RT, only choices, only RT, or none), participants observed a total of 12 trials for each of the 16 dictators, in blocks of 4 dictators per condition. The order of the trials was not randomized within each dictator, but the order of the conditions was pseudo-randomized across participants, and the order of the dictators was pseudo-randomized so that each condition would display the widest range of social preferences that could be learned. For each dictator, participants were asked to estimate their social preference before observing anything, and after 4, 8, and 12 trials, leading to a total of 4 estimations per dictator. To do so, they were asked to move a tick on a slider, which simultaneously changed the visual proportion of an allocation displayed next to the slider (S8 Fig). After observing the 12 trials, participants were asked to indicate which allocation the dictator would have chosen in 4 previously unseen binary decision problems, presented as in the first phase of the experiment. At the end of the experiment, 1 estimation trial was randomly chosen, and an additional bonus was given to the participant, whose amount was proportional to their accuracy on this trial.

All experiments were programmed in Python using PsychoPy (www.psychopy.org).

Trial selection

In order to maximize the likelihood of observers learning the dictators’ social preferences trial-by-trial, we carefully selected which of the 165 trials would be displayed to the participants. To this end, we considered the trials which would be most informative, in terms of both choices and RT. We ran a linear regression on the RT with the subjective value distance as the independent variable (see Results section; S1A Fig):

$RT = b_0 + b_1 \cdot |s(\mathrm{left}) - s(\mathrm{right})|$ (1)

We categorized the trials into slow trials ($RT > b_0$; the preference is at the midpoint between the 2 available allocations), fast trials ($RT < b_0 + b_1 \cdot (\mathrm{left} - \mathrm{right})^2$; the preference is either between 0 and the most prosocial allocation, or between the most selfish allocation and 1), and uninformative trials (all the remaining trials; S1B Fig). Among the informative trials, we excluded all noisy trials in which dictators made an inconsistent choice, i.e., chose the allocation with the larger distance to their social preference. For each dictator in the estimation phase, we selected the 6 fastest trials and the 6 slowest trials whose options’ midpoint was closest to the true preference (Fig 3B). To ensure that the trial order would not impair learning, we fitted the social preference in all possible order combinations of all possible blocks of 4 trials each. We selected the trial order which resulted in the smallest difference between the fitted and the actual preference over all the conditions. Hence, the 12 trials of each dictator were presented in the same order for all participants. For the prediction phase, we selected the next 2 fastest trials and 2 slow “midpoint-optimizing” trials, which were presented in a random order for all participants.
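As a sketch of the categorization rule in R, assuming fitted coefficients b0 (intercept) and b1 (negative slope) from Eq 1; function and argument names are illustrative.

```r
# A minimal sketch of the fast/slow/uninformative trial categorization.
classify_trial <- function(rt, left, right, b0, b1) {
  if (rt > b0) {
    "slow"           # consistent with a preference at the options' midpoint
  } else if (rt < b0 + b1 * (left - right)^2) {
    "fast"           # consistent with a preference beyond one of the options
  } else {
    "uninformative"  # everything in between
  }
}
```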

Empirical chance level

We derived the empirical chance level for the estimation trials, i.e., the average accuracy one would reach by randomly guessing each dictator's preference. To do so, we drew 10,000 samples from a uniform distribution over possible allocations (ranging from 0 to 1) and calculated the corresponding accuracy; the accuracy averaged over samples represents the chance level for each dictator. The chance level averaged over dictators represents the empirical chance level (0.65).
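This Monte Carlo procedure can be summarized in a few lines. In the sketch below we assume, for illustration, that accuracy is defined as 1 minus the absolute distance between guess and true preference; the dictators' preferences are drawn at random here, whereas the reported value of 0.65 was computed from the 16 actual dictators.

```python
import numpy as np

rng = np.random.default_rng(0)

def empirical_chance_level(true_prefs, n_samples=10_000):
    """Average accuracy of uniform random guesses, assuming
    accuracy = 1 - |guess - true preference| (illustrative definition)."""
    guesses = rng.uniform(0.0, 1.0, size=n_samples)
    per_dictator = [np.mean(1.0 - np.abs(guesses - p)) for p in true_prefs]
    return float(np.mean(per_dictator))

# Hypothetical preferences for 16 dictators:
print(empirical_chance_level(rng.uniform(size=16)))
```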

Behavioral analyses

In the Dictator Game, we were interested in 2 variables reflecting the dictator's/participant's social preference: (1) the proportion of choices towards the more selfish allocation; and (2) the RT. In the estimation phase, we were interested in the accuracy of participants' responses, i.e., how close the estimated preference was to the dictator's actual preference. In the prediction phase, we were interested in 3 variables reflecting participants' strategy: (1) the accuracy, i.e., whether they selected the same allocation as the dictator did; (2) the (internal) consistency, i.e., whether they chose the allocation with the lowest distance to their last preference estimation; and (3) the RT.

For choice analyses, statistical effects were assessed using repeated measures analyses of variance (ANOVAs) with choice visibility (displayed or not) and RT visibility (displayed or not) as within-participant factors. Post hoc tests were performed using one-sample t tests. We report the t statistic, p-value (Bonferroni-corrected when applicable), and Cohen's d to estimate effect size. Given the large sample size (n = 46), the central limit theorem allows us to assume that our overall performance data are approximately normally distributed and to apply the corresponding properties, including sphericity assumptions, in our statistical analyses. For comparisons of correlations from dependent samples, we report Fisher's z test (z statistic and p-value). For ANOVA analyses, we report Levene's test for homogeneity of variance, the uncorrected statistics as well as the Huynh–Feldt correction for repeated measures ANOVA (when applicable), the F statistic, p-value, partial eta-squared $\eta_p^2$, and generalized eta-squared $\eta^2$ (when the Huynh–Feldt correction is applied) to estimate effect size.

For RT analyses, to avoid the statistical fallacies that arise from assuming normality and homoskedasticity for skewed data, we ran GLMMs on the winsorized RT (0.05th percentile), with a Gamma distribution for the response variable and an identity link function, and with trial duration (duration: fast or slow), choice visibility (infoCh: displayed or not; only when predicting others), and RT visibility (infoRT: displayed or not; only when predicting others) as within-participant factors [84]:

RT ~ duration * infoCh * infoRT + (1 + duration + infoCh + infoRT | observers)

To analyze the trials where participants chose for themselves, we only included the duration factor, as the choice and RT information of another person are not shown in this phase, and adding them to the GLMM did not significantly improve the fit (χ²(13) = 20.34, p = 0.087, S1 Table):

RT ~ duration + (1 + duration | observers)

We report the estimates, standard error, t statistic, and p-value. Post hoc tests were performed using Wilcoxon signed rank tests, for which we report the Z statistic and p-value.

All statistical analyses were performed using MATLAB (www.mathworks.com) and R (www.r-project.org).

Computational models

Estimating social preference based on all data

For the Dictator Game task, we calculated the subjective values s(∙) for each option (left and right) at each trial t using the social preference, which was estimated as a free parameter:

$s(\mathrm{left}_t) = 1 - (\mathrm{left}_t - P)^2$
$s(\mathrm{right}_t) = 1 - (\mathrm{right}_t - P)^2$, (2)

where left (resp. right) is the objective value of the left (resp. right) option (i.e., the proportion of points allocated to "self") and P is the estimated social preference (a subject-specific free parameter).
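As a minimal sketch, Eq 2 translates directly into code (the function name is ours; allocations and P are assumed to lie in [0, 1]):

```python
def subjective_values(left, right, P):
    """Subjective values of both allocations under Eq 2, where 'left' and
    'right' are the objective values (proportion of points for "self")
    and P is the social preference."""
    return 1.0 - (left - P) ** 2, 1.0 - (right - P) ** 2
```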

Drift diffusion model

To estimate social preferences in the "RT only" condition, we used a DDM in which the drift rate on every trial is a linear function of the difference in the subjective values of the 2 options. Intuitively, individual social preferences can be identified because longer RT should reflect lower drift rates and thus smaller subjective-value differences [24]. We therefore use DDM-based probability densities to estimate the preference parameter for each dictator, given the empirical distribution of RT. Because the decision is unknown, we maximize the RT likelihood function marginalized over both choice boundaries [13]:

$ll_{RT} = \sum_t \log\big(f(RT_t, choice_t = \mathrm{left} \mid b, \tau, v_t) + f(RT_t, choice_t = \mathrm{right} \mid b, \tau, v_t)\big)$, (3)

where f is the response time density function, $RT_t$ is the response time on trial t, $choice_t$ is the choice the dictator could have made on trial t, b is the DDM decision boundary, τ is the non-decision time, and $v_t$ is the drift rate on trial t, which depends on the difference in subjective values.
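The sketch below illustrates Eq 3 under stated assumptions: the WFPT density is approximated with a truncated large-time series (Navarro and Fuss [87] describe more accurate switching approximations), the starting point is assumed unbiased (w = 0.5), and a hypothetical `scale` parameter maps subjective-value differences onto drift rates.

```python
import numpy as np

def wfpt_density(t, v, a, w=0.5, n_terms=50):
    """First-passage density at the lower boundary of a Wiener process with
    drift v, boundary separation a, relative starting point w, and unit
    diffusion, via a truncated large-time series (cf. [87]). The fixed
    truncation is for illustration only."""
    if t <= 0:
        return 0.0
    k = np.arange(1, n_terms + 1)
    series = np.sum(k * np.exp(-(k**2) * np.pi**2 * t / (2 * a**2))
                    * np.sin(k * np.pi * w))
    return (np.pi / a**2) * np.exp(-v * a * w - v**2 * t / 2) * series

def rt_loglik(rts, value_diffs, b, tau, scale):
    """RT log-likelihood marginalized over both responses (Eq 3), where
    value_diffs holds s(left_t) - s(right_t) for each trial and the drift
    rate is assumed to be scale * value_diff."""
    ll = 0.0
    for rt, dv in zip(rts, value_diffs):
        v = scale * dv
        t = rt - tau
        f_left = wfpt_density(t, -v, b)   # 'left' taken as the upper boundary
        f_right = wfpt_density(t, v, b)   # 'right' as the lower boundary
        ll += np.log(f_left + f_right + 1e-12)
    return ll
```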

Choice-based softmax method

For the “choice only” condition, we estimate each dictator’s social preference with a softmax rule, where the probability of choosing the left option at trial t is a logistic function:

$\mathrm{Prob}_t(\text{choose left}) = \dfrac{1}{1 + e^{\beta_d \cdot (s(\mathrm{right}_t) - s(\mathrm{left}_t))}}$, (4)

where $\beta_d > 0$ is the inverse temperature parameter for dictator d. High temperatures ($\beta_d \to 0$) make all actions (nearly) equiprobable. Low temperatures ($\beta_d \to +\infty$) produce a greater difference in selection probability between actions that differ in their value estimates [52]. The social preference and the temperature are free parameters that can be estimated for each dictator individually by maximizing a likelihood function [13]:

$ll_{Ch} = \sum_t \log\big(\mathrm{Prob}_t(\text{choose left})\big) \cdot \mathbb{1}(choice_t = \mathrm{left}) + \log\big(1 - \mathrm{Prob}_t(\text{choose left})\big) \cdot \mathbb{1}(choice_t = \mathrm{right})$, (5)

where at each trial t, $choice_t$ is the choice made by the dictator and $\mathbb{1}(\cdot)$ is the indicator function. Notably, the close correspondence between the softmax (or logit) choice model and the DDM has been elaborated in previous work (e.g., [85]).
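A compact sketch of Eqs 4 and 5 (the vector names are ours; in practice the preference P and the temperature $\beta_d$ would be optimized jointly, e.g., with a numerical optimizer):

```python
import numpy as np

def choice_loglik(choices, s_left, s_right, beta):
    """Log-likelihood of observed dictator choices under the softmax rule
    (Eqs 4 and 5). 'choices' codes left = 1 and right = 0; s_left and
    s_right are trial-wise subjective values under a candidate preference P."""
    c = np.asarray(choices, dtype=float)
    p_left = 1.0 / (1.0 + np.exp(beta * (np.asarray(s_right) - np.asarray(s_left))))
    return float(np.sum(c * np.log(p_left) + (1.0 - c) * np.log(1.0 - p_left)))
```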

Learning social preference based on sequential observations

The goal of our learning models is to infer the observed dictator's social preference over trials and to choose the best (i.e., subjective-value maximizing) allocation in the prediction phase. We compared 2 alternative computational models: an adapted reinforcement learning model, which updates the estimated social preference with a delta rule, and a Bayes-optimal model, which accumulates the posterior likelihood over a set of predefined parameter prior distributions.

Reinforcement learning models

To model participants’ behavior, we designed 2 modified versions of the standard RL model [52]. In both models, the initial estimated preference P0 was included as a free parameter. At each trial t, the estimated preference P is updated with a delta rule [51]:

$P_t = P_{t-1} + \alpha \cdot \delta_t$, (6)

where α is the learning rate and δt is a prediction error term, calculated as the difference between the outcome Ot (defined below) and the current estimation:

$\delta_t = O_t - P_{t-1}$ (7)

The outcome Ot depends on the amount of information provided to the participant.

When both choices and RT were displayed, the trial was categorized as either fast or slow, depending on whether the trial RT was shorter or longer than the average RT observed so far for this dictator. If the trial was slow, the outcome was computed as the midpoint between the objective values of both options (proportion of points for "self"), reflecting the intuition that both options were likely equidistant from the preferred allocation. If the trial was fast, the outcome was computed as 1 or 0, depending on whether the chosen allocation was the more selfish or the more prosocial one:

$O_t = \begin{cases} \dfrac{\mathrm{left}_t + \mathrm{right}_t}{2} & \text{if } RT_t > \mathrm{mean}(RT_{1:t}) \\ 1 & \text{if } choice_t = \max\{\mathrm{left}_t, \mathrm{right}_t\} \\ 0 & \text{if } choice_t \neq \max\{\mathrm{left}_t, \mathrm{right}_t\} \end{cases}$ (8)

When only the choices were displayed, the outcome was computed as 1 or 0, depending on whether the chosen allocation was the more selfish or the more prosocial one:

$O_t = \begin{cases} 1 & \text{if } choice_t = \max\{\mathrm{left}_t, \mathrm{right}_t\} \\ 0 & \text{if } choice_t \neq \max\{\mathrm{left}_t, \mathrm{right}_t\} \end{cases}$ (9)

When only RT were displayed, the trial was categorized as either fast or slow as specified above. If the trial was slow, the outcome was computed as the midpoint between the objective values of both options (proportion of points for “self”). If the trial was fast, the outcome was computed as 1 or 0, assuming that the allocation with the highest subjective value s(∙), as given in Eq 2, was chosen:

$O_t = \begin{cases} \dfrac{\mathrm{left}_t + \mathrm{right}_t}{2} & \text{if } RT_t > \mathrm{mean}(RT_{1:t}) \\ 1 & \text{if } \operatorname{arg\,max}_{x \in \{\mathrm{left}_t,\, \mathrm{right}_t\}} s(x) = \max\{\mathrm{left}_t, \mathrm{right}_t\} \\ 0 & \text{if } \operatorname{arg\,max}_{x \in \{\mathrm{left}_t,\, \mathrm{right}_t\}} s(x) \neq \max\{\mathrm{left}_t, \mathrm{right}_t\} \end{cases}$ (10)

Intuitively, this implies that in the case of observing a fast decision, the observer is assumed to believe that the option with the higher subjective value was chosen and to update the social preference accordingly. When no information was displayed, the outcome was computed as the midpoint between the objective values of both options (proportion of points for “self”) at each trial:

$O_t = \dfrac{\mathrm{left}_t + \mathrm{right}_t}{2}$ (11)

Intuitively, this implies that when receiving no information, the observer is assumed to believe that the dictator was asked to make very difficult decisions (and thus decisions that would be diagnostic of their social preference). In the version of the RL model presented in the main text, the outcome was computed as a weighted sum of the choice-related and the RT-related information, with an additional weight parameter 0 < ω < 1:

$O_t = O_{Ch,t} \cdot (1 - \omega) + O_{RT,t} \cdot \omega$ (12)
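The outcome computation and the delta-rule update can be sketched as follows. This is one plausible reading of Eqs 6-12 (in particular, in the "both" condition we let the RT-related outcome follow the RT branch of Eq 8 and combine it with the choice-related outcome via Eq 12); all names are illustrative.

```python
def rl_outcome(left, right, choice=None, rt=None, mean_rt=None, s=None, omega=0.5):
    """Trial outcome O_t for the observational RL model (Eqs 8-12).
    choice: chosen allocation, or None if choices are hidden.
    rt, mean_rt: trial RT and running-mean RT, or None if RT is hidden.
    s: pair (s_left, s_right) of subjective values under the current estimate.
    omega: weight on the RT-related information in the 'both' condition."""
    midpoint = (left + right) / 2.0
    selfish = max(left, right)

    o_ch = None
    if choice is not None:                       # choice-related outcome (Eq 9)
        o_ch = 1.0 if choice == selfish else 0.0

    o_rt = None
    if rt is not None:                           # RT-related outcome (Eqs 8, 10)
        if rt > mean_rt:
            o_rt = midpoint                      # slow trial: options equidistant
        elif choice is not None:
            o_rt = o_ch                          # fast trial, choice visible
        else:
            inferred = left if s[0] >= s[1] else right   # assume best option chosen
            o_rt = 1.0 if inferred == selfish else 0.0

    if o_ch is not None and o_rt is not None:    # both visible (Eq 12)
        return (1.0 - omega) * o_ch + omega * o_rt
    if o_ch is not None:
        return o_ch
    if o_rt is not None:
        return o_rt
    return midpoint                              # no information (Eq 11)

def rl_update(P, O, alpha):
    """Delta-rule update of the estimated preference (Eqs 6 and 7)."""
    return P + alpha * (O - P)
```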

To fit the model to the data, maximum likelihood estimation was applied by minimizing the deviance between the data and the model. For estimation trials, the negative log-likelihood was computed as follows:

$ll_t = -\log\big(\phi(D_t; P_t, \sigma)\big)$ (13)

where $\phi(D_t; P_t, \sigma)$ is the value of a normal density at $D_t$ (the participant's actual estimation) with mean $P_t$ (the model's estimation) and standard deviation σ (fitted as a free parameter). For the prediction trials, the likelihood of the participant's choice was given by a softmax rule, whose negative logarithm entered the deviance:

$\mathrm{Prob}_t(\mathrm{chosen}_t) = \dfrac{1}{1 + e^{\beta_p \cdot (s(\mathrm{unchosen}_t) - s(\mathrm{chosen}_t))}}$, (14)

where $\beta_p > 0$ is the inverse temperature parameter for participant p and s(∙) is the subjective value of an allocation, computed as in Eq 2. To avoid local minima, model fitting was performed with 50 different initial parameter values, randomly drawn from prior distributions, which we took to be Beta(1.1,1.1) for the learning rate α, Gamma(1.2,5) for the inverse temperature $\beta_p$, and a fixed number for the standard deviation σ [86].
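A hedged sketch of this fitting procedure with random restarts (the deviance function, the bounds, and the shape/scale parameterization of the Gamma prior are our assumptions):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import beta as beta_dist, gamma as gamma_dist

rng = np.random.default_rng(0)

def fit_model(neg_loglik, n_starts=50):
    """Maximum-likelihood fitting with random restarts to avoid local minima.
    neg_loglik(params) should return the deviance for
    params = (alpha, beta_p, sigma); starting points follow the priors
    reported above, with sigma started at a fixed value."""
    best = None
    for _ in range(n_starts):
        x0 = [beta_dist.rvs(1.1, 1.1, random_state=rng),         # learning rate
              gamma_dist.rvs(1.2, scale=5.0, random_state=rng),  # inverse temperature
              0.1]                                               # fixed sigma start
        res = minimize(neg_loglik, x0, method="L-BFGS-B",
                       bounds=[(1e-3, 1.0), (1e-3, 100.0), (1e-3, 10.0)])
        if best is None or res.fun < best.fun:
            best = res
    return best
```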

We modeled participants’ choice behavior using a softmax decision rule representing the probability of a participant choosing the left allocation:

$\mathrm{Prob}_t(\text{choose left}) = \dfrac{1}{1 + e^{\beta_p \cdot (s(\mathrm{right}_t) - s(\mathrm{left}_t))}}$, (15)

where $\beta_p > 0$ is the inverse temperature parameter for participant p. High temperatures ($\beta_p \to 0$) make all actions (nearly) equiprobable. Low temperatures ($\beta_p \to +\infty$) produce a greater difference in selection probability between actions that differ in their value estimates [52].

Bayes-optimal model

Based on prior work suggesting that observers might infer other people's choice processes via Bayesian inference [24,43–47], we tested a model that assumes observers estimate the dictator's preference within such a Bayesian framework. More specifically, this framework suggests that observers infer the optimal parameters by maximizing the posterior distribution of the parameter set, given all the evidence collected so far. In other words, we designed a benchmark Bayes-optimal model which assumes that observers seek to identify the most likely set of parameters (of a model they assume to be the generative model of the decisions) for a given dictator, given the observations. To make such an inference, observers must have a generative model of the dictator's decision-making process (or a model that can be expected to come reasonably close to the true generative model, which itself is unknown). Following previous work [24], we assumed this model to be a DDM, which indeed provides an excellent account of both choices and RT in our Dictator Game task (S9 Fig). In this framework, the drift rate depends on the social preference and the choice options as specified above. On each trial, we computed the likelihood of the observed behavior given a set of parameters from the joint parameter space as follows.

When both choices and RT were displayed, the likelihood was computed as the probability density function of the Wiener first-passage time (WFPT) distribution [87], i.e., of the diffusion process given the observed choice and RT (Fig 1D, rightmost panel). When only choices were displayed, the likelihood was computed with a softmax function (Eq 4 and Fig 1D, middle panel). When only RT were displayed, the likelihood was computed as the probability density function of the WFPT distribution given the RT, integrated over both possible choices (Fig 1D, leftmost panel). When no information was displayed, the likelihood did not differ from the priors over the joint parameter space, which we took to be Beta(3.5,3) for the estimated preference 0 < P < 1, Gamma(1.2,5) for the temperature 0 < β < 100, Gamma(2,2) for the boundary separation 0.1 < α < 10.1, Normal(0,5) for the drift rate 0 < v < 20, and a uniform distribution for the non-decision time 0.1 < Ter < 0.5. The posterior distribution was then computed and used as the prior for the next trial.
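On a discretized joint parameter grid, this trial-by-trial updating reduces to multiplying the current prior by the trial likelihood and renormalizing. The sketch below, including the posterior-mean readout of the preference, is our illustration rather than the authors' implementation:

```python
import numpy as np

def bo_trial_update(prior, trial_lik):
    """One Bayes-optimal updating step over a discretized joint parameter
    grid. trial_lik holds, for every grid point, the likelihood of the
    observed evidence (WFPT density of choice and RT; softmax choice
    probability; WFPT density summed over both responses; or 1 everywhere
    when no information is shown). The posterior becomes the next prior."""
    posterior = prior * trial_lik
    return posterior / posterior.sum()

def preference_estimate(posterior, pref_axis=0, pref_values=None):
    """Illustrative readout: posterior mean of the preference parameter,
    marginalizing over all other parameters."""
    other_axes = tuple(i for i in range(posterior.ndim) if i != pref_axis)
    marginal = posterior.sum(axis=other_axes)
    if pref_values is None:
        pref_values = np.linspace(0.0, 1.0, marginal.size)
    return float(np.sum(pref_values * marginal))
```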

Supporting information

S1 Text. Trial selection prior to running the main experiment.

(DOCX)

pbio.3002686.s001.docx (32.4KB, docx)
S2 Text. Observers' own preferences impact their uninformed guesses.

(DOCX)

pbio.3002686.s002.docx (27.6KB, docx)
S3 Text. Qualitative model comparison favors the RL model.

(DOCX)

pbio.3002686.s003.docx (28.4KB, docx)
S4 Text. Time perception and Social Value Orientation score.

(DOCX)

pbio.3002686.s004.docx (28.2KB, docx)
S5 Text. Prediction phase’s GLMM sanity check.

(DOCX)

pbio.3002686.s005.docx (28.1KB, docx)
S1 Fig. Trial selection procedure and corresponding RT.

(A) Illustration of the single-peaked model regression with 2 choice options A and S, with A>S without loss of generality. (B) Proportions of all 165 trials performed by the dictators in the Dictator Game experiment, categorized using the single-peaked model. (C) Dictators’ average RT in the selected 6 fast trials (“short RT”) and 6 slow trials (“long RT”), as seen by the observers in each of the conditions. Points indicate individual average, shaded areas indicate probability density function, 95% confidence interval, and SEM. N = 46. Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

(PDF)

pbio.3002686.s006.pdf (156.1KB, pdf)
S2 Fig. Observers’ estimations as a function of their own preference.

(A) Observers’ average first estimation as a function of their own social preference. N = 46. (B) Observers’ average estimation per condition as a function of their own social preference. ρ: Spearman’s coefficient. N = 48. In all panels, ns: p > 0.05, *p < 0.05, **p < 0.01, ***p < 0.001. Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

(PDF)

pbio.3002686.s007.pdf (88.8KB, pdf)
S3 Fig. Additional behavioral results in the estimation phase.

Reported fourth and last estimation per dictator, averaged over observers, as a function of the true preference of each dictator, for each condition. ρ: Spearman’s coefficient. N = 16. In all panels, ns: p > 0.05, **p < 0.01, ***p < 0.001. Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

(PDF)

pbio.3002686.s008.pdf (59KB, pdf)
S4 Fig. RL model predictions.

(A) Estimated difficulty extracted from RL model predictions as a function of the estimated difficulty from behavioral data from observers and dictators, after the estimation phase, for trials from the prediction phase. Each point represents one average trial difficulty for each duration (fast/slow) for each observer, averaged over dictators and conditions. N = 92. (B) RL model predictions for the proportion of choices towards the left option in the prediction phase, as a function of the behavioral data. (C) RL model predictions for the fourth and last estimation per observer per observed dictator, as a function of the reported fourth and last estimation, for each condition. N = 184. ρ: Spearman’s coefficient. In all panels, ***p < 0.001. Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

(PDF)

pbio.3002686.s009.pdf (136.7KB, pdf)
S5 Fig. Additional qualitative model comparison.

Average accuracy for the last estimation predicted by the RL model (A) and BO model (B) for each condition, averaged over trials and dictators, as a function of the observers’ behavioral accuracy. In all panels, N = 46, ns: p > 0.05, *p < 0.05, **p < 0.01, ***p < 0.001. Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

(PDF)

pbio.3002686.s010.pdf (121.8KB, pdf)
S6 Fig. Qualitative model comparison for full model space.

Top: Simulated data (colored dots) superimposed on behavioral data (colored curves), representing the accuracy in the estimation phase for the main RL model (A), the basic RL model (B), the BO model with informative priors (C), and the BO model with uninformative uniform priors (D) in each condition. Shaded areas represent SEM. N = 46. Bottom: accuracy predictions of the main RL model (A), the basic RL model (B), the BO model with informative priors (C), and the BO model with uninformative uniform priors (D) as a function of behavioral accuracy in the estimation phase, for the last estimation of each participant in each condition. Dashed line represents identity. N = 184. Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

(PDF)

pbio.3002686.s011.pdf (251.2KB, pdf)
S7 Fig. Between-group comparisons in the estimation phase.

Subset of observers’ accuracy for each estimation as a function of the condition (choice and RT visibility). (A) Observers whose first condition was “none.” (B) Observers whose first condition was “RT.” (C) Observers whose first condition was “Ch.” (D) Observers whose first condition was “both.” (E) Observers whose last condition was “none.” (F) Observers whose last condition was “RT.” (G) Observers whose last condition was “Ch.” (H) Observers whose last condition was “both.” Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

(PDF)

pbio.3002686.s012.pdf (143.6KB, pdf)
S8 Fig. Visual representation of the estimation phase.

The figure shows the screen seen by observers when indicating what they thought the dictator's preference was, by dragging and dropping a red (resp. blue for counterbalanced observers) tick on a slider. The "< Continue with space bar >" line was only displayed after a first click had been made, to avoid perseveration effects. Translated from German for illustration purposes. Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

(PDF)

pbio.3002686.s013.pdf (20.4KB, pdf)
S9 Fig. DDM simulations on the Dictator Game task.

The DDM was fitted on the Dictator Game data for the 16 dictators, here represented in increasing order of their social preference (estimated from their behavioral choices). The DDM is able to match dictators' behavior both in terms of choices (top panels) and RT (bottom panels). Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

(PDF)

pbio.3002686.s014.pdf (47.7KB, pdf)
S10 Fig. Results of additional experiments.

(A) Observers’ accuracy in the time perception task as a function of the difficulty of the trial, i.e., the time interval between 2 stimuli. Points indicate individual average, shaded areas indicate probability density function, 95% confidence interval, and SEM. N = 46. (B) Observers’ post-task SVO score as a function of their pre-task SVO score and their social preference extracted from their choices in the Dictator Game (DG). Black dashed lines represent categorical boundaries: competitiveness/individualism/prosociality/altruism. Red dashed lines represent a change of category from pre- to post-task scores. SVO: Social Value Orientation scale; DG: Dictator Game; N = 46. Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

(PDF)

pbio.3002686.s015.pdf (94.8KB, pdf)
S1 Table. Results from GLMM fitted on observers’ RT in the prediction phase.

The GLMM (generalized linear mixed model with Gamma distribution and identity link function) was fitted on the observers' RT, with choice visibility in the estimation phase, RT visibility in the estimation phase, and trial duration (i.e., whether the dictator's RT was short or long) as independent variables. Denotation: Du = duration (fast or slow), Ch = choice visibility (displayed or not), RT = RT visibility (displayed or not), ***p < 0.001. Data and analysis scripts underlying this table are available at https://github.com/sophiebavard/beyond-choices.

(DOCX)

pbio.3002686.s016.docx (28.9KB, docx)

Acknowledgments

We thank Marie Habermann, Julia Hecht, and Anne Kaufmann for their help in data collection.

Abbreviations

BO

Bayes-optimal

DDM

drift-diffusion model

GLMM

generalized linear mixed model

RL

reinforcement learning

RT

response time

WFPT

Wiener first-passage time

Data Availability

All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials, and are available from Github repository https://github.com/sophiebavard/beyond-choices (https://doi.org/10.5281/zenodo.11178632). All custom scripts have been made available.

Funding Statement

SG is supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (Grant agreement No. 948545, https://cordis.europa.eu/project/id/948545). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.Wu CM, Vélez N, Cushman FA, Dezza IC, Schulz E, Wu CM. Representational Exchange in Human Social Learning. The Drive for Knowledge. Published online. 2022:169–192. [Google Scholar]
  • 2.Apps MAJ, Green R, Ramnani N. Reinforcement learning signals in the anterior cingulate cortex code for others’ false beliefs. NeuroImage. 2013;64:1–9. doi: 10.1016/j.neuroimage.2012.09.010 [DOI] [PubMed] [Google Scholar]
  • 3.Burke CJ, Baddeley M, Tobler PN, Schultz W. Partial Adaptation of Obtained and Observed Value Signals Preserves Information about Gains and Losses. J Neurosci. 2016;36(39):10016–10025. doi: 10.1523/JNEUROSCI.0487-16.2016 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Joiner J, Piva M, Turrin C, Chang SWC. Social learning through prediction error in the brain. NPJ Science Learn. 2017;2(1):1–9. doi: 10.1038/s41539-017-0009-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Najar A, Bonnet E, Bahrami B, Palminteri S. The actions of others act as a pseudo-reward to drive imitation in the context of social reinforcement learning. PLoS Biol. 2020;18(12):e3001028. doi: 10.1371/journal.pbio.3001028 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Charpentier CJ, Iigaya K, O’Doherty JP. A Neuro-computational Account of Arbitration between Choice Imitation and Goal Emulation during Human Observational Learning. Neuron. 2020;106(4):687–699.e7. doi: 10.1016/j.neuron.2020.02.028 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Burke CJ, Tobler PN, Baddeley M, Schultz W. Neural mechanisms of observational learning. Proc Natl Acad Sci U S A. 2010;107(32):14431–14436. doi: 10.1073/pnas.1003111107 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Busemeyer JR. Decision making under uncertainty: A comparison of simple scalability, fixed-sample, and sequential-sampling models. J Exp Psychol Learn Mem Cogn. 1985;11(3):538–564. doi: 10.1037//0278-7393.11.3.538 [DOI] [PubMed] [Google Scholar]
  • 9.Moffatt PG. Stochastic Choice and the Allocation of Cognitive Effort. Exp Econ. 2005;8(4):369–388. doi: 10.1007/s10683-005-5375-6 [DOI] [Google Scholar]
  • 10.Gabaix X, Laibson D. Bounded Rationality and Directed Cognition [Working Paper].
  • 11.Gabaix X, Laibson D, Moloche G, Weinberg S. Costly Information Acquisition: Experimental Analysis of a Boundedly Rational Model. Am Econ Rev. 2006;96(4):1043–1068. [Google Scholar]
  • 12.Alós-Ferrer C, Granić ÐG, Kern J, Wagner AK. Preference reversals: Time and again. J Risk Uncertainty. 2016;52(1):65–97. doi: 10.1007/s11166-016-9233-z [DOI] [Google Scholar]
  • 13.Konovalov A, Krajbich I. Revealed strength of preference: Inference from response times. Judgm Decis Mak. 2019;14(4):381–394. doi: 10.1017/S1930297500006082 [DOI] [Google Scholar]
  • 14.Alós-Ferrer C, Garagnani M. Strength of Preference and Decisions under Risk. Department of Economics—University of Zurich; 2022. Accessed 2023 Sep 4. Available from: https://econpapers.repec.org/paper/zureconwp/330.htm. [Google Scholar]
  • 15.Chabris CF, Laibson D, Morris CL, Schuldt JP, Taubinsky D. The allocation of time in decision-making. J Eur Econ Assoc. 2009;7(2):628–637. doi: 10.1162/jeea.2009.7.2-3.628 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Krajbich I, Bartling B, Hare T, Fehr E. Rethinking fast and slow based on a critique of reaction-time reverse inference. Nat Commun. 2015;6(1):7455. doi: 10.1038/ncomms8455 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Bulley A, Lempert KM, Conwell C, Irish M, Schacter DL. Intertemporal choice reflects value comparison rather than self-control: insights from confidence judgements. Philos Trans R Soc Lond B Biol Sci. 2022;377(1866):20210338. doi: 10.1098/rstb.2021.0338 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Krajbich I, Armel C, Rangel A. Visual fixations and the computation and comparison of value in simple choice. Nat Neurosci. 2010;13(10):1292–1298. doi: 10.1038/nn.2635 [DOI] [PubMed] [Google Scholar]
  • 19.Krajbich I, Rangel A. Multialternative drift-diffusion model predicts the relationship between visual fixations and choice in value-based decisions. Proc Natl Acad Sci U S A. 2011;108(33):13852–13857. doi: 10.1073/pnas.1101328108 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Krajbich I, Lu D, Camerer C, Rangel A. The Attentional Drift-Diffusion Model Extends to Simple Purchasing Decisions. Front Psychol. 2012;3. Accessed 2023 Sep 4. Available from: https://www.frontiersin.org/articles/10.3389/fpsyg.2012.00193. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Fisher G. An attentional drift diffusion model over binary-attribute choice. Cognition. 2017;168:34–45. doi: 10.1016/j.cognition.2017.06.007 [DOI] [PubMed] [Google Scholar]
  • 22.Clithero JA. Improving out-of-sample predictions using response times and a model of the decision process. J Econ Behav Organ. 2018;148:344–375. doi: 10.1016/j.jebo.2018.02.007 [DOI] [Google Scholar]
  • 23.Clithero JA. Response times in economics: Looking through the lens of sequential sampling models. J Econ Psychol. 2018;69:61–86. doi: 10.1016/j.joep.2018.09.008 [DOI] [Google Scholar]
  • 24.Gates V, Callaway F, Ho MK, Griffiths TL. A rational model of people’s inferences about others’ preferences based on response times. Cognition. 2021;217:104885. doi: 10.1016/j.cognition.2021.104885 [DOI] [PubMed] [Google Scholar]
  • 25.Gluth S, Sommer T, Rieskamp J, Büchel C. Effective connectivity between hippocampus and ventromedial prefrontal cortex controls preferential choices from memory. Neuron. 2015;86(4):1078–1090. doi: 10.1016/j.neuron.2015.04.023 [DOI] [PubMed] [Google Scholar]
  • 26.Gluth S, Kern N, Kortmann M, Vitali CL. Value-based attention but not divisive normalization influences decisions with multiple alternatives. Nat Hum Behav. 2020;4(6):634–645. doi: 10.1038/s41562-020-0822-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Weilbächer RA, Krajbich I, Rieskamp J, Gluth S. The influence of visual attention on memory-based preferential choice. Cognition. 2021;215:104804. doi: 10.1016/j.cognition.2021.104804 [DOI] [PubMed] [Google Scholar]
  • 28.Liu S, Netzer N. Happy Times: Measuring Happiness Using Response Times. Published online. 2023. doi: 10.2139/ssrn.4416789 [DOI] [Google Scholar]
  • 29.Schotter A, Trevino I. Is response time predictive of choice? An experimental study of threshold strategies. Exp Econ. 2021;24(1):87–117. doi: 10.1007/s10683-020-09651-1 [DOI] [Google Scholar]
  • 30.Cotet M, Krajbich I. Response Times in the Wild: eBay Sellers Take Hours Longer to Reject High Offers and Accept Low Offers. Published online. March 14, 2021. doi: 10.2139/ssrn.3804578 [DOI] [Google Scholar]
  • 31.Frydman C, Krajbich I. Using Response Times to Infer Others’ Private Information: An Application to Information Cascades. Manag Sci. 2022;68(4):2970–2986. doi: 10.1287/mnsc.2021.3994 [DOI] [Google Scholar]
  • 32.Hu J, Konovalov A, Ruff CC. A unified neural account of contextual and individual differences in altruism. O’Connell RG, Frank MJ, O’Connell RG, Hutcherson CA, editors. eLife. 2023;12:e80667. doi: 10.7554/eLife.80667 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Krajbich I, Oud B, Fehr E. Benefits of Neuroeconomic Modeling: New Policy Interventions and Predictors of Preference. Am Econ Rev. 2014;104(5):501–506. [Google Scholar]
  • 34.Spiliopoulos L, Ortmann A. The BCD of response time analysis in experimental economics. Exp Econ. 2018;21(2):383–433. doi: 10.1007/s10683-017-9528-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Alós-Ferrer C, Fehr E, Netzer N. Time Will Tell: Recovering Preferences When Choices Are Noisy. J Pol Econ. 2021;129(6):1828–1877. doi: 10.1086/713732 [DOI] [Google Scholar]
  • 36.Konovalov A, Ruff CC. Enhancing models of social and strategic decision making with process tracing and neural data. WIREs Cogn Sci. 2022;13(1):e1559. doi: 10.1002/wcs.1559 [DOI] [PubMed] [Google Scholar]
  • 37.Konovalov A, Krajbich I. Decision Times Reveal Private Information in Strategic Settings: Evidence from Bargaining Experiments. Econ J. Published online July 24, 2023:uead055. doi: 10.1093/ej/uead055 [DOI] [Google Scholar]
  • 38.Evans AM, van de Calseyde PPFM. The effects of observed decision time on expectations of extremity and cooperation. J Exp Soc Psychol. 2017;68:50–59. doi: 10.1016/j.jesp.2016.05.009 [DOI] [Google Scholar]
  • 39.Van de Calseyde PPFM, Keren G, Zeelenberg M. Decision time as information in judgment and choice. Organ Behav Hum Decis Process. 2014;125(2):113–122. doi: 10.1016/j.obhdp.2014.07.001 [DOI] [Google Scholar]
  • 40.Richardson E, Keil FC. Thinking takes time: Children use agents’ response times to infer the source, quality, and complexity of their knowledge. Cognition. 2022;224:105073. doi: 10.1016/j.cognition.2022.105073 [DOI] [PubMed] [Google Scholar]
  • 41.Kahneman D, Knetsch JL, Thaler RH. Fairness and the Assumptions of Economics. J Bus. 1986;59(4):S285–S300. [Google Scholar]
  • 42.Forsythe R, Horowitz JL, Savin NE, Sefton M. Fairness in Simple Bargaining Experiments. Games Econ Behav. 1994;6(3):347–369. doi: 10.1006/game.1994.1021 [DOI] [Google Scholar]
  • 43.Baker CL, Jara-Ettinger J, Saxe R, Tenenbaum JB. Rational quantitative attribution of beliefs, desires and percepts in human mentalizing. Nat Hum Behav. 2017;1(4):1–10. doi: 10.1038/s41562-017-0064 [DOI] [Google Scholar]
  • 44.Jern A, Lucas CG, Kemp C. People learn other people’s preferences through inverse decision-making. Cognition. 2017;168:46–64. doi: 10.1016/j.cognition.2017.06.017 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Collette S, Pauli WM, Bossaerts P, O’Doherty J. Neural computations underlying inverse reinforcement learning in the human brain. Elife. 2017;6:e29718. doi: 10.7554/eLife.29718 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Diaconescu AO, Mathys C, Weber LAE, Daunizeau J, Kasper L, Lomakina EI, et al. Inferring on the Intentions of Others by Hierarchical Bayesian Learning. PLoS Comput Biol. 2014;10(9):e1003810. doi: 10.1371/journal.pcbi.1003810 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Lucas CG, Griffiths TL, Xu F, Fawcett C, Gopnik A, Kushnir T, et al. The Child as Econometrician: A Rational Model of Preference Understanding in Children. PLoS ONE. 2014;9(3):e92160. doi: 10.1371/journal.pone.0092160 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Behrens TEJ, Woolrich MW, Walton ME, Rushworth MFS. Learning the value of information in an uncertain world. Nat Neurosci. 2007;10(9):1214–1221. doi: 10.1038/nn1954 [DOI] [PubMed] [Google Scholar]
  • 49.McGuire JT, Nassar MR, Gold JI, Kable JW. Functionally Dissociable Influences on Learning Rate in a Dynamic Environment. Neuron. 2014;84(4):870–881. doi: 10.1016/j.neuron.2014.10.013 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Ernst MO, Banks MS. Humans integrate visual and haptic information in a statistically optimal fashion. Nature. 2002;415(6870):429–433. doi: 10.1038/415429a [DOI] [PubMed] [Google Scholar]
  • 51.Rescorla RA, Wagner AR. A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. Classical conditioning II: Current research and theory. 1972;2:64–99. [Google Scholar]
  • 52.Sutton RS, Barto AG. Reinforcement Learning: An Introduction. IEEE Trans Neural Netw. 1998;9(5):1054–1054. doi: 10.1109/TNN.1998.712192 [DOI] [Google Scholar]
  • 53.Jordan JJ, Hoffman M, Nowak MA, Rand DG. Uncalculating cooperation is used to signal trustworthiness. Proc Natl Acad Sci U S A. 2016;113(31):8658–8663. doi: 10.1073/pnas.1601280113 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Glimcher PW, Fehr E. Neuroeconomics: Decision Making and the Brain. Academic Press; 2013. [Google Scholar]
  • 55.Lempert R, Nakicenovic N, Sarewitz D, Schlesinger M. Characterizing Climate-Change Uncertainties for Decision-Makers. An Editorial Essay. Climatic Change. 2004;65(1):1–9. doi: 10.1023/B:CLIM.0000037561.75281.b3 [DOI] [Google Scholar]
  • 56.Dunne S, O'Doherty JP. Insights from the application of computational neuroimaging to social neuroscience. Curr Opin Neurobiol. 2013;23(3):387–392. doi: 10.1016/j.conb.2013.02.007 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Charpentier CJ, O'Doherty JP. The application of computational models to social neuroscience: promises and pitfalls. Soc Neurosci. 2018;13(6):637–647. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Devaine M, Daunizeau J. Learning about and from others’ prudence, impatience or laziness: The computational bases of attitude alignment. PLoS Comput Biol. 2017;13(3):e1005422. doi: 10.1371/journal.pcbi.1005422 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Jara-Ettinger J, Gweon H, Schulz LE, Tenenbaum JB. The Naïve Utility Calculus: Computational Principles Underlying Commonsense Psychology. Trends Cogn Sci. 2016;20(8):589–604. doi: 10.1016/j.tics.2016.05.011 [DOI] [PubMed] [Google Scholar]
  • 60.Devetag G, Di Guida S, Polonio L. An eye-tracking study of feature-based choice in one-shot games. Exp Econ. 2016;19(1):177–201. doi: 10.1007/s10683-015-9432-5 [DOI] [Google Scholar]
  • 61.Polonio L, Coricelli G. Testing the level of consistency between choices and beliefs in games using eye-tracking. Games Econ Behav. 2019;113:566–586. doi: 10.1016/j.geb.2018.11.003 [DOI] [Google Scholar]
  • 62.Hausfeld J, von Hesler K, Goldlücke S. Strategic gaze: an interactive eye-tracking study. Exp Econ. 2021;24(1):177–205. doi: 10.1007/s10683-020-09655-x [DOI] [Google Scholar]
  • 63.Fischbacher U, Hausfeld J, Renerte B. Strategic incentives undermine gaze as a signal of prosocial motives. Games Econ Behav. 2022;136:63–91. doi: 10.1016/j.geb.2022.07.006 [DOI] [Google Scholar]
  • 64.Costa-Gomes M, Crawford VP, Broseta B. Cognition and Behavior in Normal-Form Games: An Experimental Study. Econometrica. 2001;69(5):1193–1235. doi: 10.1111/1468-0262.00239 [DOI] [Google Scholar]
  • 65.Johnson EJ, Camerer C, Sen S, Rymon T. Detecting Failures of Backward Induction: Monitoring Information Search in Sequential Bargaining. J Econ Theory. 2002;104(1):16–47. doi: 10.1006/jeth.2001.2850 [DOI] [Google Scholar]
  • 66.Costa-Gomes MA, Crawford VP. Cognition and Behavior in Two-Person Guessing Games: An Experimental Study. Am Econ Rev. 2006;96(5):1737–1768. doi: 10.1257/aer.96.5.1737 [DOI] [Google Scholar]
  • 67.Brocas I, Carrillo JD, Wang SW, Camerer CF. Imperfect Choice or Imperfect Attention? Understanding Strategic Thinking in Private Information Games. Rev Econ Stud. 2014;81(3):944–970. [Google Scholar]
  • 68.Stillman PE, Shen X, Ferguson MJ. How Mouse-tracking Can Advance Social Cognitive Theory. Trends Cogn Sci. 2018;22(6):531–543. doi: 10.1016/j.tics.2018.03.012 [DOI] [PubMed] [Google Scholar]
  • 69.McClure SM, Li J, Tomlin D, Cypert KS, Montague LM, Montague PR. Neural correlates of behavioral preference for culturally familiar drinks. Neuron. 2004;44(2):379–387. doi: 10.1016/j.neuron.2004.09.019 [DOI] [PubMed] [Google Scholar]
  • 70.Berns GS, Capra CM, Chappelow J, Moore S, Noussair C. Nonlinear Neurobiological Probability Weighting Functions For Aversive Outcomes. Neuroimage. 2008;39(4):2047–2057. doi: 10.1016/j.neuroimage.2007.10.028 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.Grosenick L, Greer S, Knutson B. Interpretable Classifiers for fMRI Improve Prediction of Purchases. IEEE Trans Neural Syst Rehabil Eng. 2008;16(6):539–548. doi: 10.1109/TNSRE.2008.926701 [DOI] [PubMed] [Google Scholar]
  • 72.Lebreton M, Jorge S, Michel V, Thirion B, Pessiglione M. An automatic valuation system in the human brain: evidence from functional neuroimaging. Neuron. 2009;64(3):431–439. doi: 10.1016/j.neuron.2009.09.040 [DOI] [PubMed] [Google Scholar]
  • 73.Levy I, Lazzaro SC, Rutledge RB, Glimcher PW. Choice from Non-Choice: Predicting Consumer Preferences from Blood Oxygenation Level-Dependent Signals Obtained during Passive Viewing. J Neurosci. 2011;31(1):118–125. doi: 10.1523/JNEUROSCI.3214-10.2011 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Smith A, Bernheim BD, Camerer CF, Rangel A. Neural Activity Reveals Preferences without Choices. Am Econ J Microecon. 2014;6(2):1–36. doi: 10.1257/mic.6.2.1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Tusche A, Böckler A, Kanske P, Trautwein FM, Singer T. Decoding the Charitable Brain: Empathy, Perspective Taking, and Attention Shifts Differentially Predict Altruistic Giving. J Neurosci. 2016;36(17):4719–4732. doi: 10.1523/JNEUROSCI.3392-15.2016 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Webb R, Levy I, Lazzaro SC, Rutledge RB, Glimcher PW. Neural random utility: Relating cardinal neural observables to stochastic choice behavior. J Neurosci Psychol Econ. 2019;12(1):45–72. doi: 10.1037/npe0000101 [DOI] [Google Scholar]
  • 77.Haxby JV, Connolly AC, Guntupalli JS. Decoding Neural Representational Spaces Using Multivariate Pattern Analysis. Annu Rev Neurosci. 2014;37(1):435–456. doi: 10.1146/annurev-neuro-062012-170325 [DOI] [PubMed] [Google Scholar]
  • 78.Boorman ED, O'Doherty JP, Adolphs R, Rangel A. The behavioral and neural mechanisms underlying the tracking of expertise. Neuron. 2013;80(6):1558–1571. doi: 10.1016/j.neuron.2013.10.024 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Hampton AN, Bossaerts P, O’Doherty JP. The role of the ventromedial prefrontal cortex in abstract state-based inference during decision making in humans. J Neurosci. 2006;26(32):8360–8367. doi: 10.1523/JNEUROSCI.1010-06.2006 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80.Wittmann MK, Kolling N, Faber NS, Scholl J, Nelissen N, Rushworth MFS. Self-Other Mergence in the Frontal Cortex during Cooperation and Competition. Neuron. 2016;91(2):482–493. doi: 10.1016/j.neuron.2016.06.022 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Protopapa F, Hayashi MJ, Kulashekhar S, van der Zwaag W, Battistella G, Murray MM, et al. Chronotopic maps in human supplementary motor area. PLoS Biol. 2019;17(3):e3000026. doi: 10.1371/journal.pbio.3000026 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.Ofir N, Landau AN. Neural signatures of evidence accumulation in temporal decisions. Curr Biol. 2022;32(18):4093–4100.e6. doi: 10.1016/j.cub.2022.08.006 [DOI] [PubMed] [Google Scholar]
  • 83.Faul F, Erdfelder E, Lang AG, Buchner A. G*Power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav Res Methods. 2007;39(2):175–191. doi: 10.3758/bf03193146 [DOI] [PubMed] [Google Scholar]
  • 84.Lo S, Andrews S. To transform or not to transform: using generalized linear mixed models to analyse reaction time data. Front Psychol. 2015;6. Accessed 2023 Sep 7. Available from: https://www.frontiersin.org/articles/10.3389/fpsyg.2015.01171. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Webb R. The (Neural) Dynamics of Stochastic Choice. Manag Sci. 2019;65(1):230–255. doi: 10.1287/mnsc.2017.2931 [DOI] [Google Scholar]
  • 86.Daw N. Trial-by-trial data analysis using computational models. Affect, Learning and Decision Making, Attention and Performance XXIII. 2011;23. doi: 10.1093/acprof:oso/9780199600434.003.0001 [DOI] [Google Scholar]
  • 87.Navarro DJ, Fuss IG. Fast and accurate calculations for first-passage times in Wiener diffusion models. J Math Psychol. 2009;53(4):222–230. doi: 10.1016/j.jmp.2009.02.003 [DOI] [Google Scholar]

Decision Letter 0

Christian Schnell, PhD

5 Dec 2023

Dear Dr Bavard,

Thank you for submitting your manuscript entitled "Beyond choices: humans can infer social preferences from decision speed alone" for consideration as a Research Article by PLOS Biology.

Your manuscript has now been evaluated by the PLOS Biology editorial staff as well as by an academic editor with relevant expertise and I am writing to let you know that we would like to send your submission out for external peer review.

However, before we can send your manuscript to reviewers, we need you to complete your submission by providing the metadata that is required for full assessment. To this end, please login to Editorial Manager where you will find the paper in the 'Submissions Needing Revisions' folder on your homepage. Please click 'Revise Submission' from the Action Links and complete all additional questions in the submission questionnaire.

Once your full submission is complete, your paper will undergo a series of checks in preparation for peer review. After your manuscript has passed the checks it will be sent out for review. To provide the metadata for your submission, please Login to Editorial Manager (https://www.editorialmanager.com/pbiology) within two working days, i.e. by Dec 07 2023 11:59PM.

If your manuscript has been previously peer-reviewed at another journal, PLOS Biology is willing to work with those reviews in order to avoid re-starting the process. Submission of the previous reviews is entirely optional and our ability to use them effectively will depend on the willingness of the previous journal to confirm the content of the reports and share the reviewer identities. Please note that we reserve the right to invite additional reviewers if we consider that additional/independent reviewers are needed, although we aim to avoid this as far as possible. In our experience, working with previous reviews does save time.

If you would like us to consider previous reviewer reports, please edit your cover letter to let us know and include the name of the journal where the work was previously considered and the manuscript ID it was given. In addition, please upload a response to the reviews as a 'Prior Peer Review' file type, which should include the reports in full and a point-by-point reply detailing how you have or plan to address the reviewers' concerns.

During the process of completing your manuscript submission, you will be invited to opt-in to posting your pre-review manuscript as a bioRxiv preprint. Visit http://journals.plos.org/plosbiology/s/preprints for full details. If you consent to posting your current manuscript as a preprint, please upload a single Preprint PDF.

Feel free to email us at plosbiology@plos.org if you have any queries relating to your submission.

Kind regards,

Christian

Christian Schnell, PhD

Senior Editor

PLOS Biology

cschnell@plos.org

Decision Letter 1

Christian Schnell, PhD

19 Jan 2024

Dear Dr Bavard,

Thank you for your patience while your manuscript "Beyond choices: humans can infer social preferences from decision speed alone" went through peer-review at PLOS Biology. Your manuscript has now been evaluated by the PLOS Biology editors, an Academic Editor with relevant expertise, and by several independent reviewers.

In light of the reviews, which you will find at the end of this email, we are pleased to offer you the opportunity to address the comments from the reviewers in a revision that we anticipate should not take you very long. We will then assess your revised manuscript and your response to the reviewers' comments with our Academic Editor. As you will see, one of the comments refers to the inaccessibility of the data repository and Reviewer 3 indicated that they would like to review the data and code for an in-depth review.

We expect to receive your revised manuscript within 1 month. Please email us (plosbiology@plos.org) if you have any questions or concerns, or would like to request an extension.

At this stage, your manuscript remains formally under active consideration at our journal; please notify us by email if you do not intend to submit a revision so that we withdraw the manuscript.

**IMPORTANT - SUBMITTING YOUR REVISION**

Your revisions should address the specific points made by each reviewer. Please submit the following files along with your revised manuscript:

1. A 'Response to Reviewers' file - this should detail your responses to the editorial requests, present a point-by-point response to all of the reviewers' comments, and indicate the changes made to the manuscript.

*NOTE: In your point-by-point response to the reviewers, please provide the full context of each review. Do not selectively quote paragraphs or sentences to reply to. The entire set of reviewer comments should be present in full and each specific point should be responded to individually.

You should also cite any additional relevant literature that has been published since the original submission and mention any additional citations in your response.

2. In addition to a clean copy of the manuscript, please also upload a 'track-changes' version of your manuscript that specifies the edits made. This should be uploaded as a "Revised Article with Changes Highlighted " file type.

*Resubmission Checklist*

When you are ready to resubmit your revised manuscript, please refer to this resubmission checklist: https://plos.io/Biology_Checklist

To submit a revised version of your manuscript, please go to https://www.editorialmanager.com/pbiology/ and log in as an Author. Click the link labelled 'Submissions Needing Revision' where you will find your submission record.

Please make sure to read the following important policies and guidelines while preparing your revision:

*Published Peer Review*

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. Please see here for more details:

https://blogs.plos.org/plos/2019/05/plos-journals-now-open-for-published-peer-review/

*PLOS Data Policy*

Please note that as a condition of publication PLOS' data policy (http://journals.plos.org/plosbiology/s/data-availability) requires that you make available all data used to draw the conclusions arrived at in your manuscript. If you have not already done so, you must include any data used in your manuscript either in appropriate repositories, within the body of the manuscript, or as supporting information (N.B. this includes any numerical values that were used to generate graphs, histograms etc.). For an example see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5

*Blot and Gel Data Policy*

We require the original, uncropped and minimally adjusted images supporting all blot and gel results reported in an article's figures or Supporting Information files. We will require these files before a manuscript can be accepted so please prepare them now, if you have not already uploaded them. Please carefully read our guidelines for how to prepare and upload this data: https://journals.plos.org/plosbiology/s/figures#loc-blot-and-gel-reporting-requirements

*Protocols deposition*

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Thank you again for your submission to our journal. We hope that our editorial process has been constructive thus far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Christian

Christian Schnell, PhD

Senior Editor

PLOS Biology

cschnell@plos.org

----------------------------------------------------------------

REVIEWS:

Reviewer #1: Reviewer comments

Summary

The authors investigate the role of response times (RT) in how individuals learn about others' social preferences. The key novel contribution is the finding that RT is predictive of social choice preferences, meaning that observers could infer others' choices were more prosocial versus more selfish when provided only with information about RT (i.e., without information about choices). The authors argue that this deviates from the predictions of an optimal Bayesian model in which people are expected to learn equally well from choices and RT information.

Comments

1. Regarding the Bayesian optimal observer model, something that was unclear in the manuscript is why or in what sense is this model optimal? Does it maximize some performance measure (e.g., accuracy or reward rate per unit time)? A few sentences clarifying this would be beneficial.

2. From the description in the Materials and Methods section, the Bayesian optimal model seems to be comprised of several different models (e.g., Wiener first-passage time, softmax, WFPT integrated over both choices, and the 'prior equals likelihood' model) which are applied depending on the type of data provided (RT/choices/both/none). The manuscript would benefit from giving a brief description of and motivation for this 'benchmark' model earlier on, such as when the model is first mentioned in the introduction. Doing so would help readers understand the reason for choosing these models as a benchmark for comparison.

3. It is stated in a few places (Abstract/Intro/Discussion) that "from a Bayesian perspective, people should be able to learn equally well from choices and RTs". I think this statement needs a bit more motivation/unpacking as to why choices and RTs should be equally informative when predicting choices. I would think that choice information would be more informative (weighted more heavily when updating a prior) when the task is to infer people's choice preferences. For example, if my task is to infer someone's choice preferences, and I'm given the option to either view their 10 previous choices or 10 previous RTs, I can't really think of a situation in which I'd choose the RT information. If, on the other hand, my task was to infer someone's mean RT, then the RT information would indeed be more informative, and I would disregard the choice information. The reported behavioural results appear to support this strategy: "These results suggest that, in our task, participants used the RT information when no other piece of information was available, but they seemed to disregard RT when the choice information was available" (p.12). Thus, the Bayesian optimal model seems to be somewhat of a 'straw man' rather than a serious competitor against the other models presented in the paper. This could be remedied by providing some further explanation and motivation for this model.

4. Aside from these minor points, I thought the paper was well written and provided a comprehensive analysis of an interesting dataset.

Reviewer #2: The authors examine whether or not people use different combinations of the speed and choices of other people's decisions when making inferences about the social preferences of others. Their main conclusion is that people are able to learn other people's social preferences even when they only observed others' response times. This is a very interesting finding and, as far as I know, the extent to which response time can be used to infer the social preferences of others has not been carefully examined, which makes this study a novel and exciting addition to the literature.

The manuscript is short and sweet -- they use a direct and straightforward design and show a clear main effect of response time in the accuracy of the inferences drawn about others' social preferences. They also use the response time to update a simple reinforcement learning model to allow the model to learn to predict subsequent choices and response times. The model is already fairly established in other applications, so simply adding response time in the current context is a natural step. The authors also use a Bayesian optimal model as a point of comparison for evaluating their RL model. Although I didn't examine this BO model closely, the authors show rather large qualitative (and quantitative) differences in the accuracy of the two models, with the RL model clearly outperforming the BO model.

Given the straightforward approach (in terms of the experimental design and computational modeling) as well as the very clear experimental and computational modeling results, I really don't have much to offer for suggested improvements. The authors provided a very accessible manuscript and have cited all the relevant literature that I can think of, so I would recommend acceptance.

Reviewer #3: The manuscript addresses an experiment in which participants observe information about others' decisions or response times to make decisions in the context of prosocial behavior, and finds that people can infer others' preferences from the process by which they make decisions. There is much to like about the paper, and although the details of the developed learning model are outside of my area of expertise to assess, I find myself to be enthusiastic overall.

However, although it appears that there is documentation of the materials, data and code available on github, the link provided led me to a 404 error message. Regarding the OSF project related to the preregistration, I believe I did not have reading rights. Therefore, it was not possible for me to review the paper in depth. I would be happy to do so in another round of reviews. I did find and review the preregistration, which clearly stipulated the manipulated conditions and hypotheses but gave only sparse indications of the overall procedure or analyses to be expected. In the manuscript, it would be great if it were even easier to spot which results relate to preregistered hypotheses. In any case, I find it highly commendable that the authors provided a preregistration.

In addition, I have a few comments about the behavioral results that I believe should be addressed in a revision of the manuscript.

Learning about Dictators' Preferences (Estimation Phase)

In the results section (p. 5, bottom paragraph), the authors report that observers' learning of the dictators' preferences was better than chance overall, and separately for the four conditions induced (RT only, choice only, both, none), in correspondence with the preregistered hypothesis that "(1) the mean accuracy of participants to predict the dictators' social preference will be higher than chance in three out of four conditions, [and] (2) this accuracy will increase with the amount of information provided to the participants". The paragraph is quite densely written, so that perhaps it might be helpful to add a bit of explanation back in. For instance: starting with line 178, do you report first the overall effect of learning (in all conditions), then the effect separately for the RT only condition, then a post-hoc contrast between the choice only and both conditions, and then the effect separately for the none condition? Perhaps a table would help summarize these results efficiently, and give you more space for elaborating in the text.

Moreover, I expected to see an interaction effect reported about the effect of condition x time of measurement (before first trial, after first trial, after second trial, etc.) on the perception of the dictators' preferences. It seems to me like the perceptions of the dictators' preferences would probably get better across the four trials, maybe particularly so in the both condition and perhaps not at all in the none condition.

Regarding the dictators' preference in the methods section, it didn't quite become clear to me how the participants indicated what they thought the dictators' preferences were.

Extrapolation to Unseen Decisions

Minor point: I noticed that the terminology and the perspective on the data changed a bit in this section. Where you referred to the RT only condition before, you now refer to RT visibility (which presumably also covers the both condition). For consistency, and to increase the ease with which readers can follow the presented work, it might be good to keep the perspective on the data constant throughout, or to alert readers to the difference.

Decision Letter 2

Christian Schnell, PhD

19 Mar 2024

Dear Dr Bavard,

Thank you for your patience while we considered your revised manuscript "Beyond choices: humans can infer social preferences from decision speed alone" for publication as a Research Article at PLOS Biology. Your revised study has now been evaluated by the PLOS Biology editors and the original reviewers.

In light of the reviews, which you will find at the end of this email, we are pleased to offer you the opportunity to address the remaining points from the reviewers in a revision that we anticipate should not take you very long. We will then assess your revised manuscript and your response to the reviewers' comments with our Academic Editor, aiming to avoid further rounds of peer review, although we might need to consult with the reviewers, depending on the nature of the revisions.

As you can see, Reviewer 3 was not able to access the data to review them. One issue is that one of the OSF repositories does not seem to contain the required data (https://osf.io/fseyw/) and that the GitHub repository contains scripts and data as Matlab files, making them inaccessible for the reviewers (and some of our readers). Would you be able to provide these data and scripts in a more accessible way? Please don't hesitate to contact me if you have any questions or would like to discuss alternative options.

We expect to receive your revised manuscript within 1 month. Please email us (plosbiology@plos.org) if you have any questions or concerns, or would like to request an extension.

At this stage, your manuscript remains formally under active consideration at our journal; please notify us by email if you do not intend to submit a revision so that we withdraw the manuscript.

**IMPORTANT - SUBMITTING YOUR REVISION**

Your revisions should address the specific points made by each reviewer. Please submit the following files along with your revised manuscript:

1. A 'Response to Reviewers' file - this should detail your responses to the editorial requests, present a point-by-point response to all of the reviewers' comments, and indicate the changes made to the manuscript.

*NOTE: In your point-by-point response to the reviewers, please provide the full context of each review. Do not selectively quote paragraphs or sentences to reply to. The entire set of reviewer comments should be present in full and each specific point should be responded to individually.

You should also cite any additional relevant literature that has been published since the original submission and mention any additional citations in your response.

2. In addition to a clean copy of the manuscript, please also upload a 'track-changes' version of your manuscript that specifies the edits made. This should be uploaded as a "Revised Article with Changes Highlighted" file type.

*Resubmission Checklist*

When you are ready to resubmit your revised manuscript, please refer to this resubmission checklist: https://plos.io/Biology_Checklist

To submit a revised version of your manuscript, please go to https://www.editorialmanager.com/pbiology/ and log in as an Author. Click the link labelled 'Submissions Needing Revision' where you will find your submission record.

Please make sure to read the following important policies and guidelines while preparing your revision:

*Published Peer Review*

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. Please see here for more details:

https://blogs.plos.org/plos/2019/05/plos-journals-now-open-for-published-peer-review/

*PLOS Data Policy*

Please note that as a condition of publication PLOS' data policy (http://journals.plos.org/plosbiology/s/data-availability) requires that you make available all data used to draw the conclusions arrived at in your manuscript. If you have not already done so, you must include any data used in your manuscript either in appropriate repositories, within the body of the manuscript, or as supporting information (N.B. this includes any numerical values that were used to generate graphs, histograms etc.). For an example see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5

*Blot and Gel Data Policy*

We require the original, uncropped and minimally adjusted images supporting all blot and gel results reported in an article's figures or Supporting Information files. We will require these files before a manuscript can be accepted so please prepare them now, if you have not already uploaded them. Please carefully read our guidelines for how to prepare and upload this data: https://journals.plos.org/plosbiology/s/figures#loc-blot-and-gel-reporting-requirements

*Protocols deposition*

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Thank you again for your submission to our journal. We hope that our editorial process has been constructive thus far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Christian

Christian Schnell, PhD

Senior Editor

PLOS Biology

cschnell@plos.org

----------------------------------------------------------------

REVIEWS:

Reviewer #1 (Russell J. Boag): The authors have addressed my comments satisfactorily.

Reviewer #3: To the degree that the manuscript coincides with my expertise, I found most of my comments addressed appropriately. However, I continue to quarrel with details relating to the Open practices for this manuscript.

Unfortunately, the linked OSF project was empty. Therefore, performing a review based on the data and materials that should presumably be contained therein was not possible. I think it would be great if materials, such as the program used to elicit responses from participants, were made available in this repository. Given that PLOS Biology is an outlet that is committed to transparency regarding research protocols, this seems like an important step to me.

In the GitHub project, I could not locate a file that looked like it contained materials, either. I assume the data might be stored in .mat files (e.g., data_fig.mat), which makes them inaccessible to users like me who are not running Matlab. Might I suggest saving a more accessible version of the raw data, in line with the FAIR principle of interoperability?
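
One way to meet this request would be a short conversion script along the following lines; a minimal sketch, assuming the data live in files such as data_fig.mat (that file name comes from the comment above, everything else is an assumption).

    import numpy as np
    import pandas as pd
    from scipy.io import loadmat

    # Export every variable stored in a MATLAB file to plain CSV so the data
    # can be inspected without Matlab; squeeze_me drops singleton dimensions.
    mat = loadmat("data_fig.mat", squeeze_me=True)
    for name, value in mat.items():
        if name.startswith("__"):  # skip loadmat's metadata entries
            continue
        # atleast_2d handles scalars and vectors; nested MATLAB structs would
        # need extra unpacking, omitted here for brevity.
        pd.DataFrame(np.atleast_2d(value)).to_csv(f"{name}.csv", index=False)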

Decision Letter 3

Christian Schnell, PhD

24 Apr 2024

Dear Sophie,

Thank you for your patience while we considered your revised manuscript "Beyond choices: humans can infer social preferences from decision speed alone" for publication as a Research Article at PLOS Biology. This revised version of your manuscript has been evaluated by the PLOS Biology editors, the Academic Editor and one of the original reviewers.

Based on the reviews and on our Academic Editor's assessment of your revision, we are likely to accept this manuscript for publication, provided you satisfactorily address the following data and other policy-related requests.

* We would like to suggest a different title to improve readability: "Humans can infer social preferences from decision speed alone"

* Please add the links to the funding agencies in the Financial Disclosure statement in the manuscript details.

* All research involving human participants must have been approved by the authors' Institutional Review Board (IRB) or an equivalent committee, and must have been conducted according to the principles expressed in the Declaration of Helsinki. Please provide the approval number and a statement that your study has been conducted according to the principles expressed in the Declaration of Helsinki in the manuscript.

DATA POLICY:

You may be aware of the PLOS Data Policy, which requires that all data be made available without restriction: http://journals.plos.org/plosbiology/s/data-availability. For more information, please also see this editorial: http://dx.doi.org/10.1371/journal.pbio.1001797

Note that we do not require all raw data. Rather, we ask that all individual quantitative observations that underlie the data summarized in the figures and results of your paper be made available in one of the following forms:

1) Supplementary files (e.g., excel). Please ensure that all data files are uploaded as 'Supporting Information' and are invariably referred to (in the manuscript, figure legends, and the Description field when uploading your files) using the following format verbatim: S1 Data, S2 Data, etc. Multiple panels of a single or even several figures can be included as multiple sheets in one excel file that is saved using exactly the following convention: S1_Data.xlsx (using an underscore).

2) Deposition in a publicly available repository. Please also provide the accession code or a reviewer link so that we may view your data before publication.

Regardless of the method selected, please ensure that you provide the individual numerical values that underlie the summary data displayed in the following figure panels as they are essential for readers to assess your analysis and to reproduce it: 1D, 2A, 3ABCD.

NOTE: the numerical data provided should include all replicates AND the way in which the plotted mean and errors were derived (it should not present only the mean/average values).

Please also ensure that figure legends in your manuscript include information on where the underlying data can be found, and ensure your supplemental data file/s has a legend.

Please ensure that your Data Statement in the submission system accurately describes where your data can be found.

------------------------------------------------------------------------

CODE POLICY

Per journal policy, if you have generated any custom code during the course of this investigation, please make it available without restrictions upon publication. Please ensure that the code is sufficiently well documented and reusable, and that your Data Statement in the Editorial Manager submission system accurately describes where your code can be found. As the code that you have generated to obtain the data is important to support the conclusions of your manuscript (as also mentioned by Reviewer 3), its deposition is required for acceptance.

Please note that we cannot accept sole deposition of code in GitHub, as this could be changed after publication. However, you can archive this version of your publicly available GitHub code to Zenodo. Once you do this, it will generate a DOI number, which you will need to provide in the Data Accessibility Statement (you are welcome to also provide the GitHub access information). See the process for doing this here: https://docs.github.com/en/repositories/archiving-a-github-repository/referencing-and-citing-content

As you address these items, please take this last chance to review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the cover letter that accompanies your revised manuscript.

We expect to receive your revised manuscript within two weeks.

To submit your revision, please go to https://www.editorialmanager.com/pbiology/ and log in as an Author. Click the link labelled 'Submissions Needing Revision' to find your submission record. Your revised submission must include the following:

- a cover letter that should detail your responses to any editorial requests, if applicable, and whether changes have been made to the reference list

- a Response to Reviewers file that provides a detailed response to the reviewers' comments (if applicable, if not applicable please do not delete your existing 'Response to Reviewers' file.)

- a track-changes file indicating any changes that you have made to the manuscript.

NOTE: If Supporting Information files are included with your article, note that these are not copyedited and will be published as they are submitted. Please ensure that these files are legible and of high quality (at least 300 dpi) in an easily accessible file format. For this reason, please be aware that any references listed in an SI file will not be indexed. For more information, see our Supporting Information guidelines:

https://journals.plos.org/plosbiology/s/supporting-information

*Published Peer Review History*

Please note that you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. Please see here for more details:

https://blogs.plos.org/plos/2019/05/plos-journals-now-open-for-published-peer-review/

*Press*

Should you, your institution's press office or the journal office choose to press release your paper, please ensure you have opted out of Early Article Posting on the submission form. We ask that you notify us as soon as possible if you or your institution is planning to press release the article.

*Protocols deposition*

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Please do not hesitate to contact me should you have any questions.

Sincerely,

Christian

Christian Schnell, PhD

Senior Editor

cschnell@plos.org

PLOS Biology

------------------------------------------------------------------------

Reviewer remarks:

Reviewer #3: Thanks for taking up my suggestions to improve the transparency of the documentation. It seems like you missed out on one request (to also share materials used to obtain the data, e.g., stimulus materials and instructions, program run to present stimuli). I hope these materials can be made available nevertheless.

Decision Letter 4

Christian Schnell, PhD

3 May 2024

Dear Sophie,

Thank you for your patience while we considered your revised manuscript "Beyond choices: humans can infer social preferences from decision speed alone" for publication as a Research Article at PLOS Biology. This revised version of your manuscript has been evaluated by the PLOS Biology editors.

Thank you for addressing most of the editorial requests. However, a few items were not fully addressed:

*) Title: I understand that you'd like to keep the choices in the title. However, we do not allow split titles for Research Articles. Would you be able to suggest a title without the colon? I've tried a couple of variants but they were all worse than the original suggestion. It would be helpful if you could send me your suggestions via email to cschnell@plos.org before submitting the revision, so we can discuss this without you going through another round via Editorial Manager.

*) Thank you for uploading the stimulus data to the GitHub repository. However, the GitHub repository could be changed after publication. Therefore, can you please archive this version of your publicly available GitHub repository to Zenodo? Please update the Data Accessibility Statement afterwards accordingly (you are welcome to also provide the GitHub access information). See the process for doing this here: https://docs.github.com/en/repositories/archiving-a-github-repository/referencing-and-citing-content

*) Source data: We ask that all individual quantitative observations that underlie the data summarized in the figures and results of your paper be made available in one of the following forms:

1) Supplementary files (e.g., excel). Please ensure that all data files are uploaded as 'Supporting Information' and are invariably referred to (in the manuscript, figure legends, and the Description field when uploading your files) using the following format verbatim: S1 Data, S2 Data, etc. Multiple panels of a single or even several figures can be included as multiple sheets in one excel file that is saved using exactly the following convention: S1_Data.xlsx (using an underscore).

2) Deposition in a publicly available repository. Please also provide the accession code or a reviewer link so that we may view your data before publication.

Regardless of the method selected, please ensure that you provide the individual numerical values that underlie the summary data displayed in the following figure panels as they are essential for readers to assess your analysis and to reproduce it: 1D, 2A and 3ABCD.

NOTE: the numerical data provided should include all replicates AND the way in which the plotted mean and errors were derived (it should not present only the mean/average values).

Please also ensure that figure legends in your manuscript include information on where the underlying data can be found, and ensure your supplemental data file/s has a legend.

Please ensure that your Data Statement in the submission system accurately describes where your data can be found.

You can see an example of this in this paper: https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3002591#pbio.3002591.s012

As you address these items, please take this last chance to review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the cover letter that accompanies your revised manuscript.

We expect to receive your revised manuscript within two weeks.

To submit your revision, please go to https://www.editorialmanager.com/pbiology/ and log in as an Author. Click the link labelled 'Submissions Needing Revision' to find your submission record. Your revised submission must include the following:

- a cover letter that should detail your responses to any editorial requests, if applicable, and whether changes have been made to the reference list

- a Response to Reviewers file that provides a detailed response to the reviewers' comments (if applicable, if not applicable please do not delete your existing 'Response to Reviewers' file.)

- a track-changes file indicating any changes that you have made to the manuscript.

NOTE: If Supporting Information files are included with your article, note that these are not copyedited and will be published as they are submitted. Please ensure that these files are legible and of high quality (at least 300 dpi) in an easily accessible file format. For this reason, please be aware that any references listed in an SI file will not be indexed. For more information, see our Supporting Information guidelines:

https://journals.plos.org/plosbiology/s/supporting-information

*Published Peer Review History*

Please note that you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. Please see here for more details:

https://blogs.plos.org/plos/2019/05/plos-journals-now-open-for-published-peer-review/

*Press*

Should you, your institution's press office or the journal office choose to press release your paper, please ensure you have opted out of Early Article Posting on the submission form. We ask that you notify us as soon as possible if you or your institution is planning to press release the article.

*Protocols deposition*

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Please do not hesitate to contact me should you have any questions.

Sincerely,

Christian

Christian Schnell, PhD

Senior Editor

cschnell@plos.org

PLOS Biology

Decision Letter 5

Christian Schnell, PhD

21 May 2024

Dear Dr Bavard,

Thank you for the submission of your revised Research Article "Humans can infer social preferences from decision speed alone" for publication in PLOS Biology. On behalf of my colleagues and the Academic Editor, Thorsten Kahnt, I am pleased to say that we can in principle accept your manuscript for publication, provided you address any remaining formatting and reporting issues. These will be detailed in an email you should receive within 2-3 business days from our colleagues in the journal operations team; no action is required from you until then. Please note that we will not be able to formally accept your manuscript and schedule it for publication until you have completed any requested changes.

Please take a minute to log into Editorial Manager at http://www.editorialmanager.com/pbiology/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production process.

PRESS

We frequently collaborate with press offices. If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximise its impact. If the press office is planning to promote your findings, we would be grateful if they could coordinate with biologypress@plos.org. If you have previously opted in to the early version process, we ask that you notify us immediately of any press plans so that we may opt out on your behalf.

We also ask that you take this opportunity to read our Embargo Policy regarding the discussion, promotion and media coverage of work that is yet to be published by PLOS. As your manuscript is not yet published, it is bound by the conditions of our Embargo Policy. Please be aware that this policy is in place both to ensure that any press coverage of your article is fully substantiated and to provide a direct link between such coverage and the published work. For full details of our Embargo Policy, please visit http://www.plos.org/about/media-inquiries/embargo-policy/.

Thank you again for choosing PLOS Biology for publication and supporting Open Access publishing. We look forward to publishing your study. 

Sincerely, 

Christian

Christian Schnell, PhD

Senior Editor

PLOS Biology

cschnell@plos.org

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Text. Trial selection prior to running the main experiment.

    (DOCX)

    pbio.3002686.s001.docx (32.4KB, docx)
    S2 Text. Observers' own preferences impact their uninformed guesses.

    (DOCX)

    pbio.3002686.s002.docx (27.6KB, docx)
    S3 Text. Qualitative model comparison favors the RL model.

    (DOCX)

    pbio.3002686.s003.docx (28.4KB, docx)
    S4 Text. Time perception and Social Value Orientation score.

    (DOCX)

    pbio.3002686.s004.docx (28.2KB, docx)
    S5 Text. Prediction phase’s GLMM sanity check.

    (DOCX)

    pbio.3002686.s005.docx (28.1KB, docx)
    S1 Fig. Trial selection procedure and corresponding RT.

    (A) Illustration of the single-peaked model regression with 2 choice options A and S, with A>S without loss of generality. (B) Proportions of all 165 trials performed by the dictators in the Dictator Game experiment, categorized using the single-peaked model. (C) Dictators’ average RT in the selected 6 fast trials (“short RT”) and 6 slow trials (“long RT”), as seen by the observers in each of the conditions. Points indicate individual average, shaded areas indicate probability density function, 95% confidence interval, and SEM. N = 46. Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

    (PDF)

    pbio.3002686.s006.pdf (156.1KB, pdf)
    S2 Fig. Observers’ estimations as a function of their own preference.

    (A) Observers’ average first estimation as a function of their own social preference. N = 46. (B) Observers’ average estimation per condition as a function of their own social preference. ρ: Spearman’s coefficient. N = 48. In all panels, ns: p > 0.05, *p < 0.05, **p < 0.01, ***p < 0.001. Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

    (PDF)

    pbio.3002686.s007.pdf (88.8KB, pdf)
    S3 Fig. Additional behavioral results in the estimation phase.

    Reported fourth and last estimation per dictator, averaged over observers, as a function of the true preference of each dictator, for each condition. ρ: Spearman’s coefficient. N = 16. In all panels, ns: p > 0.05, **p < 0.01, ***p < 0.001. Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

    (PDF)

    pbio.3002686.s008.pdf (59KB, pdf)
    S4 Fig. RL model predictions.

    (A) Estimated difficulty extracted from RL model predictions as a function of the estimated difficulty from behavioral data from observers and dictators, after the estimation phase, for trials from the prediction phase. Each point represents one average trial difficulty for each duration (fast/slow) for each observer, averaged over dictators and conditions. N = 92. (B) RL model predictions for the proportion of choices towards the left option in the prediction phase, as a function of the behavioral data. (C) RL model predictions for the fourth and last estimation per observer per observed dictator, as a function of the reported fourth and last estimation, for each condition. N = 184. ρ: Spearman’s coefficient. In all panels, ***p < 0.001. Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

    (PDF)

    pbio.3002686.s009.pdf (136.7KB, pdf)
    S5 Fig. Additional qualitative model comparison.

    Average accuracy for the last estimation predicted by the RL model (A) and BO model (B) for each condition, averaged over trials and dictators, as a function of the observers’ behavioral accuracy. In all panels, N = 46, ns: p > 0.05, *p < 0.05, **p < 0.01, ***p < 0.001. Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

    (PDF)

    pbio.3002686.s010.pdf (121.8KB, pdf)
    S6 Fig. Qualitative model comparison for full model space.

    Top: Simulated data (colored dots) superimposed on behavioral data (colored curves) representing the accuracy in the estimation phase for the main RL model (A), the basic RL model (B), the BO model with informative priors (C), and the BO model with uninformative uniform priors (D) in each condition. Shaded areas represent SEM. N = 46. Bottom: accuracy predictions of the main RL model (A), the basic RL model (B), the BO model with informative priors (C), and the BO model with uninformative uniform priors (D) as a function of behavioral accuracy in the estimation phase for the last estimation of each participant in each condition. Dashed line represents identity. N = 184. Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

    (PDF)

    pbio.3002686.s011.pdf (251.2KB, pdf)
    S7 Fig. Between-group comparisons in the estimation phase.

    Subset of observers’ accuracy for each estimation as a function of the condition (choice and RT visibility). (A) Observers whose first condition was “none.” (B) Observers whose first condition was “RT.” (C) Observers whose first condition was “Ch.” (D) Observers whose first condition was “both.” (E) Observers whose last condition was “none.” (F) Observers whose last condition was “RT.” (G) Observers whose last condition was “Ch.” (H) Observers whose last condition was “both.” Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

    (PDF)

    pbio.3002686.s012.pdf (143.6KB, pdf)
    S8 Fig. Visual representation of the estimation phase.

    The figure represents the screen seen by observers to indicate what they thought the dictator’s preference was, by dragging-and-dropping a red (resp. blue for counterbalanced observers) tick on a slider. The < Continue with space bar > line was only displayed after they had made one first click, to avoid perseveration effects. Translated from German for illustration purposes. Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

    (PDF)

    pbio.3002686.s013.pdf (20.4KB, pdf)
    S9 Fig. DDM simulations on the Dictator Game task.

    The DDM was fitted on the Dictator Game data for the 16 dictators, here represented in increasing order based on their social preference (estimated from their behavioral choices). The DDM is able to match dictators' behavior both in terms of choices (top panels) and RT (bottom panels). Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

    (PDF)

    pbio.3002686.s014.pdf (47.7KB, pdf)
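
    As background for this figure, the core of a drift-diffusion simulation is compact enough to sketch here. The snippet below illustrates the standard model class only; parameter names and values are illustrative, and the authors' actual fitting procedure is not shown.

        import numpy as np

        rng = np.random.default_rng(0)

        def simulate_ddm_trial(drift, boundary=1.0, ndt=0.3,
                               noise=1.0, dt=0.001, max_t=10.0):
            """Simulate one drift-diffusion trial via an Euler random walk.

            Returns (choice, rt): choice is +1 if the upper boundary was
            reached, -1 for the lower one; rt includes the non-decision
            time ndt. A larger |drift| (a stronger preference) produces
            faster, more consistent responses.
            """
            x, t = 0.0, 0.0
            while abs(x) < boundary and t < max_t:
                x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
                t += dt
            return (1 if x >= boundary else -1), t + ndt

        # Example: simulate 100 trials for a dictator with a strong preference.
        choices, rts = zip(*(simulate_ddm_trial(drift=2.0) for _ in range(100)))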
    S10 Fig. Results of additional experiments.

    (A) Observers’ accuracy in the time perception task as a function of the difficulty of the trial, i.e., the time interval between 2 stimuli. Points indicate individual average, shaded areas indicate probability density function, 95% confidence interval, and SEM. N = 46. (B) Observers’ post-task SVO score as a function of their pre-task SVO score and their social preference extracted from their choices in the Dictator Game (DG). Black dashed lines represent categorical boundaries: competitiveness/individualism/prosociality/altruism. Red dashed lines represent a change of category from pre- to post-task scores. SVO: Social Value Orientation scale; DG: Dictator Game; N = 46. Data and analysis scripts underlying this figure are available at https://github.com/sophiebavard/beyond-choices.

    (PDF)

    pbio.3002686.s015.pdf (94.8KB, pdf)
    S1 Table. Results from GLMM fitted on observers’ RT in the prediction phase.

    The GLMM (generalized linear mixed model with Gamma distribution and identity link function) was fitted on the observers' RT, with choice visibility in the estimation phase, RT visibility in the estimation phase, and trial duration (i.e., whether the dictator's RT was short or long), as independent variables. Denotation: Du = duration (fast or slow), Ch = choice visibility (displayed or not), RT = RT visibility (displayed or not), ***p < 0.001. Data and analysis scripts underlying this table are available at https://github.com/sophiebavard/beyond-choices.

    (DOCX)

    pbio.3002686.s016.docx (28.9KB, docx)
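
    For orientation, a fixed-effects approximation of this specification could be fitted as sketched below. The column names and file name are assumptions, and the by-observer random effects of the full GLMM are omitted; a Gamma mixed model would typically be fitted in R (e.g., with lme4 or glmmTMB).

        import pandas as pd
        import statsmodels.api as sm
        import statsmodels.formula.api as smf

        # Hypothetical long-format table: one row per prediction-phase trial,
        # with RT in seconds and the three binary predictors described above.
        df = pd.read_csv("observer_rt.csv")  # columns: rt, Du, Ch, RT_vis

        # Gamma family with identity link, matching the reported specification;
        # per-observer random effects are omitted in this fixed-effects sketch.
        fit = smf.glm(
            "rt ~ Du * Ch * RT_vis",
            data=df,
            family=sm.families.Gamma(link=sm.families.links.Identity()),
        ).fit()
        print(fit.summary())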
    Attachment

    Submitted filename: Rebuttal.docx

    pbio.3002686.s017.docx (170.3KB, docx)
    Attachment

    Submitted filename: Rebuttal2.docx

    pbio.3002686.s018.docx (15.1KB, docx)
    Attachment

    Submitted filename: Rebuttal2.docx

    pbio.3002686.s019.docx (15.1KB, docx)
    Attachment

    Submitted filename: Rebuttal2.docx

    pbio.3002686.s020.docx (15.1KB, docx)

    Data Availability Statement

    All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials, and are available from the GitHub repository https://github.com/sophiebavard/beyond-choices (https://doi.org/10.5281/zenodo.11178632). All custom scripts have been made available.

