A newly detected bias in self-evaluation

Guillaume Deffuant; Thibaut Roubin; Armelle Nugier; Serge Guimond

doi:10.1371/journal.pone.0296383

2024 Feb 8;19(2):e0296383. doi: 10.1371/journal.pone.0296383

Guillaume Deffuant ¹,²,*, Thibaut Roubin ¹, Armelle Nugier ², Serge Guimond ²

Editor: Srebrenka Letina³

PMCID: PMC10852250 PMID: 38330018

Abstract

The widely observed positive bias in self-evaluation is mainly explained by the self-enhancement motivation, which minimizes negative feedbacks and emphasizes positive ones. Recent agent-based simulations suggest that a positive bias also emerges if the sensitivity to feedbacks decreases when the self-evaluation increases. This paper proposes a simple mathematical model in which these different biases are integrated. Moreover, it describes an experiment (N = 1509) confirming that the sensitivity to feedbacks tends to decrease when self-evaluation increases and that a directly related positive bias is detected.

Introduction

People overrate themselves. On average, people say that they are “above average” in skill, over-estimate the likelihood that they will engage in desirable behaviors and achieve favorable outcomes, furnish overly optimistic estimates of when they will complete future projects, and reach judgements with too much confidence.

This quotation from [1] is part of a review of extensive evidence of a positive bias in self-evaluation. In particular, this review reports evidence of overoptimism or overconfidence in judgement and predictions, for instance about the duration of a romantic relationship [2], the ability to complete a task [3], or the forecasting of events in general [4–7].

A general explanation of this positive bias is the self-enhancement motive, which is the drive to convince ourselves, and any significant others in the vicinity, that we are intrinsically meritorious persons: worthwhile, attractive, competent, lovable, and moral [8].

Self-enhancement manifests itself in a variety of processes [8]. For instance, when people describe events in which they were involved, they tend to attribute positive outcomes to themselves, but negative outcomes to others or to circumstances, thus making it possible to claim credit for successes and to disclaim responsibility for failures [9, 10]. People also tend to remember their strengths better than their weaknesses [11, 12]. Where a threat to ego cannot be easily ignored, people will spend time and energy trying to refute it. A familiar example is the student who unthinkingly accepts success in an examination but mindfully searches for reasons to reject a failure [13]. Note that, in order to achieve self-deception, these processes should be at least partially unconscious. Moreover, it seems that they activate various neural patterns, depending on the type of threat to the ego [14].

Although self-enhancement seems to be a generally dominant motive, in some contexts the motives of self-assessment, self-confirmation or self-improvement may prevail [15, 16]. Moreover, in some contexts, people tend to evaluate themselves consistently worse than the average of others [17]. In this respect, the experiment reported in [18] is an important example for this paper because our own experiment shares several features of its design. Surprisingly, the results of [18] show that the participants self-derogate instead of self-enhancing. Indeed, they give more influence to negative feedbacks than to positive ones, hence they show a negative bias. This experiment repeats sequences in which the participants perform a task, receive a feedback about their performance at the task, and then try to improve their performance in the next sequence. The feedback is actually defined by the experimenters and completely disconnected from the performance of the participants. The authors argue that, because the participants are learning a task and trying to improve their performance, negative feedbacks are more informative and are thus given more weight, which induces a negative bias.

In this paper, we consider a simple model of self-enhancement or self-derogation, leading to a positive or a negative bias in self-evaluation [18, 19]. The model represents an agent holding a self-evaluation and changing this self-evaluation when receiving feedback. The feedback is said to be positive when it is higher than the agent’s self-evaluation and negative in the opposite case. The agent increases its self-evaluation when receiving a positive feedback and decreases it when receiving a negative feedback. Then, self-enhancement is the tendency of the agent to react more strongly to positive than to negative feedbacks. In this case, the agent increases its self-evaluation more strongly when receiving positive feedbacks than it decreases its self-evaluation when receiving negative feedbacks. By contrast, the agent shows self-derogation when it reacts more strongly to negative than to positive feedbacks. Self-enhancement causes a positive bias, namely a self-evaluation higher than the average of received feedbacks, while self-derogation causes a negative bias, namely a self-evaluation lower than the average of received feedbacks. This model clarifies the distinction between the processes, self-enhancement or self-derogation, and their effect, a positive or a negative bias in self-evaluation.

More importantly in our work, an additional positive bias, generated by a completely different process, appears in this model. This bias is the main subject of this paper and, as far as we know, it is absent from the social-psychology literature. This additional bias appears when the sensitivity of the self-evaluation to the feedbacks (whether positive or negative) decreases when the self-evaluation increases, independently from self-enhancement, self-derogation or perfectly symmetric reactions to positive and negative feedbacks. Its cause is purely statistical and it has nothing to do with any motivation related to self. We call it bias from decreasing sensitivity to feedbacks or in short, bias from sensitivity.

We first observed this bias in a more complex agent-based model [20]. Indeed, in this model, the agents have perfectly symmetric reactions to positive and negative feedbacks (no self-enhancement and no self-derogation); therefore the bias from sensitivity appears alone, which makes it more easily observable. In other conditions, there is no means to disentangle it from the effect of self-enhancement or of self-derogation without applying a specifically designed mathematical treatment. Moreover, understanding the process generating this bias required serious efforts [21] and it seems extremely unlikely that anyone would conceive this process without having detected the bias. It is therefore unsurprising that we did not find any paper about this bias in the social-psychology literature.

In this paper, our primary objective is precisely to detect this bias from sensitivity to feedbacks in humans after its observation in computer agents. More precisely, we aim to provide experimental evidence supporting two main hypotheses:

People show an average decreasing sensitivity to feedbacks, when their self-evaluation increases;
This induces a specific positive bias in self-evaluation that is added to self-enhancement or self-derogation bias.

We report the results of an experiment (N = 1509) designed with this aim. The participants in this experiment perform a task once and then evaluate their performance at the task several times in reaction to evaluations (feedbacks) given by the experimenters. From data coupling self-evaluation and change of self-evaluation in reaction to a feedback, we compute a linear regression approximating the average sensitivity to feedbacks as a function of self-evaluation in different sets of participants. The slope of this linear regression is significantly negative especially when computed from the participants who believed that the feedbacks were real, showing that the sensitivity to feedbacks is decreasing. Moreover, we measure a significant average positive bias from this decreasing sensitivity to feedbacks, additional to the self-enhancement (or self-derogation) bias, especially in the set of participants who believed that the feedbacks were real. These results provide evidence supporting our hypotheses.

Our secondary objective is to observe how both sensitivity and self-enhancement biases are moderated by variables such as evaluation scale, gender and self-esteem. Of course, we did not find any publication about the effect of these variables on the sensitivity bias, since, as far as we know, the very existence of this bias has not been envisaged up to now. Hence this part of the paper can be seen as exploratory.

Surprisingly, we did not find any previous theoretical or experimental work about the effect of the evaluation scale on self-enhancement and our results about this effect can also be seen as exploratory. By contrast, there is a very abundant literature about the effects of gender and self-esteem on self-enhancement. Obviously, it is out of the scope of this paper to provide an exhaustive review of this literature and we limit our analysis to selected references that appear particularly relevant in the context of our experiment.

From this analysis, which we report in more detail in the discussion, we conclude that:

The literature points to an increase of self-enhancement with self-esteem. The main rationale behind this effect is that people with high self-esteem tend to be more motivated to protect or increase their self-esteem [22, 23] or high self-esteem tends to increase the self-deceptive mechanisms of self-enhancement [24].
In the case of our experiment, the literature predicts that men will self-enhance more than women. Indeed, first, in our experiment, the task is related to a rather masculine subject and men tend to self-enhance more than women with respect to such subjects [25, 26]. Second, the evaluation is based on individual performance, closely related to the agency domain, in which men tend to self-enhance more than women [27, 28]. Finally the performance is not publicly disclosed, hence the self-enhancement is rather self-deceptive and men are more easily subject to this self-deception than women who engage more easily in impression management [29].

With respect to our secondary objective, our main finding is that the average sensitivity bias remains relatively stable when scale, gender and self-esteem vary (in the group of participants who believed in the feedbacks). This stability of the sensitivity bias is remarkable because the closely related self-enhancement bias varies very significantly, primarily with the scale.

Indeed, when the scale decreases with the evaluation, we find a significant self-enhancement bias as expected, but when the scale increases with the evaluation, we systematically find a significant self-derogation bias, like in the experiment of [18] (who used an increasing scale). This effect, which, as far as we know, has not been reported in the literature, questions the explanation of the self-derogation by the motive to learn proposed in [18].

Moreover, in line with our analysis of the literature, the group of participants with high self-esteem and men tend to show a higher self-enhancement or a weaker self-derogation.

The next section presents our hypotheses in more detail, the experimental setting and the method used for treating the results. The following section reports the results of the experiment. The final section discusses them.

Materials and methods

Model and hypotheses

This section presents the model of an agent modifying its self-evaluation when receiving feedbacks. This is a simplified version of the agent model described in [20, 21]. It shows how a positive bias emerges from different series of feedbacks, if the sensitivity to feedback decreases when the self-evaluation increases. Then it extends the approach to a model that includes self-enhancement or self-derogation.

General definition of the positive bias from decreasing sensitivity

Consider an agent with self-evaluation a_t at time t when receiving feedback f_t (i.e. an evaluation coming from an outside source). The main hypotheses from [20] are:

The change of self-evaluation due to this feedback is proportional to the difference between the feedback and the self-evaluation;
The coefficient of proportionality decreases with a_t.

These hypotheses are thus expressed by Eq 1:

\begin{matrix} a_{t + 1} - a_{t} = h (a_{t}) (f_{t} - a_{t}), \end{matrix}

(1)

where h(a_t) is a positive and decreasing function (after averaging possible random fluctuations) that we call “sensitivity to feedbacks”. In the following, we assume that the sensitivity h is differentiable, thus its derivative h′ is negative: h′(a_t)<0 for all a_t.

According to this model, for the same difference between feedbacks and self-evaluations (whether positive or negative), agents with a high self-evaluation are less influenced than agents with a low self-evaluation.
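As an illustration only, the update rule of Eq 1 can be sketched in a few lines of Python. The exponential form of `sensitivity` and its parameters `h0` and `rate` are assumptions chosen for the example; the paper only requires h to be positive and decreasing.

```python
import math

def sensitivity(a, h0=0.8, rate=0.02):
    """Hypothetical positive, decreasing sensitivity h(a) = h0 * exp(-rate * a)."""
    return h0 * math.exp(-rate * a)

def update(a, f, h=sensitivity):
    """One step of Eq 1: a_{t+1} = a_t + h(a_t) * (f_t - a_t)."""
    return a + h(a) * (f - a)

# Same feedback intensity (+13) applied at a low and at a high self-evaluation.
low_shift = update(30.0, 30.0 + 13.0) - 30.0
high_shift = update(70.0, 70.0 + 13.0) - 70.0
print(low_shift, high_shift)  # the low self-evaluation moves further
```

With any decreasing h, the agent at self-evaluation 30 moves further towards the feedback than the agent at 70, which is the property stated above.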

This model is a particular case of the model of interactions between two agents in [20, 21, 30]. Indeed, for any pair agent 1 and agent 2 in the population, the latter model assumes that the more agent 1 feels superior to agent 2, the lower the influence of agent 2 on agent 1 and the more agent 1 feels inferior to agent 2, the higher the influence of agent 2 on agent 1. Mathematically, the influence of agent 2 on agent 1 is a decreasing function h(a₁₁ − a₁₂), where a₁₁ is agent 1’s self-opinion and a₁₂ is the opinion of agent 1 about agent 2. Eq 1 is the same model when assuming that the opinion a₁₂ of agent 1 about agent 2 is constant.

In our experiment, agent 1 is the participant and agent 2 is the source of the feedback, which is described in more detail later. We assume that the participants have no reason to change their opinion about the source of feedback over time. Hence in this perspective, postulating Eq 1 can be seen as neglecting the variations of the opinion of the participant about this source.

However, Eq 1 can also address cases where the feedback does not come directly from another agent but is a general evaluation from the environment, like a failure or a success. In this case, a possible justification is that agents with a high self-evaluation tend to be more confident and this makes them less prone to change their mind. The general hypothesis is that people having a high self-evaluation are less easily influenced than people having a low self-evaluation.

The fact that the sensitivity to feedbacks h is decreasing induces a general positive bias that we now define mathematically. Assume that the feedback is drawn at random from a distribution of average a₁, which is also the initial self-evaluation. The first feedback is f₁ = a₁ + θ₁, θ₁ being randomly drawn from a distribution of average 0. The self-evaluation after receiving this feedback is:

\begin{matrix} a_{2} = a_{1} + h (a_{1}) θ_{1} . \end{matrix}

(2)

Then, the second feedback is f₂ = a₁ + θ₂, θ₂ being randomly drawn from the same distribution of average 0 (so that f₂ is distributed around a₁), and the self-evaluation a₃ after receiving this feedback is:

\begin{matrix} a_{3} = a_{2} + h (a_{2}) (a_{1} + θ_{2} - a_{2}) \end{matrix}

(3)

Assuming that θ₁ is small, the sensitivity h(a₂) at a₂ can be approximated at the first order as:

\begin{matrix} h (a_{2}) = h (a_{1}) + h^{'} (a_{1}) h (a_{1}) θ_{1} . \end{matrix}

(4)

Replacing a₂ by its value and h(a₂) by this approximation yields:

\begin{matrix} a_{3} = a_{1} + h (a_{1}) θ_{1} + (h (a_{1}) + h^{'} (a_{1}) h (a_{1}) θ_{1}) (θ_{2} - h (a_{1}) θ_{1}), \end{matrix}

(5)

\begin{matrix} = a_{1} + h (a_{1}) θ_{1} + h (a_{1}) (θ_{2} - h (a_{1}) θ_{1}) + h^{'} (a_{1}) h (a_{1}) (θ_{2} θ_{1} - h (a_{1}) θ_{1}^{2}) . \end{matrix}

(6)

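Because we assume that θ₁ and θ₂ are drawn independently from the same distribution of average 0 (so that the average of θ₂θ₁ is 0 and $\bar{θ_{1}^{2}} = \bar{θ_{2}^{2}}$ ), the average $\bar{a_{3}}$ of a₃ over all possible draws of θ₁ and θ₂ is: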

\begin{matrix} \bar{a_{3}} = a_{1} - h^{'} (a_{1}) h^{2} (a_{1}) \bar{θ_{2}^{2}} . \end{matrix}

(7)

As we assume h′(a₁) < 0, we always have:

\begin{matrix} - h^{'} (a_{1}) h^{2} (a_{1}) \bar{θ_{2}^{2}} > 0 . \end{matrix}

(8)

This value defines the positive bias. Because of this bias, the self-evaluation a₃ reached after the second feedback is on average higher than the average feedback a₁.

This result extends to longer series of feedbacks [21]. The positive bias increases with the length of the series to an asymptotic value, which remains of the second order (in $\bar{θ^{2}}$ ).
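The two-feedback result of Eq 7 can be checked numerically with a small sketch. It assumes, for illustration only, that each θ takes the values −d or +d with equal probability (so the average over draws is an exact enumeration of four cases) and that h is a hypothetical exponential function.

```python
import math
from itertools import product

RATE = 0.02  # hypothetical decay rate of the sensitivity

def h(a):
    """Hypothetical decreasing sensitivity."""
    return 0.8 * math.exp(-RATE * a)

def h_prime(a):
    """Derivative of h, negative everywhere."""
    return -RATE * h(a)

def mean_a3(a1, d):
    """Average of a_3 over the four equally likely draws (theta_1, theta_2)."""
    total = 0.0
    for theta1, theta2 in product((-d, d), repeat=2):
        a2 = a1 + h(a1) * theta1                 # Eq 2
        a3 = a2 + h(a2) * (a1 + theta2 - a2)     # Eq 3
        total += a3
    return total / 4

a1, d = 50.0, 5.0
exact_bias = mean_a3(a1, d) - a1
approx_bias = -h_prime(a1) * h(a1) ** 2 * d ** 2  # Eq 7, mean of theta^2 is d^2
print(exact_bias, approx_bias)
```

Under these assumptions the exact average shift is positive and agrees with the second-order expression to within a fraction of a percent.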

Our main aim is to check experimentally the existence of this bias. If we directly derive the experiment from the previous formulas, we face a hard problem: we need a huge number of random draws of feedbacks in order to get their average close to 0 and get a chance to detect the bias. To overcome this difficulty, we consider particular series of feedbacks in which the bias appears without averaging over many trials.

Positive bias from decreasing sensitivity with alternating positive and negative feedbacks

Let f_t − a_t be the intensity of feedback f_t. We say that a feedback is positive when its intensity is positive and negative otherwise. We show now that the previous model generates a positive bias when receiving a series of feedbacks of opposite intensities. We consider the simple example of an agent receiving two consecutive feedbacks of opposite intensities ±δ.

Assume that the agent starts with self-evaluation a₁ and receives first the positive feedback f₁ = a₁ + δ. Applying Eq 1, the self-evaluation of the agent becomes a₂:

\begin{matrix} a_{2} = a_{1} + h (a_{1}) δ . \end{matrix}

(9)

Then the agent receives the negative feedback f₂ = a₂ − δ and its self-evaluation a₃ becomes:

\begin{matrix} a_{3} = a_{2} - h (a_{2}) δ . \end{matrix}

(10)

The difference of self-evaluation between before and after receiving the couple of feedbacks is:

\begin{matrix} a_{3} - a_{1} = a_{1} + h (a_{1}) δ - h (a_{2}) δ - a_{1} = (h (a_{1}) - h (a_{2})) δ . \end{matrix}

(11)

As we assume that at any time t, h(a_t) > 0, we have a₁ < a₂ and, as h is decreasing, we have: h(a₁) − h(a₂) > 0, hence a₃ − a₁ > 0.

If we invert the order of the feedbacks (f₁ = a₁ − δ and f₂ = a₂ + δ), we have:

\begin{matrix} a_{3} - a_{1} = (h (a_{2}) - h (a_{1})) δ . \end{matrix}

(12)

Now a₂ < a₁, therefore again, because h is decreasing a₃ − a₁ > 0.

Therefore, after receiving two feedbacks of opposite intensities, the self-evaluation tends to increase.
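The argument of Eqs 9 to 12 can be reproduced with a minimal sketch. The linear sensitivity `h` below is a hypothetical choice (positive and decreasing on the range used); the function name `two_feedback_shift` is introduced only for this example.

```python
def h(a):
    """Hypothetical linear sensitivity, positive and decreasing on [0, 100]."""
    return 0.6 - 0.004 * a

def two_feedback_shift(a1, delta, positive_first):
    """Return a_3 - a_1 after two feedbacks of opposite intensities +/- delta."""
    first = delta if positive_first else -delta
    a2 = a1 + h(a1) * first        # reaction to f_1 = a_1 + first
    a3 = a2 + h(a2) * (-first)     # reaction to f_2 = a_2 - first
    return a3 - a1

shift_pos_first = two_feedback_shift(50.0, 13.0, positive_first=True)
shift_neg_first = two_feedback_shift(50.0, 13.0, positive_first=False)
print(shift_pos_first, shift_neg_first)  # both shifts are positive
```

Whatever the order of the two feedbacks, the self-evaluation ends slightly above its starting value.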

Expanding h(a₂) to first order as previously, we can approximate the value of the bias:

\begin{matrix} h (a_{2}) \approx h (a_{1}) + h^{'} (a_{1}) h (a_{1}) δ, \text{ if } f_{1} = a_{1} + δ; \end{matrix}

(13)

\begin{matrix} h (a_{2}) \approx h (a_{1}) - h^{'} (a_{1}) h (a_{1}) δ, \text{ if } f_{1} = a_{1} - δ . \end{matrix}

(14)

Therefore, for both sequences of feedbacks we get:

\begin{matrix} S (a_{1}) = a_{3} - a_{1} \approx - h^{'} (a_{1}) h (a_{1}) δ^{2} . \end{matrix}

(15)

This positive bias is thus expected to be of the second order in the intensity of the feedback, hence rather small. With a series of feedbacks of opposite intensities, the positive bias appears directly, without requiring an average over a large number of trials. In an experiment, the participants processing such a series of feedbacks of opposite intensities are expected to provide a noisy value of function h(a) for each self-evaluation a in the series. We expect to approximate the average value of h(a) and the related bias when computing them from data collected on a reasonable number of participants.

Up to now, we have assumed that the agent self-evaluates without self-enhancement, because in Eq 1, the sensitivities to the positive feedback a_t + δ and to the negative feedback a_t − δ are the same: h(a_t). We now extend the model to the case where these functions are different.

Positive bias from decreasing sensitivity with self-enhancement or self-derogation

In the framework of this model, self-enhancement or self-derogation take place when the sensitivity h_p(a_t) to positive and h_n(a_t) to negative feedbacks are different:

\begin{matrix} a_{t + 1} - a_{t} = h_{p} (a_{t}) δ, \text{ if } f_{t} = a_{t} + δ, \end{matrix}

(16)

\begin{matrix} a_{t + 1} - a_{t} = - h_{n} (a_{t}) δ, \text{ if } f_{t} = a_{t} - δ . \end{matrix}

(17)

In the following, for the sake of simplicity, we use self-enhancement in a general sense which includes self-derogation, considered as a negative self-enhancement. When induced by feedbacks of intensity ±δ, the bias of self-enhancement E(a) at a given self-evaluation a can be expressed as the difference between the reaction to the positive feedback f_p = a + δ and the reaction to the negative feedback f_n = a − δ:

\begin{matrix} E (a) = (h_{p} (a) - h_{n} (a)) δ . \end{matrix}

(18)

Now, assume that the agent’s self-evaluation is a₁ and that the agent receives a positive and then a negative feedback. Repeating the previous calculations, we get:

\begin{matrix} a_{2} = a_{1} + h_{p} (a_{1}) δ, \end{matrix}

(19)

\begin{matrix} a_{3} = a_{2} - h_{n} (a_{2}) δ . \end{matrix}

(20)

The total bias B(a₁) from these successive feedbacks is:

\begin{matrix} B (a_{1}) = a_{3} - a_{1} \end{matrix}

(21)

\begin{matrix} = (h_{p} (a_{1}) - h_{n} (a_{1})) δ - h_{n}^{'} (a_{1}) h_{p} (a_{1}) δ^{2} . \end{matrix}

(22)

We recognise the self-enhancement bias (Eq 18) in the first term and the bias from decreasing sensitivity (Eq 15) in the second term. For this sequence of feedbacks, the bias from decreasing sensitivity is thus:

\begin{matrix} S (a_{1}) = - h_{n}^{'} (a_{1}) h_{p} (a_{1}) δ^{2} . \end{matrix}

(23)

This value is positive when $h_{n}^{'} (a_{1})$ is negative and we have:

\begin{matrix} B (a_{1}) = E (a_{1}) + S (a_{1}) . \end{matrix}

(24)
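The decomposition of Eq 24 can be illustrated with a short sketch. The linear functions `h_p` and `h_n` and their coefficients are hypothetical; with linear sensitivities the first-order expansion is exact, so the simulated total bias matches E + S up to rounding.

```python
DELTA = 13.0          # feedback intensity
H_N_SLOPE = -0.002    # derivative of the hypothetical h_n below

def h_p(a):
    """Hypothetical sensitivity to positive feedbacks."""
    return 0.50 - 0.003 * a

def h_n(a):
    """Hypothetical sensitivity to negative feedbacks."""
    return 0.45 + H_N_SLOPE * a

def total_bias(a1):
    """B(a_1): simulate a positive then a negative feedback (Eqs 19 and 20)."""
    a2 = a1 + h_p(a1) * DELTA
    a3 = a2 - h_n(a2) * DELTA
    return a3 - a1

def enhancement_bias(a1):
    """E(a_1), Eq 18."""
    return (h_p(a1) - h_n(a1)) * DELTA

def sensitivity_bias(a1):
    """S(a_1), Eq 23."""
    return -H_N_SLOPE * h_p(a1) * DELTA ** 2

a1 = 60.0
print(total_bias(a1), enhancement_bias(a1) + sensitivity_bias(a1))
```

In this example the agent self-derogates slightly (E is negative) while the bias from decreasing sensitivity remains positive, so the two contributions partly cancel.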

Moreover, if we have a series of 2 positive and 2 negative feedbacks in a random order (as will be the case in the experiment), the average bias from decreasing sensitivity is:

\begin{matrix} S (a) = \frac{1}{4} (- h_{n}^{'} (a) h_{p} (a) - h_{p}^{'} (a) h_{n} (a) - h_{p}^{'} (a) h_{p} (a) - h_{n}^{'} (a) h_{n} (a)) δ^{2}, \end{matrix}

(25)

\begin{matrix} S (a) = - h_{m}^{'} (a) h_{m} (a) δ^{2}, \end{matrix}

(26)

where h_m is the average of h_p and h_n: $h_{m} (a) = \frac{1}{2} (h_{p} (a) + h_{n} (a))$ . In the following experiments, we approximate functions h_n and h_p with linear regressions from data collected on several participants. Then we evaluate the biases from self-enhancement and decreasing sensitivity using the above formulas.
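The estimation procedure described above can be sketched as follows. This is not the authors' R code: the observations are invented for illustration, and `fit_line` and `biases` are hypothetical helper names. The sketch fits one line per feedback sign by ordinary least squares, then applies Eq 18 and Eq 26.

```python
DELTA = 13.0  # feedback intensity

def fit_line(xs, ys):
    """Ordinary least squares fit y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Invented observations: self-evaluations and observed sensitivities.
pos_a, pos_h = [20, 40, 60, 80], [0.50, 0.42, 0.38, 0.30]  # positive feedbacks
neg_a, neg_h = [20, 40, 60, 80], [0.46, 0.44, 0.36, 0.34]  # negative feedbacks

p_intercept, p_slope = fit_line(pos_a, pos_h)  # linear approximation of h_p
n_intercept, n_slope = fit_line(neg_a, neg_h)  # linear approximation of h_n

def biases(a):
    """Return (E(a), S(a)) from the fitted lines, using Eq 18 and Eq 26."""
    hp = p_intercept + p_slope * a
    hn = n_intercept + n_slope * a
    e = (hp - hn) * DELTA
    hm, hm_slope = (hp + hn) / 2, (p_slope + n_slope) / 2
    s = -hm_slope * hm * DELTA ** 2
    return e, s

e70, s70 = biases(70.0)
print(p_slope, n_slope, e70, s70)
```

Both fitted slopes are negative here, so the bias from decreasing sensitivity S is positive, while the sign of E depends on which fitted line is higher at the chosen self-evaluation.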

Experiment

The experiment design has been approved by the ethics committee of Clermont Auvergne Université (reference number IRB00011540–2020-39). The participants live in France and were recruited online by a specialised company which verifies that they are not bots by checking an ID number. The participants are explicitly requested to give their consent by ticking a specific button that enables them to start the questionnaire.

The participants receive a series of 4 feedbacks, two positive, two negative, of the same intensity in absolute value, starting from different self-evaluations. From these data, we compute linear approximations of the sensitivities to positive and negative feedbacks (functions h_p and h_n in the model) and then the different biases. We can thus reach our primary objective, which is to check the existence of a bias from decreasing sensitivity combined with the bias from enhancement, and our secondary objective, which is to observe how the biases are modulated by other variables (scale, gender, self-esteem).

Overview of the experiment

The experiment is schematically presented in Fig 1. The type of task and the evaluation by comparison with the performances of a large group, using a scale between 0 and 100, are similar to the ones used in [18]. The participants answer an online questionnaire which includes the following steps:

The participants are requested to assess the size of the coloured surface in 3 different 2D images (see an example image at the top left of Fig 1).
The participants are told that the experimenters can compute exactly their error of surface assessment on these three images and can do the same for a large number of other people who already performed the task. Moreover, the participants are told that the experimenters gathered at random 6 different groups (G₀ to G₅) of errors from 100 people and that the error of the participant will be compared to these groups. This comparison provides an evaluation, between 1 and 100, of the participant with respect to the group. We tested two evaluation scales, rank and score, which are described below.
We assume that, initially, the participants have no idea of their self-evaluation at the task. Therefore, the participants are given the initial evaluation f₀ of their error with respect to G₀, the errors of the first group of randomly chosen 100 persons. We call f₀ the anchor because it is the initial reference evaluation for the participant. This anchor is actually defined by the experimenters in a way that is described below.
Given this anchor as first evaluation, we assume that the participants have precise information about their performance at the task, enabling them to self-evaluate. For this purpose, the participants are asked to express their expected evaluation in the second group of 100 people’s errors (G₁). We interpret this expected evaluation a₁ as the first self-evaluation of the participant.
The feedback f₁ is presented as the evaluation of the participant in group G₁. It is actually defined automatically as:
$\begin{matrix} f_{1} = a_{1} \pm δ \pm ϵ, \end{matrix}$ (27)
where δ = 13 and ϵ = 1. The choice of δ is constrained. It should not be too small, because this would make the bias difficult to detect, and not too high, because then the variations would be difficult for the participants to believe. The addition of the small variation ±ϵ aims to avoid producing overly regular series of feedbacks that could undermine the confidence of the participant in the reality of the feedback. We assume that the participants judge this feedback in comparison with their self-evaluation, as in the model.
The participants are asked their expected evaluation a₂ in group G₂. They are requested to express this self-evaluation between their previous expectation a₁ and the feedback f₁ that they just received. Doing this, we impose that the sensitivity to the feedback is between 0 and 1 for each observation. The literature (e.g. [31, 32]) and several pilot experiments that we made (not reported here) suggest that this assumption holds in a large majority of cases. Indeed, participants often put their self-evaluation outside the interval when they do not understand the requests well or do not pay attention. Therefore, the constraint on the self-evaluation is primarily a means to limit the noise in the results. Moreover, the participants are free to choose any value within the bounds. The possible variation of this choice when the self-evaluation varies is thus constrained only in its limits, not in its direction. The main subject of our investigation is precisely the direction of this variation, which we expect to be decreasing. This direction of variation is not constrained by the experimental setting.
The same process is repeated three more times, with feedbacks f₂, f₃ and f₄ that are presented as the evaluation of the participant in groups G₂, G₃ and G₄, and requesting the participant’s expected evaluations a₃, a₄ and a₅ in groups G₃, G₄ and G₅ (interpreted as successive self-evaluations). Actually, each time, the feedbacks are computed as:
$\begin{matrix} f_{t} = a_{t} \pm δ \pm ϵ, \end{matrix}$ (28)
where a_t is the expected evaluation of the participant in group G_t given the last feedback f_{t−1}, which is (allegedly) their evaluation in group G_{t−1}.
Finally, the participants are asked if they believed that the feedbacks were really the evaluation of their error with respect to the errors from real groups of 100 persons or if they believed that these feedbacks were manipulated by the experimenters. The participants are requested to rate their belief from 0 (the feedbacks are fake) to 10 (the feedbacks are real). In the following, we call this answer “trust in feedback” or sometimes simply “trust” of the participant.

The sequence of positive and negative feedbacks is chosen at random among the six possible sequences that contain two positive and two negative feedbacks (see Table 1). However, in some cases, when the self-evaluation a_t is close to the limit 1 or 100, the chosen feedback would leave the [1, 100] interval. In these cases, the feedback is truncated in order to remain in [1, 100]. This might lead to some sequences where the positive and negative feedbacks are not balanced. We removed these sequences from the treated results.

Table 1. The six series of the 4 feedbacks f₁, f₂, f₃ and f₄ (two positive, two negative).

f₁	f₂	f₃	f₄
+	+	-	-
+	-	+	-
+	-	-	+
-	+	+	-
-	+	-	+
-	-	+	+
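The feedback schedule described above can be sketched in Python. The values δ = 13, ϵ = 1 and the interval [1, 100] come from the text; the function names, the random-number handling and the `is_balanced` criterion are assumptions made for this example and may differ from the actual questionnaire software.

```python
import random
from itertools import permutations

DELTA, EPSILON = 13, 1   # values stated in the text
LOW, HIGH = 1, 100       # interval to which feedbacks are truncated

# The six orderings of two "+" and two "-". Sorting reproduces the row order
# of Table 1 because "+" sorts before "-" in ASCII.
SEQUENCES = sorted(set(permutations("++--")))

def make_feedback(a_t, sign, rng):
    """Return f_t = a_t +/- (DELTA +/- EPSILON), truncated to [LOW, HIGH]."""
    magnitude = DELTA + rng.choice((-EPSILON, EPSILON))
    raw = a_t + magnitude if sign == "+" else a_t - magnitude
    return min(HIGH, max(LOW, raw))

def is_balanced(a_t, f_t):
    """Hypothetical check: False when truncation shrank the intensity."""
    return abs(f_t - a_t) >= DELTA - EPSILON

rng = random.Random(0)
sequence = rng.choice(SEQUENCES)
print(sequence, make_feedback(50, sequence[0], rng))
```

A self-evaluation of 95 followed by a positive feedback illustrates the truncation case: the feedback is capped at 100, its intensity drops to 5, and under the assumed criterion the sequence would be discarded.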

Finally, the experiment also includes a questionnaire evaluating the self-esteem of the participants using Rosenberg’s scale [33].

Experimental design

The experimental design includes the following conditions:

Low anchor (randomly chosen in [15, 40]) vs high anchor (randomly chosen in [60, 85]);
Six possible series of feedbacks (shown in Table 1);
Evaluation by rank vs evaluation by score:
- The rank is the number of persons (within the considered group of 100 persons) who perform better than the participant, plus one. 1 is the best rank, 101 is the worst;
- The score is the number of persons in the group who perform worse than the participant. 100 is the best score, 0 is the worst. This is the scale used in [18].

In total there are 24 different conditions: 2 (anchor) x 6 (feedback sequences) x 2 (scale). All the conditions have the same probability, except the high anchor, which has a higher probability than the low anchor ( $\frac{2}{3}$ vs $\frac{1}{3}$ ). Indeed, a pilot experiment suggested that sensitivity to feedbacks decreases only when the anchor is high. Therefore it appeared important to collect more data in these conditions.
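The condition probabilities implied by this design can be enumerated with exact fractions. The sketch assumes that the three factors are drawn independently and that the feedback sequence and the scale are uniform, which is how we read the description; the variable names are ours.

```python
from fractions import Fraction
from itertools import product

anchor_prob = {"low": Fraction(1, 3), "high": Fraction(2, 3)}
sequences = ["++--", "+-+-", "+--+", "-++-", "-+-+", "--++"]
scales = ["rank", "score"]

# Probability of each (anchor, sequence, scale) condition under independence.
conditions = {
    (anchor, seq, scale): p * Fraction(1, len(sequences)) * Fraction(1, len(scales))
    for (anchor, p), seq, scale in product(anchor_prob.items(), sequences, scales)
}

print(len(conditions))                    # number of conditions
print(conditions[("high", "++--", "rank")], conditions[("low", "++--", "rank")])
```

Under these assumptions each high-anchor condition is twice as likely as the corresponding low-anchor condition, and the probabilities sum to 1.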

In [18] the evaluation is made by score only. However, it is important in our experiment to check that a possible decreasing sensitivity is also detected when using an evaluation by rank.

Choice of a very specific task

The choice of the very specific task of assessing a surface within 2D images requires justification. Indeed, our main assumption is that people tend to be less sensitive to the feedbacks when their self-evaluation is high because then they tend to be more self-confident and less prone to be influenced by others. If this assumption holds, at first glance, the experiment is more likely to succeed if the feedbacks and self-evaluations are about a general ability than about a very specific task. Indeed, a high self-evaluation at a very specific task seems less likely to affect general self-confidence.

However, we can also assume that a high self-evaluation at a very specific task has an influence on the self-confidence related to this task only. This specific increase of self-confidence would decrease the sensitivity to feedbacks about this task only. Here, we assume that the self-confidence depends, at least partly, on the context. This seems a reasonable assumption: most people tend to be self-confident in their field of expertise and more insecure in unknown situations.

Moreover, this assumption shows strong practical advantages. Firstly, testing it avoids serious ethical and practical difficulties in manipulating the self-evaluation about general abilities. Secondly, if the task is very specific and unknown to the participants, they initially have no idea about their evaluation at this task and they can easily believe any feedback given during the experiment.

For these reasons, we finally considered that designing the experiment around a very specific task is preferable. The task is to assess the size of a coloured surface in 3 different 2D images. An example image is shown in Fig 1, which schematises the experiment.

Result treatment

We report how we determined our sample size, all data exclusions, all manipulations, and all measures in the study. The R code of all treatments and the data are available at https://github.com/guillaumeDeffuant/sensitivityBias.

The experiment yields a set of triples including self-evaluation at t, feedback at t, self-evaluation at t + 1, denoted by $(a_{t}^{i}, f_{t}^{i}, a_{t + 1}^{i})$ , the superscript i designating the participant and t ∈ {1, 2, 3, 4} the index of the successive feedbacks and self-evaluations for this participant (t is called the time step in the following). In the experiment, $f_{t}^{i} - a_{t}^{i} = \pm δ \pm ϵ$ . As ϵ is small, in the following text, to simplify the notation, we define $δ_{t}^{i} = δ \pm ϵ$ , hence $f_{t}^{i} - a_{t}^{i} = \pm δ_{t}^{i}$ . We removed participants whose series of self-evaluations got too close to 0 or to 100 because we could not then apply the planned feedbacks. The S10 and S11 Tables also show the results obtained when removing the participants who filled in the questionnaire in less than 3 minutes. Indeed, 3 minutes seems a bare minimum for answering the questions carefully. However, this filter on the participants does not change the main results.

Importantly, in order to simplify the presentation of the results, we only use the increasing scale of evaluation (score). Hence the first treatment is to transform any rank r whether allegedly computed by performance comparison in a group (for feedbacks) or expected by the participant (for self-evaluations) into 100 − r.
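As a minimal illustration of this first treatment, the following Python sketch maps a triple expressed by rank onto the increasing score scale. The function names and the tuple layout (a_t, f_t, a_{t+1}) are hypothetical conventions chosen for this example.

```python
def to_score(value, scale):
    """Return the evaluation on the increasing score scale (rank r becomes 100 - r)."""
    if scale == "rank":
        return 100 - value
    if scale == "score":
        return value
    raise ValueError(f"unknown scale: {scale!r}")


def triple_to_score(triple, scale):
    """Apply the transformation to a triple (a_t, f_t, a_{t+1})."""
    a_t, f_t, a_next = triple
    return (to_score(a_t, scale), to_score(f_t, scale), to_score(a_next, scale))


# A rank feedback of 20 given to a participant expecting rank 30 is a positive
# feedback: after the transformation the feedback (80) is above the
# self-evaluation (70).
print(triple_to_score((30, 20, 26), "rank"))
```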

Linear approximations of the sensitivity to feedback functions

Our first aim is to check the hypothesis that sensitivities h(a_t) to all feedbacks, h_p(a_t) to positive feedbacks and h_n(a_t) to negative feedbacks are decreasing.

The ideal approach would be to derive from the data of each participant i, approximations of the sensitivities hⁱ(a_t) to all feedbacks, $h_{p}^{i} (a_{t})$ to positive feedbacks and $h_{n}^{i} (a_{t})$ to negative feedbacks of this participant. However, there are only four triples $(a_{t}^{i}, f_{t}^{i}, a_{t + 1}^{i})$ available for each participant and this is not enough to get a reliable approximation.

Instead of computing the sensitivities of a single participant, we derive approximations of the sensitivities from samples of triples $(a_{t}^{i}, f_{t}^{i}, a_{t + 1}^{i})$ mixing several participants and several time steps. The larger size of the sample provides higher chances to get significant results. We assume that, in this case, we obtain an approximation of the average sensitivities to feedbacks of the participants in the set.

Hence, considering a sample A of triples $(a_{t}^{i}, f_{t}^{i}, a_{t + 1}^{i})$ mixing participants and time steps, we compute the linear regressions taking self-evaluation change $| \frac{a_{t + 1}^{i} - a_{t}^{i}}{δ_{t}^{i}} |$ as outcome variable and self-evaluation $a_{t}^{i}$ as predictor variable and this provides linear approximations of the sensitivities h, h_p and h_n defined in Eqs 1, 19 and 20. More precisely, from a given sample of triples $(a_{t}^{i}, f_{t}^{i}, a_{t + 1}^{i})$ we derive the following linear models:

For all feedbacks $f_{t}^{i} = a_{t}^{i} \pm δ_{t}^{i}$ , the linear model:
$\begin{matrix} \frac{| a_{t + 1}^{i} - a_{t}^{i} |}{δ_{t}^{i}} \approx c \frac{a_{t}^{i}}{100} + b \approx h (a_{t}^{i}), \end{matrix}$ (29)
approximates the sensitivity to feedbacks h;
For positive feedbacks $f_{t}^{i} = a_{t}^{i} + δ_{t}^{i}$ , the linear model:
$\begin{matrix} \frac{| a_{t + 1}^{i} - a_{t}^{i} |}{δ_{t}^{i}} \approx c_{p} \frac{a_{t}^{i}}{100} + b_{p} \approx h_{p} (a_{t}^{i}), \end{matrix}$ (30)
approximates the sensitivity to positive feedbacks h_p;
For negative feedbacks $f_{t}^{i} = a_{t}^{i} - δ_{t}^{i}$ , the linear model:
$\begin{matrix} \frac{| a_{t + 1}^{i} - a_{t}^{i} |}{δ_{t}^{i}} \approx c_{n} \frac{a_{t}^{i}}{100} + b_{n} \approx h_{n} (a_{t}^{i}), \end{matrix}$ (31)
approximates the sensitivity to negative feedbacks h_n.

In the following, these linear models are respectively called the sensitivity to feedbacks, the sensitivity to positive feedbacks and the sensitivity to negative feedbacks in sample A. The sign of slopes c, c_p and c_n indicates if these functions are increasing or decreasing.
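The three regressions of Eqs 29 to 31 can be sketched in Python as an ordinary least-squares fit of the normalised self-evaluation change on the self-evaluation. This is only an illustration under the assumption that triples are stored as tuples (a_t, f_t, a_{t+1}) on the score scale; the function names are hypothetical and the authors' actual treatment is the R code cited above.

```python
def fit_sensitivity(triples):
    """Least-squares fit of |a_{t+1} - a_t| / delta_t on a_t / 100.

    Returns (slope, intercept), i.e. (c, b) in the notation of Eq 29.
    """
    xs = [a / 100 for a, f, a_next in triples]
    ys = [abs(a_next - a) / abs(f - a) for a, f, a_next in triples]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all self-evaluations are identical; slope undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def split_by_sign(triples):
    """Separate triples with positive feedback (f > a) from negative ones (f < a)."""
    positive = [t for t in triples if t[1] > t[0]]
    negative = [t for t in triples if t[1] < t[0]]
    return positive, negative


# Synthetic triples generated from h(a) = 0.5 - 0.2 * a / 100 with delta = 10.
synthetic = []
for a in (20, 40, 60, 80):
    h = 0.5 - 0.2 * a / 100
    synthetic.append((a, a + 10, a + h * 10))   # positive feedback
    synthetic.append((a, a - 10, a - h * 10))   # negative feedback

positive, negative = split_by_sign(synthetic)
c, b = fit_sensitivity(synthetic)       # all feedbacks (Eq 29)
c_p, b_p = fit_sensitivity(positive)    # positive feedbacks (Eq 30)
c_n, b_n = fit_sensitivity(negative)    # negative feedbacks (Eq 31)
print(c, b)
```

On this synthetic sample the fitted slope and intercept recover the generating values, and a negative slope indicates a decreasing sensitivity.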

These linear regressions are computed in various subsets of the whole set of triples, mixing more or fewer participants and time steps.

When the sets include 3 or 4 time steps, we also computed linear mixed effect models, using the R package lme4 [34] for the approximation of the sensitivity to feedbacks (positive and negative together), because the 3 or 4 self-evaluations of a single participant are not independent [35]. In the other cases (sets including fewer than 3 time steps, or including only positive or only negative feedbacks), the linear mixed effect model is not applicable and we use only standard linear regressions (see more details in the S1 Appendix).

Total bias B in a sample

The measure of the total bias B(A) is performed on a sample A such that the triples $(a_{t}^{i}, f_{t}^{i}, a_{t + 1}^{i})$ , for the four time steps t ∈ {1, 2, 3, 4}, are included in A for each participant i. Formally, the total bias B(A) is:

$\begin{matrix} B (A) = \frac{1}{p_{A}} \sum_{i} \frac{a_{5}^{i} - a_{1}^{i}}{\frac{1}{2} \sum_{t \in \{1, \dots, 4\}} δ_{t}^{i}}, \end{matrix}$ (32)

where p_A is the number of participants in sample A.

Indeed, B(A) is the average difference between the last self-evaluation (a₅) of the series of two positive and two negative feedbacks and the first one (a₁), as a proportion of the average feedback intensity in the series of four triples. Hence, we divide $a_{5}^{i} - a_{1}^{i}$ by twice the average of feedback intensity $δ_{t}^{i}$ for t = 1, …, 4. Moreover, B(A) is the sum of the self-enhancement bias and the bias from decreasing sensitivity as shown by Eq 24.
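A direct transcription of Eq 32 in Python may help to read the normalisation. The sketch assumes a hypothetical data layout in which each participant is mapped to the list of their four triples (a_t, f_t, a_{t+1}), ordered by time step; it is not the authors' implementation.

```python
def total_bias(series_by_participant):
    """Total bias B(A) of Eq 32.

    series_by_participant maps a participant id to the list of their four
    triples (a_t, f_t, a_{t+1}), ordered by time step.
    """
    ratios = []
    for series in series_by_participant.values():
        if len(series) != 4:
            raise ValueError("each participant needs the four time steps")
        a_1 = series[0][0]     # first self-evaluation
        a_5 = series[-1][2]    # last self-evaluation
        # Half the sum of the four feedback intensities, i.e. twice their average.
        half_sum_delta = 0.5 * sum(abs(f - a) for a, f, _ in series)
        ratios.append((a_5 - a_1) / half_sum_delta)
    return sum(ratios) / len(ratios)


example = {
    "p1": [(50, 60, 55), (55, 65, 59), (59, 49, 54), (54, 44, 50)],  # a_5 - a_1 = 0
    "p2": [(50, 60, 56), (56, 66, 60), (60, 50, 56), (56, 46, 52)],  # a_5 - a_1 = 2
}
print(total_bias(example))
```

In this toy example the second participant ends 2 points above the starting self-evaluation with feedback intensities of 10, which contributes 2 / 20 = 0.1; the first contributes 0, so B is about 0.05.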

Average self-enhancement bias E in a sample

The evaluation of the self-enhancement bias in a sample of triples A is based on the sensitivity to positive and to negative feedbacks in this set. It is the average of the self-enhancement for $a_{t}^{i}$ as defined by Eq 18, i.e. the difference between the reaction to a positive and to a negative feedback at $a_{t}^{i}$ . Let (c_p, b_p) and (c_n, b_n) be the slope and intercept of respectively the approximate sensitivity to positive and negative feedbacks, as defined by Eqs 30 and 31. Applying Eq 18 with these approximate functions and averaging on sample A yields:

$\begin{matrix} E (A) = \frac{1}{n_{A}} \sum_{i, t} (c_{p} - c_{n}) \frac{a_{t}^{i}}{100} + b_{p} - b_{n}, \end{matrix}$ (33)

where n_A is the number of triples in sample A.

The self-enhancement bias E can be seen as the average change of self-evaluation after a sequence of two opposite feedbacks, without taking into account the variation of sensitivity to the feedback. This change is a proportion of the feedback intensity (here since $δ_{t}^{i} = | f_{t} - a_{t} | = δ \pm ϵ$ , it is roughly a proportion of δ).

The self-enhancement bias is negative when the participants are, on average, more sensitive to the negative than to the positive feedbacks.
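Eq 33 can be written as a few lines of Python. The sketch below assumes that the slopes and intercepts (c_p, b_p) and (c_n, b_n) have already been fitted, and that triples are tuples (a_t, f_t, a_{t+1}); the function name and the numeric values are purely illustrative.

```python
def self_enhancement_bias(triples, c_p, b_p, c_n, b_n):
    """Average self-enhancement bias E(A) of Eq 33.

    For each triple, the approximate sensitivity to negative feedbacks is
    subtracted from the approximate sensitivity to positive feedbacks at a_t.
    """
    differences = [
        (c_p - c_n) * a / 100 + b_p - b_n
        for a, _f, _a_next in triples
    ]
    return sum(differences) / len(differences)


# Illustrative values only: reactions to negative feedbacks are stronger at
# these self-evaluations, so the bias is negative (self-derogation).
triples = [(40, 50, 44), (60, 50, 55)]
print(self_enhancement_bias(triples, c_p=-0.1, b_p=0.5, c_n=-0.2, b_n=0.6))
```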

Theoretical bias from sensitivity to feedbacks S′

This measure is the theoretical average change of self-evaluation due to the decreasing sensitivity to feedbacks. Following formula 26, this measure is:

$\begin{matrix} S^{'} (A) = \frac{1}{n_{A}} \sum_{i, t} - c_{m} (c_{m} \frac{a_{t}^{i}}{100} + b_{m}) δ_{t}^{i}, \end{matrix}$ (34)

where $c_{m} = \frac{c_{p} + c_{n}}{2}$ and $b_{m} = \frac{b_{p} + b_{n}}{2}$ . Note that we divided Eq 26 by $δ_{t}^{i}$ , so that this measure is expressed as a percentage of $δ_{t}^{i}$ like self-enhancement and total biases. The difference between total and self-enhancement biases is the bias from sensitivity and it should be close to the theoretical value:

$\begin{matrix} S (A) = B (A) - E (A) \approx S^{'} (A) . \end{matrix}$ (35)

Fig 2 illustrates the computation of the biases from self-enhancement and sensitivity. Note that both the self-enhancement bias and the theoretical bias from sensitivity can be computed as soon as the sensitivities to positive and to negative feedbacks are available. In particular, they can be computed on sets including incomplete series of triples for each participant (i.e. including fewer than the four time steps).
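Eqs 34 and 35 can be sketched in the same way. The code below transcribes Eq 34 literally from fitted coefficients; the scale on which the feedback intensity is expressed is left to the caller, since it is not fixed in this section, and the function names are hypothetical.

```python
def theoretical_sensitivity_bias(triples, c_p, b_p, c_n, b_n):
    """Theoretical bias from sensitivity S'(A), literal transcription of Eq 34."""
    c_m = (c_p + c_n) / 2   # mean slope of the two sensitivities
    b_m = (b_p + b_n) / 2   # mean intercept of the two sensitivities
    terms = [
        -c_m * (c_m * a / 100 + b_m) * abs(f - a)
        for a, f, _a_next in triples
    ]
    return sum(terms) / len(terms)


def measured_sensitivity_bias(total_bias, self_enhancement_bias):
    """Bias from sensitivity S(A) = B(A) - E(A) of Eq 35."""
    return total_bias - self_enhancement_bias


# Illustrative values only: c_m = -0.2 and b_m = 0.6, so the mean sensitivity
# at a = 50 is 0.5 and the term is 0.2 * 0.5 * 10.
print(theoretical_sensitivity_bias([(50, 60, 55)], c_p=-0.1, b_p=0.5, c_n=-0.3, b_n=0.7))
```

A negative mean slope c_m with a positive mean sensitivity gives a positive S′, in line with the positive bias expected from a decreasing sensitivity.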

Bootstrap

Bootstrap is used to evaluate the variability of a quantity Q that is derived from a sample A [36]. Its principle is to generate a large number of random samples (with replacement) A^k of A, each of them providing an evaluation Q^k of the quantity. Statistics on the values Q^k, for instance quantiles or standard deviation, provide an evaluation of the variability of Q due to sampling. We use bootstrap in order to evaluate the variability of the different measures (enhancement bias, sensitivity bias, theoretical sensitivity bias).

Moreover, when the considered sample A includes the triples $(a_{t}^{i}, f_{t}^{i}, a_{t + 1}^{i})$ for the four time steps t ∈ {1, 2, 3, 4} of each participant i in A, bootstrap helps to evaluate the robustness of the difference between the sensitivity bias (measured as the difference between the total bias and the enhancement bias) and the theoretical sensitivity bias (measured as the average of the sensitivity bias on the sample). If we constitute the samples A^k as usual by randomly choosing triples $(a_{t}^{i}, f_{t}^{i}, a_{t + 1}^{i})$ , we generally do not keep the sequences of the four time steps t ∈ {1, 2, 3, 4} complete, so there is no guarantee that the total bias, which requires $a_{5}^{i} - a_{1}^{i}$ , can be computed. Therefore, in this case, instead of deriving the sample A^k by drawing triples $(a_{t}^{i}, f_{t}^{i}, a_{t + 1}^{i})$ in A at random, we draw the participants at random (with replacement) and, for each drawn participant i, we add the triples $(a_{t}^{i}, f_{t}^{i}, a_{t + 1}^{i})$ for the four time steps t ∈ {1, 2, 3, 4} into the new sample A^k.
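The participant-level resampling described above can be sketched as follows in Python. The data layout (a mapping from participant id to their complete series of triples), the function names and the fixed random seed are assumptions made for this illustration; the statistic passed to the function can be any measure computed on a list of complete series.

```python
import random


def bootstrap_by_participant(series_by_participant, statistic, n_samples=200, seed=0):
    """Bootstrap mean and standard deviation of a statistic, resampling participants.

    Participants are drawn with replacement and each drawn participant brings
    their complete series of triples, so a_1 and a_5 stay available.
    """
    rng = random.Random(seed)
    ids = list(series_by_participant)
    values = []
    for _ in range(n_samples):
        drawn = rng.choices(ids, k=len(ids))               # with replacement
        sample = [series_by_participant[i] for i in drawn]
        values.append(statistic(sample))
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return mean, variance ** 0.5


def mean_total_change(sample):
    """Example statistic: average of a_5 - a_1 over the drawn participants."""
    return sum(series[-1][2] - series[0][0] for series in sample) / len(sample)


data = {
    "p1": [(50, 60, 55), (55, 45, 50), (50, 60, 54), (54, 44, 51)],
    "p2": [(40, 50, 46), (46, 36, 41), (41, 51, 45), (45, 35, 40)],
    "p3": [(70, 80, 73), (73, 63, 68), (68, 78, 71), (71, 61, 67)],
}
print(bootstrap_by_participant(data, mean_total_change))
```

Drawing whole participants rather than single triples is what keeps every bootstrap sample usable for the total bias.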

In the following we compute means and standard deviation on 200 bootstrap samples. We checked on several sets that, with this number of samples, the mean and standard deviation are at most around 20% different from the values obtained with 1000 samples. We considered that this level of precision is sufficient.

The bootstrap allows us to compute effect size when comparing the results of a measure on two sets A₁ and A₂. The effect size is indeed defined as:

$\begin{matrix} s = \frac{| m_{1} - m_{2} |}{σ_{1}}, \end{matrix}$ (36)

where m₁ and m₂ are the bootstrap averages of the considered measure computed respectively on A₁ and A₂, and σ₁ is the bootstrap standard deviation of the measure for set A₁. An effect size of 0.2 is small, of 0.5 medium, of 0.8 large and of 1.3 very large [37].
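Eq 36 and the thresholds quoted above can be illustrated with a short Python sketch. The use of the sample standard deviation, the binning of values between thresholds and the label "negligible" below 0.2 are assumptions of this example, not statements from the text.

```python
from statistics import mean, stdev


def effect_size(bootstrap_values_1, bootstrap_values_2):
    """Effect size of Eq 36: |m1 - m2| divided by the standard deviation on set A1."""
    m_1 = mean(bootstrap_values_1)
    m_2 = mean(bootstrap_values_2)
    sigma_1 = stdev(bootstrap_values_1)   # sample standard deviation (assumption)
    return abs(m_1 - m_2) / sigma_1


def effect_size_label(s):
    """Qualitative label using the thresholds 0.2, 0.5, 0.8 and 1.3."""
    if s >= 1.3:
        return "very large"
    if s >= 0.8:
        return "large"
    if s >= 0.5:
        return "medium"
    if s >= 0.2:
        return "small"
    return "negligible"


s = effect_size([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(s, effect_size_label(s))
```

Note that the measure is not symmetric: swapping the two sets changes the standard deviation used in the denominator.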

Power analysis

The power analysis concerns the linear regressions that approximate the sensitivity to feedbacks in different data sets. Using the G*Power software for linear regressions, with a one-tailed test, seeking a small effect (0.1 recommended by the software) and a power (1 − β) of 0.95, we get a recommended sample size of 1073.

We need to perform linear regressions for both positive and negative feedbacks, and at least for each scale considered independently. This leads to a recommended sample size of 4*1073 = 4292.

We decided to collect a sample from 1500 participants, generating, with 4 triples each, a sample of size 4*1500 = 6000. The surplus of around 30% accounts for the unavoidable unreliable data (participants not believing in the feedbacks or with self-evaluations that make the feedbacks leave the interval [1, 100]).
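The sample-size arithmetic can be reproduced in a few lines of Python. The sketch reads the surplus as the share of collected triples that exceeds the recommended size, which is one possible interpretation of the figure given above; the variable names are illustrative.

```python
per_regression = 1073            # recommended size for one regression (G*Power)
n_regressions = 2 * 2            # feedback sign (positive, negative) x scale (score, rank)
required = n_regressions * per_regression

participants = 1500
triples_per_participant = 4
collected = participants * triples_per_participant

# Share of the collected triples that can be lost while still meeting the
# recommended size (assumed reading of the surplus).
surplus_share = (collected - required) / collected
print(required, collected, round(surplus_share, 3))
```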

Results

The experiment involves 1509 participants (803 females, 706 males, age between 17 and 79). We removed 141 participants because their series of self-evaluations got too close to 0 or 100 and we could not apply the planned feedback. In total, after these exclusions the data set includes 5472 triples $(a_{t}^{i}, f_{t}^{i}, a_{t + 1}^{i})$ for 1368 participants (729 women and 639 men, mean age: 36.8 years). The data of the experiment and all the results are available at https://github.com/guillaumeDeffuant/sensitivityBias.

Checking main hypotheses

The sensitivity to feedback decreases when self-evaluation increases

Our first hypothesis is that the sensitivity to feedbacks decreases when self-evaluation increases. This sensitivity is evaluated by the linear regression taking change of self-evaluation as outcome variable and self-evaluation as predictor variable (Eq 29), that we recall here for convenience:

$\begin{matrix} \frac{| a_{t + 1}^{i} - a_{t}^{i} |}{δ_{t}^{i}} \approx c \frac{a_{t}^{i}}{100} + b . \end{matrix}$ (37)

We also evaluate the slope c using a linear mixed effect model [34, 35], which takes the non-independence of the self-evaluations from the same participants into account (see S1 Appendix for details).

Table 2 shows the value of the slope of the sensitivity c derived from different data sets mixing several time steps ((1 : k) = {1, …, k}) and participants with different values of trust in the feedback (on the first line of the table all the possible values of trust are considered). For the first three time steps or all four time steps (t ∈ (1 : 3) or t ∈ (1 : 4)), the table shows the slope computed with the linear mixed effect model (c (lmer)). Note that the method does not provide a p-value associated with the slope. The table also shows the size N of each data set.

Table 2. Slope of sensitivity to feedback on several time steps.

The slope is computed for different intervals of trust. N is the number of triples $(a_{t}^{i}, f_{t}^{i}, a_{t + 1}^{i})$ in the considered data set. c is the slope computed with a standard linear regression and c(lmer) designates the slope computed with a linear mixed effect model (p-value not provided).

Trust	t ∈ (1 : 2)		t ∈ (1 : 3)			t ∈ (1 : 4)
Trust	N	c	N	c	c(lmer)	N	c	c(lmer)
[0, 10]	2736	−0.08^**	4104	−0.09^***	−0.08	5472	−0.07^***	−0.06
[0, 6]	1656	−0.07^*	2484	−0.07^*	−0.07	3312	−0.04 .	−0.04
[7, 10]	1080	−0.12^**	1620	−0.13^***	−0.12	2160	−0.11^***	−0.1
[8, 10]	834	−0.16^***	1251	−0.15^***	−0.14	1668	−0.13^***	−0.12
[9, 10]	562	−0.18^**	843	−0.18^***	−0.17	1124	−0.16^***	−0.14


*** : p < 0.001, ** : p < 0.01, * : p < 0.05, . : p < 0.1

In all cases, the slope c is negative, and its magnitude increases when trust in the feedback increases. It is significant at p < 0.05 in every set except for the participants reporting low trust over t ∈ (1 : 4), where it is only marginally significant (p < 0.1). The slopes computed with the linear mixed effect model are very close to the slopes obtained by standard linear regression (only slightly less negative). Therefore, the non-independence of the self-evaluations for each participant has a weak effect.

Overall these results confirm our main hypothesis that sensitivity to feedback decreases when self-evaluation increases. The confirmation is particularly clear in sets of participants reporting a high trust, but the hypothesis is also valid in those reporting a low trust. Fig 3 represents two examples of sensitivity to feedbacks computed on sets of participants with high trust.

Fig 3 — The values of the self-evaluation change axis are $\frac{| a_{t + 1}^{i} - a_{t}^{i} |}{δ_{t}^{i}}$ and those of the self-evaluation axis are $a_{t}^{i} / 100$ . The black line is the regression computed on the whole set (of slope c), the red line for positive feedbacks (of slope c_p) and the blue line for negative feedbacks (of slope c_n). The red points represent the couples (self-evaluation, self-evaluation change) for positive feedbacks, the blue points represent those couples for negative feedbacks. The orange surface represents positive self-enhancement while the light blue surface represents negative self-enhancement (self-derogation). E is the average self-enhancement bias and S′ is the estimation of the average bias from sensitivity to feedbacks, both expressed as a percentage of δ.

The values of c are slightly less negative for t ∈ (1 : 4), especially when participants report a low trust. This suggests computing the sensitivity to feedbacks on data from each time step separately, as shown in Table 3. This table suggests that the behaviour of the participants is stable on average over the first three time steps and changes noticeably at t = 4. Indeed, the slope of the sensitivity c is negative with a p-value below 0.1, except at the last time step (t = 4) and, for participants reporting low trust (T ≤ 6), at t > 1. The slope c is less significant than in Table 2, which could be expected as the data sets are smaller.

Table 3. Slope c(t) of sensitivity to feedback at each time step. The slope is computed on data from participants reporting different trust values (first column).

N is the number of triples $(a_{t}^{i}, f_{t}^{i}, a_{t + 1}^{i})$ in the considered data sets (it does not vary with t).

Trust	N	c(1)	c(2)	c(3)	c(4)
[0, 10]	1368	−0.09^*	−0.07 .	−0.09^*	−0.01
[0, 6]	828	−0.08 .	−0.06	−0.06	0.02
[7, 10]	540	−0.13^*	−0.12^*	−0.13^*	−0.07
[8, 10]	417	−0.14^*	−0.18^**	−0.13 .	−0.1
[9, 10]	281	−0.2^*	−0.16 .	−0.19^*	−0.09


** : p < 0.01, * : p < 0.05, . : p < 0.1

A possible explanation for the change of behaviour at t = 4 is a loss of attention after repeating the same process of evaluations many times.

The interested reader can find complementary results about the slopes c_p of the sensitivity to positive feedbacks and c_n of the sensitivity to negative feedbacks in S1 Table.

The decreasing sensitivity to feedback generates a measurable positive bias

Our second hypothesis is that the decreasing sensitivity to feedback generates a measurable positive bias in self-evaluation. As presented in the section devoted to the result treatments, on a data set A mixing the four time steps (t ∈ (1 : 4)), we have two ways of measuring this bias:

The difference between the total bias (a₅ − a₁) and the self-enhancement bias, denoted by S(A) (Eq 35). This measurement is relevant only for data sets that include the four time steps;
The average of the sensitivity bias measured from the sensitivities to positive and negative feedbacks, denoted by S′ (A) (Eq 34). The sensitivities to positive or negative feedbacks can be computed only with standard linear regressions because the sets include at most two time steps for which the feedback is of the same sign, which makes the linear mixed effect model non-applicable.

Fig 4 shows the average and standard deviation of these measures, computed on 200 bootstrap samples, for t ∈ (1 : 4) and for the different values of trust considered in the previous tables. The values of these biases are percentages of the feedback intensity δ; a value around 1% should therefore be interpreted as an average increase of the self-evaluation of 1% of δ at each time step.

These results suggest that a bias from sensitivity is detected with both measurements in all cases except for participants reporting low trust (interval [0, 6]). Indeed, except for this set of low trust, the standard deviation on the bootstrap is around one third of the mean. Therefore, the bias is not likely to be 0. In the set of participants reporting trust in the interval [0, 6], the standard deviation is close to the mean, therefore the value is not significant. Note that both S and S′ increase when trust increases.

Moreover, the small difference between S and S′ indicates that S′ is a reliable approximation of S on sets where S cannot be computed (because the data include values from only a part of the four time steps). This allows us to measure the bias from sensitivity to feedbacks on data for time steps in (1 : 2) and (1 : 3). This is important because we noticed previously that the data for t = 4 are probably of lower quality, thus the measures on sets excluding the last time step are likely to be more accurate.

Fig 5 shows the values of the measure S′ of the bias from sensitivity for time steps (1 : 2) and (1 : 3). We observe that in this case, even for participants reporting low trust, a significant bias is detected as the standard deviation is lower than half the mean. Moreover, as expected from the values of the slope c of the sensitivity to the feedbacks, the values of the bias are higher than for t ∈ (1 : 4). This can be explained again by the change of behaviour at t = 4.

Fig 3 illustrates the results obtained on sets of participants reporting a level of trust higher than 7 or higher than 9 and for t ∈ (1 : 3). In both cases, the self-enhancement is negative (light-blue surface) when the self-evaluation is low and positive (orange surface) when the self-evaluation is high.

If the data at t = 4 are considered as unreliable, Fig 5 provides the relevant measurements of the bias from sensitivity to feedbacks. In this case, the results show again that a bias from sensitivity is detected for all the tested levels of trust (even in the trust interval [0, 6]), and its value increases with trust. This confirms our second hypothesis.

Variations of biases with scale, gender and self-esteem

We focus on the participants reporting a high level of trust (T ∈ [7, 10]), as the data from these participants are the most meaningful. The results for low trust (T ∈ [0, 6]) are available in the Supplementary Materials. Figs 6 and 7 show the mean and standard deviation of the sensitivity and self-enhancement biases computed over 200 bootstrap samples from sets differentiating scale (rank or score), gender and self-esteem (SE in the graph). The values are shown for t ∈ (1 : 3), which we consider the most relevant case. The results for t ∈ (1 : 2) and for t ∈ (1 : 4) show broadly the same features (they are available in the Supplementary Materials). Moreover, the variations of the sensitivity bias with the anchor, which seem to us more peripheral, are available in the S2 and S3 Tables.

Fig 7 — The bars and the error bars are respectively the mean and standard deviation computed on 200 bootstrap samples.

Fig 6 shows that the bias from sensitivity varies around 1% of the feedback intensity. Except for the set of participants with low self-esteem, which shows a sensitivity bias close to 0, the bias for the other sets varies between 0.73% (men with evaluation by rank) and 1.34% of the feedback intensity (men with evaluation by score). For participants with high self-esteem and for women, the difference between evaluation by rank and by score is weak (effect size 0.12 and 0.07 respectively). The relative stability of the sensitivity bias is remarkable, as the closely associated self-enhancement bias shows much stronger variations, as shown in Fig 7:

The self-enhancement bias changes dramatically with the scale: it is significantly positive when participants self-evaluate by rank and significantly negative when they self-evaluate by score in all considered sets (effect size 7.86 between rank and score).
The self-enhancement bias is higher for men and for participants with high self-esteem than for women and for participants with low self-esteem. The difference is very significant when participants self-evaluate by score (effect size 2.84 between men and women and 3.91 between participants with high self-esteem and women). In the whole data set, the average self-esteem of men (3.08) is only slightly higher than the average self-esteem of women (2.98) and this difference of self-esteem seems insufficient to explain the strong difference of self-enhancement bias between men and women.

Discussion

We first discuss the results on the variations of self-enhancement and self-derogation and then the results about the decreasing sensitivity to feedbacks and its associated bias.

Comments on the variations of self-enhancement bias

As this paper focuses on the bias from decreasing sensitivity, we limit ourselves to preliminary comments and remarks about the self-enhancement bias.

Effect of scale

The very significant self-derogation observed when participants self-evaluate by score deserves a specific discussion. This result is in line with the significant negative bias observed in the experiment reported in [18], which shows strong similarities with ours. However, in [18] this negative bias is explained by the motive of participants to improve their results at the task, which incites them to pay more attention to the more informative negative feedbacks. This explanation seems irrelevant in the context of our experiment because the participants cannot improve their performance at the task, which is completed at the beginning of the questionnaire and cannot change afterwards. Arguing that the participants are learning to self-evaluate could be a way to introduce the learning context. However, we should then observe self-derogation when the evaluation is made by rank as well, which is not the case at all, as we measure a significantly positive self-enhancement bias in all sets of participants self-evaluating by rank. These results suggest an effect of the scale.

We can only formulate a preliminary hypothesis that an evaluation using a growing scale could be perceived as a possessed quantity, as much as the evaluation of an ability. Then, a negative feedback would be perceived as the loss of some possessions as well as a setback in status. The higher reaction to negative feedbacks could then be related to a general loss aversion [38] or higher sensitivity to negative events [39]. By contrast, the evaluation by rank seems to be more exclusively related to a perceived status, triggering self-enhancement, as expected from the literature.

A possibly related effect of the scale could be found in the experiment reported in [40], suggesting that student performance is significantly improved when using a grading system based on student ranking rather than on performance standards. Our results suggest that the grading systems generate significantly different self-enhancement or self-derogation biases, which could influence the performance of the students. In particular, strong levels of self-derogation which, extrapolating from our results, could be expected with the grades based on performance standards, could discourage students. Of course, this does not exclude the influence of other factors mentioned in [40].

Effect of self-esteem

Self-enhancement and self-esteem are deeply related as the self-enhancement motive is to preserve or increase self-esteem. Yet, the literature shows contradictory views about the influence of self-esteem on self-enhancement [41]. The self-enhancement theory for instance assumes that individuals with a low self-esteem have a stronger motivation for self-enhancement [42]. Other theories suggest that, on the contrary, individuals with a high self-esteem are more motivated to protect their positive self-view [22, 23] or to confirm it [18]. The experiments reported in [24] corroborate the latter theories, as they suggest that the status of expert may provide enough overconfidence to claim impossible knowledge. The experiment reported in [18] corroborates them as well, as participants with a lower self-esteem tend to show a more significant negative bias. Similarly, in our results, sets of participants with a lower self-esteem tend to show a greater self-derogation (for the evaluation by score) and a lower self-enhancement (for the evaluation by rank).

Moreover, this tendency is also confirmed when considering how the self-enhancement varies when self-evaluation increases within a set, instead of comparing the average measures from different sets. Indeed, within most considered sets, self-enhancement increases when self-evaluation increases, because the slope of the sensitivity to negative feedbacks is more negative than the slope of the sensitivity to positive feedbacks in most of the sets (S1 Table).

Effect of gender

Previous research established gender differences in self-enhancement. First, men tend to engage more than women in self-deceptive enhancement, and women more in impression management [29]. Our experiment does not involve much impression management, as the participants are told that they interact, via the computer, with a program that computes their rank or score in several predefined groups. This can explain the lower self-enhancement of women measured in our experiment.

Also, the gender difference may depend on the context. In general, men reveal significantly higher self-enhancement with respect to masculine subjects than women do, whereas the self-enhancement of men and women in relation to feminine subjects are similar [25, 26]. More generally, men tend to show a higher self-enhancement than women in a context where qualities related to agency (competence, independence, openness) are important [27] as opposed to qualities related to communion (warmth, interdependence, agreeableness) [28].

Arguably, the task of surface assessment in our experiment can be perceived as close to mathematics, a masculine subject, and the self-evaluation concerns an individual competence. This can explain why men show a higher self-enhancement in our results when they self-evaluate by rank. This explanation seems more dubious for the lower self-derogation of men when participants self-evaluate by score. Indeed, if our hypothesis holds that the score is also perceived as a possessed quantity which decreases in case of negative feedbacks, the gender difference in self-derogation should probably rather be related to gender differences with respect to loss aversion or to the perception of negative events.

Discussion about the bias from decreasing sensitivity to feedback

The results support our first main hypothesis that the sensitivity to feedbacks decreases with the self-evaluation. Indeed, we measure a significant decrease of sensitivity to feedbacks in the set of all the participants and in sets of participants of different trust in the feedbacks. The decrease is more significant in sets of participants reporting high trust and when excluding the last time step.

The results also support our second main hypothesis because we detect a significant positive bias from this decrease of sensitivity, which is added to the usual self-enhancement bias. As expected, this bias is more significant in sets of participants reporting high trust, because the sensitivity to feedbacks decreases more significantly in these sets. The bias is around 1% of the feedback intensity in these sets and appears rather stable when scale, self-esteem or gender vary. We now discuss the general significance of the newly detected bias in relation to the motivations for self-enhancement and self-assessment. In [16], self-enhancement and self-assessment are defined as follows:

“self-enhancement is the motivation of people to elevate the positivity of their self-conceptions and to protect their self-concepts from negative information,
self-assessment is the motivation of people to obtain a consensually accurate evaluation of the self.”

Moreover, [16] stresses that the positive bias on self-evaluation induced by self-enhancement is often considered useful because it can provide the will or general self-efficacy necessary to initiate novel action. As expressed by [43]: “Even if one is sick and anxious and poor, there should be reason to get up in the morning…Hence self-cognitions do not always have to be veridical in order to be functional”.

However, excessive self-overestimation can expose people to severe negative consequences, as shown in various domains such as health, education and the workplace [1]. Moreover, it can lead to excessive narcissism [28] or bitterness when people become the only ones convinced of their own high merit.

The motivation for self-assessment can be seen as contradicting self-enhancement. Indeed, self-assessment removes the protection against negative feedbacks in order to get an unbiased and accurate self-perception. There is thus a tension between both motivations as, in principle, an accurate self-assessment should remove the positive bias from self-enhancement.

This work suggests that, when the sensitivity to feedbacks decreases as the self-evaluation increases, the self-assessment process, though removing protections against negative feedbacks, also generates a positive bias. Consider situations where the feedbacks fluctuate around a fixed average value, at least for a while. These situations indeed seem more likely in everyday life than series of feedbacks of alternating intensities. As shown at the beginning of the “Material and methods” section, in these situations, the decreasing sensitivity to feedbacks implies an average self-evaluation that is slightly higher than the average feedback. Then, in some cases, this higher self-evaluation influences the average feedback after a while. If the average feedback increases, then the average self-evaluation increases again, and so on. However, if the average feedback significantly decreases, then the self-evaluation adapts and decreases as well (though remaining slightly higher than the new average). Therefore, the bias from decreasing sensitivity pushes the subject forward, in a cautious and adaptive way.
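The mechanism described in this paragraph can be illustrated with a short simulation. The sketch below is not the authors' model or code: it assumes a hypothetical update rule a ← a + s(a)(f − a) on a 0 to 1 scale, a linear sensitivity s(a) = 0.35 − slope × (a − 0.5), and feedbacks drawn uniformly around a fixed average of 0.5. The function name and all parameter values are arbitrary choices made for illustration.

```python
import random

def mean_gap(slope, n_steps=200_000, avg_feedback=0.5, spread=0.3, seed=0):
    """Average of (self-evaluation - feedback) over a long series.

    Assumed (hypothetical) update rule: a <- a + s(a) * (f - a),
    with linear sensitivity s(a) = 0.35 - slope * (a - 0.5).
    slope > 0 means sensitivity decreases as self-evaluation increases.
    """
    rng = random.Random(seed)
    a = avg_feedback  # start at the average feedback
    gap_sum = 0.0
    for _ in range(n_steps):
        # Feedback fluctuating around a fixed average value.
        f = avg_feedback + rng.uniform(-spread, spread)
        gap_sum += a - f
        sensitivity = 0.35 - slope * (a - 0.5)
        a += sensitivity * (f - a)
    return gap_sum / n_steps

bias_constant = mean_gap(slope=0.0)    # constant sensitivity
bias_decreasing = mean_gap(slope=0.5)  # decreasing sensitivity
print(f"constant sensitivity:   {bias_constant:+.4f}")
print(f"decreasing sensitivity: {bias_decreasing:+.4f}")
```

Under these assumptions, the gap is essentially zero with constant sensitivity and slightly positive with decreasing sensitivity: the self-evaluation reacts more strongly when it is low than when it is high, so it climbs faster than it falls.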

Moreover, the bias from decreasing sensitivity is more likely to take place when people accumulate many feedbacks. Indeed, as the bias is an average, it is likely to be poorly estimated from a small number of feedbacks. Therefore, active and daring people who are eager to accumulate experiences benefit from it more regularly. In comparison, a significant bias from self-enhancement can appear by dismissing a small number of negative feedbacks and the self-overestimation may then remain more or less stable, even if the average feedback decreases.
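The dependence on the number of feedbacks can be made concrete with a small sketch. It relies on the same kind of hypothetical update rule as above, a ← a + s(a)(f − a) with s(a) = 0.35 − slope × (a − 0.5), which is an assumption of this illustration and not the authors' code. It compares the gap between average self-evaluation and average feedback measured on many short series (5 feedbacks) and many long series (2000 feedbacks); these lengths are arbitrary.

```python
import random
import statistics

def series_gap(n_feedbacks, rng, slope=0.5, avg_feedback=0.5, spread=0.3):
    """Mean of (self-evaluation - feedback) over one series of feedbacks.

    Hypothetical update rule: a <- a + s(a) * (f - a),
    with s(a) = 0.35 - slope * (a - 0.5).
    """
    a = avg_feedback
    total = 0.0
    for _ in range(n_feedbacks):
        f = avg_feedback + rng.uniform(-spread, spread)
        total += a - f
        a += (0.35 - slope * (a - 0.5)) * (f - a)
    return total / n_feedbacks

def share_negative(gaps):
    """Fraction of series in which the measured gap is negative."""
    return sum(g < 0 for g in gaps) / len(gaps)

rng = random.Random(1)
short_gaps = [series_gap(5, rng) for _ in range(500)]     # few feedbacks
long_gaps = [series_gap(2000, rng) for _ in range(500)]   # many feedbacks

print("short series: sd =", round(statistics.stdev(short_gaps), 4),
      "share negative =", share_negative(short_gaps))
print("long series:  sd =", round(statistics.stdev(long_gaps), 4),
      "share negative =", share_negative(long_gaps))
```

In this sketch the gap measured on short series is dominated by noise and is often negative, whereas on long series it is consistently positive, which matches the remark that the bias only operates reliably when many feedbacks are accumulated.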

Let us illustrate these remarks with an example. Tennis players of a given level tend to lose against players of superior level and win against ones of inferior level. Assuming that their sensitivity to wins and losses and to successes and failures of their shots decreases with their self-evaluation, the players are subject to the positive bias from decreasing sensitivity. If their motivation for self-assessment is very high and they self-evaluate without self-enhancement (i.e. giving the same weight to successes and failures in their shots and their matches), they tend to self-evaluate a bit higher than their actual level. This slight surplus of confidence is likely to help them win tight matches against opponents of similar level because they are likely to remain positive in difficult situations and take reasonable risks. These wins actually slightly increase their level. Hence, their self-evaluation is likely to increase as well, and so on. However, if their self-evaluation rises too much, the players are likely to experience more frequent losses against players that they consider inferior and they will decrease their self-evaluation accordingly (still keeping it a bit higher than their average level).

By contrast, players who self-evaluate with self-enhancement tend to overestimate their level even if they do not play much, because they blame external conditions (the racket, the balls, the wind, the public…) for their failures. Then, they often poorly adjust their game during matches, because they overestimate their own shots and underestimate the ones of their opponents. Moreover, their inability to adjust their self-evaluation from even more losses increases the discrepancy.

This example considers the theoretical case of a player who succeeds in removing any self-enhancement bias. However, our experiment suggests that in most cases, the bias from sensitivity only marginally modifies a significantly bigger self-enhancement bias. The conclusion is therefore normative: self-assessing as honestly as possible, removing any self-enhancement or self-derogation, is a recommended goal because the bias from sensitivity provides a small over-estimation that helps one to stay positive and active without taking disproportionate risks. This conclusion complements previous research about “optimal self-esteem” [44], which distinguishes between fragile and secure self-esteem and advocates for unbiased processing in order to reach authenticity. Indeed, our work suggests that the effort for unbiased processing makes the slight supplement of positive self-evaluation generated by the decrease of sensitivity to feedbacks more relevant.

Limitations and future challenges

A major limitation of the experiment is the small amount of data obtained from each participant. Indeed, the sensitivity to feedback may vary significantly between participants, and computing average sensitivity functions in sets of participants hides this variety. Ideally, the experiment should collect a much larger number of triples (self-evaluation, feedback, change in self-evaluation) from each participant in order to derive significant individual models of sensitivity to feedbacks. Moreover, in the experiment, we constrained the self-evaluations to be between the previous self-evaluation and the feedback. We underlined that this restriction does not constrain the change of sensitivity when the self-evaluation varies. Therefore, as we find a decrease of this sensitivity when self-evaluation increases with the restriction, we should also find it without the restriction. Nevertheless, it would be important to check this assumption by replicating the experiment without the restriction on the self-evaluations. This would certainly require a larger sample to cope with noisier data.
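The need for more triples per participant can be illustrated with a sketch of an individual fit. Everything in it is an assumption made for illustration, not the authors' procedure: the sensitivity is taken to be linear, s(a) = s0 − c·a, the update is a_{t+1} = a_t + s(a_t)·δ_t plus Gaussian noise, δ_t is taken to be the gap between the feedback and the current self-evaluation, and the parameter values and function names are hypothetical. The slope c is recovered by ordinary least squares for each simulated participant, and the spread of the estimates is compared for 4 and 400 triples per participant.

```python
import random
import statistics

def simulate_triples(n, rng, s0=0.6, c=0.5, noise=0.01):
    """Generate n triples (a_t, delta_t, a_next) for one hypothetical participant."""
    triples = []
    a = 0.5
    for _ in range(n):
        f = rng.uniform(0.1, 0.9)   # feedback on a 0-1 scale
        delta = f - a               # assumed definition of delta
        a_next = a + (s0 - c * a) * delta + rng.gauss(0.0, noise)
        triples.append((a, delta, a_next))
        a = a_next
    return triples

def fit_sensitivity(triples):
    """Least-squares fit of a_next - a = (s0 - c * a) * delta; returns (s0, c)."""
    # Two regressors without intercept: x1 = delta, x2 = -a * delta.
    s11 = s12 = s22 = b1 = b2 = 0.0
    for a, delta, a_next in triples:
        x1, x2, y = delta, -a * delta, a_next - a
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        b1 += x1 * y
        b2 += x2 * y
    det = s11 * s22 - s12 * s12     # determinant of the 2x2 normal equations
    s0_hat = (b1 * s22 - b2 * s12) / det
    c_hat = (b2 * s11 - b1 * s12) / det
    return s0_hat, c_hat

def slope_spread(n_triples, rng, n_participants=200):
    """Standard deviation of the fitted slope c across simulated participants."""
    slopes = [fit_sensitivity(simulate_triples(n_triples, rng))[1]
              for _ in range(n_participants)]
    return statistics.stdev(slopes)

rng = random.Random(2)
spread_few = slope_spread(4, rng)      # few triples per participant
spread_many = slope_spread(400, rng)   # many triples per participant
print("sd of fitted slope with 4 triples:  ", spread_few)
print("sd of fitted slope with 400 triples:", spread_many)
```

Even with the very small response noise assumed here, individual slopes fitted on a handful of triples vary widely, while long individual series give stable estimates. Real responses are likely to be noisier than this sketch assumes.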

A second limitation worth mentioning relates to the detection of the sensitivity bias itself. Indeed, its detection in a very specific experiment says nothing about its potential role in daily life. Actually, our measures suggest that the sensitivity bias might generally be too dominated by self-enhancement or self-derogation biases to show an independent effect. Moreover, the agent-based model simulations suggest that the sensitivity bias has a significant effect only in the long term. In this case again, long series of data could provide more information on the potential effect of this bias.

The setting of our experiment is clearly inadequate for collecting long series of individual data, as the attention and motivation of participants already drop at the fourth time step. Designing a completely different experiment that would provide long-term individual data is a serious challenge. Large-scale, long-lasting game experiments on the internet, such as the one reported in [45], could offer new means to address this issue.

Finally, it seems noteworthy that our main result, the existence of a bias from sensitivity to feedbacks, originates in theoretical agent simulations. This bias was indeed identified because its effects were easily observed in long-lasting simulations involving millions of virtual interactions. We could then detect its much smaller effect in short simulations, which we had initially overlooked. Similarly, it seems almost impossible to observe this bias in real life without looking for it with specific computations on data from a specific experiment. This is a case, common in physics but not so much in social sciences, of an initially purely theoretical concept whose existence is confirmed experimentally.

A second case of theoretical bias could be experimentally confirmed in the near future. Indeed, the theoretical work on the agent model identifies a second bias from the decreasing sensitivity to feedbacks, a negative bias on the evaluation of others [20, 21]. This bias has also not been observed yet and designing a new experiment to detect it is another serious challenge. The experiment reported in this paper seems an interesting starting point to take up this challenge.

Supporting information

S1 File

(PDF)


S1 Appendix. Linear mixed effect models.

The appendix explains how we computed linear mixed effect models for data structured in different levels [35] using the lme4 R package [34].

(PDF)


S1 Table. Sensitivity to positive or negative feedbacks.

The table shows the slopes c_p and c_n of the sensitivity to positive and negative feedbacks for different values of trust and different sets of time steps. Overall, the slope c_n appears stronger and more significant than the slope c_p. This suggests that the bias from sensitivity to feedbacks is mainly due to the sensitivity to negative feedbacks, especially for high trust. Moreover, for participants of high trust, c_p − c_n, the derivative of the self-enhancement bias, is positive, suggesting that the self-enhancement bias increases with the self-evaluation. The only exception is the set of participants reporting low trust with t ∈ (1 : 2).

(PDF)


S2 Table. Slope of sensitivity c for low (f₀ ≤ 40) and high (f₀ ≥ 60) anchor.

The table shows the slope of the sensitivity to feedbacks for sets distinguishing participants starting with low or high anchor and reporting different levels of trust. In sets of participants reporting high trust, the decrease of sensitivity is significant only when the anchor is high, which confirms the pilot studies. However, in the set of participants reporting low trust, the tendency is inverted: the slope of the sensitivity is significant only when the anchor is low. These results suggest that the sensitivity to the feedbacks is not linear, as it decreases more significantly when the self-evaluation is in some ranges of values, like a logistic function for instance. Moreover, the range of self-evaluation for which the sensitivity decreases more significantly depends on the level of trust and possibly on the related level of involvement or attention.

(PDF)


S3 Table. Bias from sensitivity S′ for low (f₀ ≤ 40) and high (f₀ ≥ 60) anchor.

The table shows the bias from sensitivity for sets distinguishing participants starting with low or high anchor and reporting different levels of trust. In sets of participants reporting high trust, the bias from sensitivity is significant only when the anchor is high, which confirms the pilot studies. However, in the set of participants reporting low trust, the tendency is inverted: the bias from sensitivity is significant only when the anchor is low.

(PDF)


S4 Table. Self-enhancement bias E for t ∈ (1 : 2).

The table shows the variations of the measures of self-enhancement bias E computed for t ∈ (1 : 2) with scale, gender and self-esteem. The main features are similar to the ones of the same table for t ∈ (1 : 2) shown in the main text, with a higher standard deviation for time steps in (1 : 2) because the sets are smaller.

(PDF)


S5 Table. Theoretical sensitivity bias S′ for t ∈ (1 : 2).

The table shows the variations of the measures of theoretical sensitivity bias S′ for t ∈ (1 : 2) with scale, gender and self-esteem. The main features are similar to the ones of the same table for t ∈ (1 : 3) and t ∈ (1 : 4), with a higher standard deviation for time steps in (1 : 2) because the sets are smaller.

(PDF)


S6 Table. Self-enhancement bias E for t ∈ (1 : 3).

The table shows the variations of the measures of self-enhancement bias E computed for t ∈ (1 : 3) with scale, gender and self-esteem.

(PDF)


S7 Table. Theoretical sensitivity bias S′ for t ∈ (1 : 3).

The table shows the variations of the measures of theoretical sensitivity bias S′ for t ∈ (1 : 3) with scale, gender and self-esteem.

(PDF)


S8 Table. Self-enhancement bias E for t ∈ (1 : 3).

The table shows the variations of the measures of self-enhancement bias E computed for t ∈ (1 : 3) with scale, gender and self-esteem.

(PDF)


S9 Table. Theoretical sensitivity bias S′ for t ∈ (1 : 3).

The table shows the variations of the measures of theoretical sensitivity bias S′ for t ∈ (1 : 3) with scale, gender and self-esteem.

(PDF)


S10 Table. Slope c of sensitivity to feedback for interview time greater than 3 minutes.

The table shows the slope of the sensitivity to feedbacks after removing from the data the 14 participants who took less than 3 minutes to fill in the questionnaire (56 triples $(a_{t}^{i}, \delta_{t}^{i}, a_{t+1}^{i})$ removed from the data).

(PDF)


S11 Table. Bias S′ from sensitivity to feedbacks for interview time greater than 3 minutes.

The table shows the slope of the sensitivity to feedbacks and the mean and standard deviation of the bias from sensitivity computed on 200 bootstrap samples, after removing from the data the 14 participants who took less than 3 minutes to fill in the questionnaire (56 triples $(a_{t}^{i}, \delta_{t}^{i}, a_{t+1}^{i})$ removed from the data).

(PDF)


Acknowledgments

We are grateful to Sylvie Huet for her support at early stages of the research.

Data Availability

All data and programs are publicly available at: https://github.com/guillaumeDeffuant/sensitivityBias.

Funding Statement

In complement to our earlier statement that “the research has been partly supported by the French National Research Agency through the grant number ANR-18-ORAR-0003-01 (ToRealSim project)”, we confirm again that there was no additional external funding received for this study. Moreover, the funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. Dunning D, Heath C, Suls J. Flawed Self-Assessment. Implications for Health, Education and the Workplace. Psychological Science in the Public Interest. 2004;21:69–106. doi: 10.1111/j.1529-1006.2004.00018.x
2. Epley N, Dunning D. Feeling holier than thou: Are self-serving assessments produced by errors in self or social prediction? Journal of Personality and Social Psychology. 2000;79:861–875. doi: 10.1037/0022-3514.79.6.861
3. Dunning D, Story AJ. Exploring the planning fallacy: Why people underestimate their task completion times. Journal of Personality and Social Psychology. 1991;61:521–532.
4. Fischhoff B, Slovic P, Lichtenstein S. Knowing with certainty: The appropriateness of extreme confidence. Journal of Experimental Psychology: Human Perception and Performance. 1977;3:552–564.
5. Buchler R, Griffin D, Ross M. Depression, realism, and the over-confidence effect: Are the sadder wiser when predicting future actions and events? Journal of Personality and Social Psychology. 1991;67:366–361.
6. Vallone RP, Griffin D, Lin S, Ross L. Overconfident prediction of future actions and outcomes by self and others. Journal of Personality and Social Psychology. 1990;58:582–592. doi: 10.1037/0022-3514.58.4.582
7. Griffin D, Dunning D, Ross I. The role of construal processes in overconfident predictions about the self and others. Journal of Personality and Social Psychology. 1990;59:1128–1139. doi: 10.1037/0022-3514.59.6.1128
8. Sedikides C, Gregg AP. Portraits of the self. In: Hogg A, Cooper J, editors. The SAGE Handbook of Social Psychology. Sage; 2003. p. 93–122.
9. Campbell WK, Sedikides C. Self-Threat Magnifies the Self-Serving Bias: A Meta-Analytic Integration. Review of General Psychology. 1999;3:23–43. doi: 10.1037/1089-2680.3.1.23
10. Zuckerman M. Attribution of Success and Failure Revisited, or: The Motivational Bias Is Alive and Well in Attribution Theory. Journal of Personality. 1979;47:245–87. doi: 10.1111/j.1467-6494.1979.tb00202.x
11. Mischel W, Ebbesen EB, Zeiss AR. Determinants of Selective Memory About the Self. Journal of Consulting and Clinical Psychology. 1976;29:279–82.
12. Skowronski JJ, Betz AI, Thompson CP, Shannon L. Social Memory in Everyday Life: Recall of Self-Events and Other-Events. Journal of Personality and Social Psychology. 1991;68:247–60.
13. Ditto PH, Boardman AF. Perceived Accuracy of Favorable and Unfavorable Psychological Feedback. Basic and Applied Social Psychology. 1995;13:137–57.
14. Beer JS. Exaggerated Positivity in Self-Evaluation: A Social Neuroscience Approach to Reconciling the Role of Self-esteem Protection and Cognitive Bias. Social and Personality Psychology Compass. 2014;8(10):583–594. doi: 10.1111/spc3.12133
15. Swann WB, Schroeder DG. The search for beauty and truth: a framework for understanding reactions to evaluations. Personality and Social Psychology Bulletin. 1985;21(12). doi: 10.1177/01461672952112008
16. Sedikides C, Strube M. Self-evaluation: To Thine Own Self Be Good, To Thine Own Self Be Sure, To Thine Own Self Be True, and To Thine Own Self Be Better. Advances in Experimental Social Psychology. 1997;9:209–269.
17. Moore D, Small D. Error and bias in comparative social judgement: On being both better and worse than we think we are. Journal of Personality and Social Psychology. 2007;96(6):972–989. doi: 10.1037/0022-3514.92.6.972
18. Muller-Pinzler L, Czekalla N, Mayer AV, Stolz DS, Gazzola V, Christian Keysers FMP, et al. Negativity-bias in forming beliefs about own abilities. Scientific Reports. 2019;9(14416). doi: 10.1038/s41598-019-50821-w
19. Moreland RL, Sweeney PD. Self-expectancies and relations to evaluations of personal performance. Personality. 1984;52:156–176. doi: 10.1111/j.1467-6494.1984.tb00350.x
20. Deffuant G, Bertazzi I, Huet S. The dark side of gossips: hints from a simple opinion dynamics model. Advances in Complex Systems. 2018;21. doi: 10.1142/S0219525918500212
21. Deffuant G, Roubin T. Do interactions among unequal agents undermine those of low status? Physica A: Statistical Mechanics and its Applications. 2022;592:126780. doi: 10.1016/j.physa.2021.126780
22. Kobayashi C, Brown J. Self-Esteem and Self-Enhancement in Japan and America. Journal of Cross-Cultural Psychology. 2003;34(3).
23. Bosson J, Brown R, Zeigler-Hill V, Swann W. Self-Enhancement Tendencies Among People With High Explicit Self-Esteem: The Moderating Role of Implicit Self-Esteem. Self and Identity. 2010;2(3):169–187. doi: 10.1080/15298860309029
24. Atir S, Rosenzweig E, Dunning D. When Knowledge Knows No Bounds: Self-Perceived Expertise Predicts Claims of Impossible Knowledge. Psychological Science. 2015;26(8):1295–1303. doi: 10.1177/0956797615588195
25. Beyer S. Gender differences in the accuracy of self-evaluation of performance. Journal of Personality and Social Psychology. 1990;59:960–970. doi: 10.1037/0022-3514.59.5.960
26. Kurman J. Gender, Self-Enhancement, and Self-Regulation of Learning Behaviors in Junior High School. Sex Roles. 2004;50(9/10). doi: 10.1023/B:SERS.0000027573.36376.69
27. Gebauer JE, Wagner J, Sedikides C, Neberich W. Agency-communion and self-esteem relations are moderated by culture, religiosity, age, and sex: evidence for the “self-centrality breeds self-enhancement” principle. Journal of Personality. 2013;81(3):261–275. doi: 10.1111/j.1467-6494.2012.00807.x
28. Paulhus D, John O. Egoistic and Moralistic Biases in Self-Perception: The Interplay of Self-Deceptive Styles With Basic Traits and Motives. Journal of Personality. 1998;66(6).
29. Lalwani AK, Lee H, Shrum LJ, Viswanathan M. Men engage in self-deceptive enhancement, whereas women engage in impression management. Psychology and Marketing. 2023;40(7):1405–1416. doi: 10.1002/mar.21805
30. Deffuant G, Carletti T, Huet S. The Leviathan model: Absolute dominance, generalised distrust and other patterns emerging from combining vanity with opinion propagation. Journal of Artificial Societies and Social Simulation. 2013;16(23).
31. Hovland C, Sherif M. Social judgment: Assimilation and contrast effects in communication and attitude change. Greenwood; 1980.
32. Takacs K, Flache A, Mas M. Discrepancy and Disliking Do Not Induce Negative Opinion Shifts. PLoS ONE. 2016;11(6). doi: 10.1371/journal.pone.0157948
33. Vallières E, Vallerand R. Traduction et validation canadienne-française de l’échelle de l’estime de soi de Rosenberg. International Journal of Psychology. 1990;25(2):305–316. doi: 10.1080/00207599008247865
34. Bates D, Machler M, Bolker B, Walker S. Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software. 2015;67(1):1–48. doi: 10.18637/jss.v067.i01
35. Nezlek JB. An introduction to Multilevel Modeling for Social and Personality Psychology. Social and Personality Psychology Compass. 2008;2(2):842–860. doi: 10.1111/j.1751-9004.2007.00059.x
36. Efron B, Tibshirani R. An Introduction to the Bootstrap. Chapman and Hall/CRC; 1993.
37. Sullivan GM, Feinn R. Using Effect Size-or Why the P Value Is Not Enough. J Grad Med Educ. 2012;4(3):279–282. doi: 10.4300/JGME-D-12-00156.1
38. Kahneman D, Tversky A. Prospect Theory: An Analysis of Decision under Risk. Econometrica. 1979;47(2):263–291. doi: 10.2307/1914185
39. Baumeister R, Finkenauer C, Vohs K. Bad is stronger than good. Review of General Psychology. 2001;5(4):323–370. doi: 10.1037/1089-2680.5.4.323
40. Cherry T, Ellis L. Does Rank-Order Grading Improve Student Performance? Evidence from a Classroom Experiment. International Review of Economics Education. 2005;4(1):9–19. doi: 10.1016/S1477-3880(15)30140-7
41. Brown J, Collins R, Schmidt G. Self-esteem and direct versus indirect forms of self-enhancement. Journal of Personality and Social Psychology. 1985;55(3):445–453. doi: 10.1037/0022-3514.55.3.445
42. Shrauger JS. Responses to evaluation as a function of initial self-perceptions. Psychological Bulletin. 1975;82:581–596.
43. Cairns RB. Developmental epistemology and self-knowledge: Towards a reinterpretation of self-esteem. In: Greenberg G, Tobach E, editors. Theories of the evolution of knowing: The T. C. Schneirla conference series. Hillsdale, NJ: Erlbaum; 1990. p. 69–86.
44. Kernis MH. Toward a Conceptualization of Optimal Self-Esteem. Psychological Inquiry. 2003;14(1):1–26. doi: 10.1207/S15327965PLI1401_01
45. Szekely A, Lipari F, Antonioni A, Paolucci M, Tummolini L, Andrighetto G. Evidence from a long-term experiment that collective risk change social norms and promote cooperation. Nature Communications. 2021;12(5452). doi: 10.1038/s41467-021-25734-w

PLoS One. doi: 10.1371/journal.pone.0296383.r001

Decision Letter 0

Srebrenka Letina

26 Apr 2023

PONE-D-22-34148

A newly detected bias in self-evaluation

PLOS ONE

Dear Dr. DEFFUANT,

==============================

In line with reviewers' comments, your major revision will require:

1. More detailed theoretical background (see helpful comments by reviewer #2) and introduction, including the rationale for all included variables in your models.

2. Restating of the research goals so that they correspond more closely to your analyses.

3. A different, more suitable analytical approach (multilevel modelling that aligns with the research design).

4. More clarity and detailed report on data analysis e.g., model specifications, diagnostics. Also making your code available - I suggest providing a link to a GitHub account with the code - would help clarify some of the concerns and questions of the reviewers regarding the analyses you have done. Finally, the revision will require including some additional tables and figures - possibly as Supplementary Materials (see reviewer #2 comments).

Since both reviewers, as myself, recognize the potential of the paper I would like to give you a chance to resubmit with major revisions. However, I recognize that it will require considerable changes. It is my impression that the reviewers provided well-explained and helpful comments and I hope that will enable you to make your revisions.

I want to emphasize that the major revision should address each and every comment by both reviewers fully for paper to be accepted.

==============================

Please submit your revised manuscript by Jun 10 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Srebrenka Letina, Ph.D.

Academic Editor

PLOS ONE

Journal requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please provide additional details regarding participant consent. In the Methods section, please ensure that you have specified (1) whether consent was informed and (2) what type you obtained (for instance, written or verbal). If your study included minors, state whether you obtained consent from parents or guardians. If the need for consent was waived by the ethics committee, please include this information.

3. We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections do not match. When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section.

4. Thank you for stating in your Funding Statement:

“This research has been partly supported by the French National Research Agency through the ToRealSim project”

Please provide an amended statement that declares *all* the funding or sources of support (whether external or internal to your organization) received during this study, as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now. Please also include the statement “There was no additional external funding received for this study.” in your updated Funding Statement.

Please include your amended Funding Statement within your cover letter. We will change the online submission form on your behalf.

5. Thank you for stating the following in the Acknowledgments Section of your manuscript:

“We are grateful to Sylvie Huet for her support at early stages of the research.”

We note that you have provided funding information that is not currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form.

Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows:

“This research has been partly supported by the French National Research Agency through the ToRealSim project”

Please include your amended statements within your cover letter; we will change the online submission form on your behalf.

6. Please include your full ethics statement in the ‘Methods’ section of your manuscript file. In your statement, please include the full name of the IRB or ethics committee who approved or waived your study, as well as whether or not you obtained informed written or verbal consent. If consent was waived for your study, please include this information in your statement as well.

Additional Editor Comments:

One reviewer suggested rejecting this paper, while the other suggested a major revision. Upon closer inspection of their reasoning it seems that they both refer to similar general issues that require re-stating of the research goals, more effort in providing suitable background rationale and a rather different analytical approach. But both reviewers, as myself, recognize the potential of the paper. Therefore, if you are able to make such considerable changes in the available time, I invite you to resubmit your major revision.

I want to emphasize that the major revision should address each and every comment by both reviewers fully for paper to be accepted.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: No

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

Reviewer #2: I Don't Know

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: I reviewed a previous version of this paper. In this revision the authors have addressed some of the previous concerns, for example by offering a rationale for why higher self-evaluations for a highly specific task might be associated with lower responsiveness to feedback on performance. The paper’s underlying idea remains interesting, a new process (different from conventional self-enhancement) that can lead to exaggerated self-evaluations. A number of concerns remain, however. In the most general terms, the main problem is a mismatch between the paper’s stated goal, testing a theoretical prediction derived from earlier modeling work in a human-subjects experiment, and what is actually done, which has a strongly exploratory flavor. There are also important areas of unclarity, for example about the details of the analyses.

The paper’s stated goal would best be met by an analysis that (a) omitted participants who reported low levels of trust in the manipulation – because such participants are not relevant for testing the hypothesis – and (b) used all the remaining participants in a multilevel analysis that accounts for nonindependence of responses from the same individual. As far as I can tell, no such analysis is reported. Instead, the analyses mostly retain the inappropriate low-trust participants, and slice and dice the data into many subsets for separate analyses (for example, only the first two or the first 3 responses from the 4 trials; males versus females; high versus low self-esteem; two different scoring methods), as seen in Tables 4 through 10. The problem is that multiple low-powered analyses (low-powered because they use only subsets of the data) give highly variable results, and some will pop up as significant by chance alone without these results being meaningful. A better approach would be to conduct a single analysis using all high-trust participants to test the overall hypothesis, and perhaps follow that up with exploratory analyses to examine trends by gender or self-esteem, etc.

A related concern is that the potential moderator variables that are examined (scoring method, gender, self-esteem and so on) are not theoretically motivated – there is no conceptual discussion of why these variables should influence the magnitude of the effect. In fact, one atheoretical factor (rank versus score) ends up reversing the predicted effect (top of p. 24), which can be given only a speculative interpretation. The whole paper comes across as a large-scale exercise in data-fishing rather than as a focused test of an interesting a priori hypothesis.

There is still a concern that participants were required to report a new self-rating between their former rating and the new feedback – in other words, the program forced participants to modify their self-evaluations in the way described by the mathematical model. This may partially account for any agreement between the experimental data and the theoretical predictions.
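The restriction discussed here can be made concrete with a minimal sketch. It assumes, purely as an illustration, that an update moves a fraction s (the sensitivity, between 0 and 1) of the way from the previous self-evaluation toward the feedback, and that s decreases linearly with the self-evaluation. The function names, the linear form, and all numbers are hypothetical and are not taken from the manuscript.

```python
def sensitivity(a, base=0.5, slope=0.004, mid=50.0):
    """Hypothetical sensitivity that decreases linearly as self-evaluation a rises."""
    s = base - slope * (a - mid)
    return min(1.0, max(0.0, s))  # clamp to [0, 1]


def update(a, feedback, sens):
    """Move a fraction sens(a) of the way from a toward the feedback."""
    return a + sens(a) * (feedback - a)


def final_after(start, feedbacks, sens):
    """Apply a sequence of feedbacks and return the final self-evaluation."""
    a = start
    for f in feedbacks:
        a = update(a, f, sens)
    return a


def order_averaged_shift(start, high, low, sens):
    """Average final shift over both orders of one high and one low feedback."""
    up_first = final_after(start, [high, low], sens)
    down_first = final_after(start, [low, high], sens)
    return (up_first + down_first) / 2 - start


# Feedbacks 70 and 30 are symmetric around the starting self-evaluation 50.
print(order_averaged_shift(50.0, 70.0, 30.0, lambda a: 0.5))  # constant sensitivity: 0.0
print(order_averaged_shift(50.0, 70.0, 30.0, sensitivity))    # decreasing sensitivity: about 1.2
```

Because the sensitivity is clamped to [0, 1], every update in this sketch falls between the previous rating and the feedback, which is exactly the constraint the program imposed. Averaged over both feedback orders, the shift is zero with constant sensitivity and positive only when sensitivity decreases with self-evaluation.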

The exact details of the multilevel (hierarchical) analyses HLM1 and HLM2 are not specified on p. 9. The model equation or model specification in lmer syntax for R (for example) should be provided.

Reviewer #2: The authors present a very interesting paper dedicated to investigating the interplay between self-esteem, sensitivity to feedback and self-enhancement. They propose to research this important topic through an experimental study in which participants assess the area of irregular geometric shapes and then self-assess their performance. In addition, participants are given experimentally manipulated feedback, which is either positive or negative. Overall, this is a very interesting study with a potentially novel contribution.

However, reading the article I could not shrug off the impression that the paper resembles a technical report more than a full-blown research paper. This is mainly due to a very cursory introduction, in which, in my opinion, the theses of the research are not sufficiently introduced and explained. The authors refer to agent-model studies that have established a relation between the level of self-assessment and sensitivity to feedback, pointing to a decreased sensitivity to feedback as one of the sources of self-enhancement. This is very interesting research, but I would like to see more theoretical elaboration on it, as simple references to some previous studies do not suffice. The literature on sources and mechanisms of self-enhancement is vast, so I recommend introducing the ideas of the research in the context of a proper literature review. Examples of publications to which the authors can refer include:

Abele, A. E., & Wojciszke, B. (2007). Agency and communion from the perspective of self versus others. Journal of Personality and Social Psychology, 93(5), 751–763. https://doi.org/10.1037/0022-3514.93.5.751

Beer, J. S. (2014). Exaggerated Positivity in Self-Evaluation: A Social Neuroscience Approach to Reconciling the Role of Self-esteem Protection and Cognitive Bias. Social and Personality Psychology Compass, 8(10), 583–594. https://doi.org/10.1111/spc3.12133

Beer, J. S., Rigney, A. E., & Koski, J. E. (2018). Self-evaluation. In J. T. Wixted (Ed.), Stevens’ Handbook of Experimental Psychology and Cognitive Neuroscience. Fourth Edition. (pp. 1–30). John Wiley & Sons. https://doi.org/10.7551/mitpress/7458.003.0023

Cahill, D. P. (2015). Wishful Thinking, Fast and Slow. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences. https://dash.harvard.edu/bitstream/handle/1/17467495/CAHILL-DISSERTATION-2015.pdf?sequence=1

Gesiarz, F., Cahill, D., & Sharot, T. (2019). Evidence accumulation is biased by motivation: A computational account. PLoS Computational Biology, 15(6), 1–15. https://doi.org/10.1371/journal.pcbi.1007089

Miller, T. M., & Geraci, L. (2011). Training metacognition in the classroom: The influence of incentives and feedback on exam predictions. Metacognition and Learning, 6(3), 303–314. https://doi.org/10.1007/s11409-011-9083-7

Paulhus, D. L., & Reid, D. B. (1991). Enhancement and Denial in Socially Desirable Responding. Journal of Personality and Social Psychology, 60(2), 307–317. https://doi.org/10.1037/0022-3514.60.2.307

Rigney, A. E. (2019). The role of biased searching through memory in motivated social evaluation. Unpublished doctoral dissertation. University of Texas, TX, USA.

Sedikides, C., & Green, J. D. (2000). On the self-protective nature of inconsistency-negativity management: Using the person memory paradigm to examine self-referent memory. Journal of Personality and Social Psychology, 79(6), 906–922. https://doi.org/10.1037/0022-3514.79.6.906

Sedikides, C., & Gregg, A. P. (2008). Portraits of the self. The SAGE Handbook of Social Psychology, 93–122. https://doi.org/10.4135/9781848608221.n5

Swann, W. B., & Read, S. J. (1981b). Acquiring self-knowledge: The search for feedback that fits. Journal of Personality and Social Psychology, 41(6), 1119.

Trope, Y., & Neter, E. (1994). Reconciling competing motives in self-evaluation: the role of self-control in feedback seeking. Journal of Personality and Social Psychology, 66(4), 646–657. https://doi.org/10.1037/0022-3514.66.4.646

The authors should also comment on the role of the domain in which participants self-assess and receive feedback with regard to its relevance/importance and agency/communion values.

Besides the request to write a much broader literature review, I would also like to raise some minor issues:

Anchor: Why were 1/3 of the participants given a low anchor and 2/3 a high anchor, rather than 50:50? Was that done solely to have more data for respondents with the higher anchor?

Research design: In general, the authors should rewrite that part of the paper and explain their experimental design in clear and short terms. If I am not mistaken, the design is in fact close to 6 (type of feedback sequence) x 2 (type of feedback: score or rank), so the authors should present it more clearly, in terms understandable to an interdisciplinary readership. I also think that the paper’s clarity would benefit from more commentary on and explanation of the regression equations used to analyse the results, including their hierarchical character. I think some graphs or diagrams could help here; otherwise the reader is a bit lost in the overflow of equations and tables.

Slopes and Intercepts: I would appreciate it if the authors could present all the regression coefficients in a clear way before presenting the numerical results. Otherwise, it is difficult to follow what all those different slopes and intercepts mean.

Tables and Figures layout: The tables could be a bit more readable and conform better to standards of communication, e.g., by explaining what the “*” stands for (I assume it is for the p-levels, but this is not explained in the paper). Also, the tables and figures lack titles.

Self-disparagement: I would also suggest replacing the term “self-disparagement” with something more often used in the self-motives literature, e.g. terms “self-effacement”, “self-derogation”, or “self-diminishment” (e.g. Kiu, Chiu & Zou, 2010). “Self-disparagement” seems to be used in other contexts, e.g. depression or humour research. In my opinion, we should avoid multiplying terms and should stick to the already many terms used in the self-motives field.

Gender differences: I think the gender differences in the self-enhancement bias described in the paper are not entirely new, and the authors should comment on them in light of the literature. For example, the authors can use the following works on math-related self-enhancement:

Paulhus, D. L., & John, O. P. (1998). Egoistic and moralistic biases in self‐perception: The interplay of self‐deceptive styles with basic traits and motives. Journal of personality, 66(6), 1025-1060.

Palczyńska, M., & Rynko, M. (2021). ICT skills measurement in social surveys: Can we trust self-reports?. Quality & Quantity, 55(3), 917-943.

In general, males have higher self-enhancement tendencies in agentic than communal domains. Please link your findings with literature on this topic.

Self-esteem and self-enhancement: The authors should also elaborate more on this topic in the discussion section, e.g. starting from the discussion in:

Gebauer, J. E., Wagner, J., Sedikides, C., & Neberich, W. (2013). Agency-communion and self-esteem relations are moderated by culture, religiosity, age, and sex: Evidence for the “self-centrality breeds self-enhancement” principle. Journal of Personality, 81(3), 261–275. https://doi.org/10.1111/j.1467-6494.2012.00807.x

Kernis, M. H. (2003). Toward a conceptualization of optimal self-esteem. Psychological Inquiry, 14(1), 1-26.

Screening out participants: I would like to see some justification for using a broad RT filter based on 3 minutes. Why was this value chosen? The literature on using response times to screen out careless respondents advocates a different, more theory-based approach. The authors should present the distribution of response times and base their further steps on previous work, e.g.:

Huang, J. L., Curran, P. G., Keeney, J., Poposki, E. M., & DeShon, R. P. (2012). Detecting and deterring insufficient effort responding to surveys. Journal of Business and Psychology, 27, 99-114.

Kroehne, U., Buchholz, J., & Goldhammer, F. (2019). Detecting carelessly invalid responses in item sets using item-level response times. In Annual Meeting of the National Council on Measurement in Education.

Ulitzsch, E., Pohl, S., Khorramdel, L., Kroehne, U., & von Davier, M. (2022). A response-time-based latent response mixture model for identifying and modeling careless and insufficient effort responding in survey data. Psychometrika, 87(2), 593-619.

Consequences of screening: The consensus in careless-responding research is that whenever screening out participants is proposed, its consequences should be clearly shown, i.e., all analyses should be conducted on both the full sample and the screened (filtered) sample. Please present such results.

Trust: Why was a trust threshold of 7 or higher chosen? Would the results differ if another cut-off point had been chosen?
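The two requests above (reporting results on both the full and the screened sample, and checking dependence on the trust cut-off) amount to a sensitivity analysis. A minimal sketch follows, assuming hypothetical rows of (trust, predictor, outcome) and using a plain least-squares slope only as a stand-in for whatever model is actually fitted. The data and function names are invented for illustration.

```python
def ols_slope(xs, ys):
    """Least-squares slope of ys on xs (stand-in for the real analysis)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def slope_by_cutoff(rows, cutoffs):
    """For each trust cut-off, keep rows with trust >= cut-off and refit.

    rows: iterable of (trust, predictor, outcome) tuples.
    Returns {cutoff: (n_kept, slope or None if the slope is undefined)}.
    """
    results = {}
    for cutoff in cutoffs:
        kept = [(x, y) for trust, x, y in rows if trust >= cutoff]
        xs = [x for x, _ in kept]
        ys = [y for _, y in kept]
        slope = ols_slope(xs, ys) if len(set(xs)) > 1 else None
        results[cutoff] = (len(kept), slope)
    return results


# Hypothetical toy data: (trust rating, predictor, outcome).
toy_rows = [
    (3, 1.0, 2.0), (5, 2.0, 2.5), (7, 3.0, 4.1),
    (8, 4.0, 4.9), (9, 5.0, 6.2), (10, 6.0, 7.0),
]

# Cut-off 0 keeps the full sample; 7 mirrors the threshold questioned above.
for cutoff, (n_kept, slope) in slope_by_cutoff(toy_rows, [0, 5, 7, 8]).items():
    print(cutoff, n_kept, slope)
```

Reporting the estimate alongside the number of retained observations for each cut-off makes it easy to see whether a conclusion depends on one particular threshold.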

As for very minor issues: the paper is in general well written, but there are occasional typos and similar minor linguistic problems, so I recommend proofreading before final submission.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Marek Muszyński

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2024 Feb 8;19(2):e0296383. doi: 10.1371/journal.pone.0296383.r002

Author response to Decision Letter 0

15 Jun 2023

Our response to the reviewers is detailed in the cover letter to the editor.

Attachment

Submitted filename: answerToReviewers.pdf


PLoS One. doi: 10.1371/journal.pone.0296383.r003

Decision Letter 1

Srebrenka Letina

17 Jul 2023

PONE-D-22-34148R1

A newly detected bias in self-evaluation

PLOS ONE

Dear Dr. DEFFUANT,

Although the reviewers recognized the improved quality of the revised manuscript and generally show an interest and a belief in its contributions, a major revision is still required for the manuscript to be published. Both reviewers still require a list of improvements. Some of the main points include:

- Justify the choice of responses provided to participants and clearly state the consequences of that choice for your findings
- Justify/explain the sample size and undertake a power analysis
- Theoretically justify all of your analyses and examine the findings accordingly
- Rethink your statistical approach in general; specifically, consider using a hierarchical modelling approach, and if it is not applicable, explain in detail why and what the limitations of not doing so are
- Strengthen the Introduction theoretically (with works suggested by the reviewers), but also the Discussion, including sections on future research and limitations
- Make a more substantial effort in providing figures that help in understanding the paper's findings

These are just the main improvements expected in the next revision, but each point made by both reviewers (see the details further below) needs to be addressed fully and convincingly for the paper to be accepted.

I hope you will find the reviewers' comments helpful for your next revision.

Please submit your revised manuscript by Aug 31 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

We look forward to receiving your revised manuscript.

Kind regards,

Srebrenka Letina, Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (if provided):

Major revision required.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Reviewer #1: This relatively thorough revision has improved the paper in several respects. I still have reservations which I will list briefly below, but the paper now seems to offer enough value to warrant publication. The empirical demonstration of a novel process leading to self-enhancement, evaluation-dependent sensitivity to feedback, is a solid contribution.

The authors may or may not wish to address my remaining concerns.

1. The experiment forced participants to respond as theory specifies (making each self-assessment update fall between the previous self-assessment and the current feedback). The paper notes that in pilot testing, a majority of participants responded in this way without being restricted. However, the restriction still troubles me – it shapes the way participants think about the task, clearly pushing them in the direction of conforming to the theoretical prediction.

2. The primary hypotheses (line 69) are solid and well-motivated. In contrast, the secondary objective is to examine several potential moderator variables which are not introduced with any conceptual rationale. This is especially troubling for the evaluation scale, which has strong and unpredicted effects on the results. One previous study found a somewhat similar pattern, but no compelling theoretical interpretation can be offered in this paper (lines 490-518).

3. I still do not think the data analysis strategy is optimal. First, participants who express low “trust” in the experimental deception are irrelevant to the hypotheses and should not be included in the analysis. Second, the analyses in the main manuscript inappropriately use multiple observations ("triples") from each participant, which are statistically non-independent in violation of the assumptions of ordinary regression analysis. The analyses in SM use a two-step approach, running a regression for each participant and then predicting the coefficients of those analyses as outcome measures. This approach suffers from low power, because in the first step each analysis has very few observations (and so few coefficients are significant, line 18 of SM). More appropriate would be a hierarchical linear analysis, for example with the lmer function in R. Such an analysis does not use two separate steps but properly accounts for the non-independence of multiple observations from each participant, and gives results that describe the entire sample.

As stated above, however, despite these concerns I think the paper offers meaningful value to the field.

Reviewer #2: I thank the authors for addressing my comments.

I think that the present version of the article is markedly better than the previous one. Well done!

However, I still think that some parts of the manuscript should be improved before publication. I present my concerns below:

1. Please provide power analysis to justify that the sample sizes enable achieving required statistical power in your analyses.

2. The introduction still needs improving. I think the authors should clearly introduce topics such as agency and communion (with relation to self-enhancement) or different types of self-enhancement (impression management, self-deceptive denial, self-deceptive enhancement). Please elaborate more on that in the introduction, e.g., by using many excellent works of Delroy Paulhus (types of self-enhancement) or Bogdan Wojciszke (agency and communion). Please also refer to this recent article that seems relevant and also comments on inter-gender differences:

Lalwani, A. K., Lee, H., Shrum, L. J., & Viswanathan, M. (2023). Men engage in self‐deceptive enhancement, whereas women engage in impression management. Psychology & Marketing.

3. Please also include more information about the relations between self-esteem, gender and perceived level of expertise with self-enhancement in the introduction. An excellent paper by Stav Atir can help you to achieve this:

Atir, S., Rosenzweig, E., & Dunning, D. (2015). When knowledge knows no bounds: Self-perceived expertise predicts claims of impossible knowledge. Psychological Science, 26(8), 1295-1303.

4. Please, please think hard about how you can improve the graphical presentation of your findings. For now, it is hard to follow the information flow presented in Tables 4-7. I would also like to draw your attention to the fact that they do not conform to any widely accepted style. I would suggest rethinking their "look" and redrawing them in APA style. This also relates to the * for p-levels: I still do not know which number of stars corresponds to which p-level.

5. I was also expecting some results on the role that gender and self-esteem play in feedback sensitivity and self-enhancement, presented more in the fashion of regression results. Could you present them in this way to increase understanding of the analyses and results?

6. In lines 75-76 and 187 you use the term "modulated". I think you should use "moderated" or "related to", depending on your analytic stance towards those relations. "Modulated" is a conversational term and should not be used here (I know it is used widely, but I think we should avoid it).

7. Please tone down the language in lines 58-63. These are only hypotheses/suggestions.

8. Regarding lines 100-107 - can you provide any evidence from studies with human participants?

9. Short question about Table 1 - why does it contain only four combinations? There are more combinations of balanced six feedbacks...

10. It is outstanding that you have found such a valuable and relevant reference as number 19 (Muller-Pinzler et al., 2019)! I think you should capitalise more on it and present this important study more fully in the introduction. Please do not assume that the readership knows this paper. It should be sufficiently presented and commented on in the relevant sections of the paper.

11. Please expand the discussion on self-esteem and self-enhancement to cover topics such as sensitivity to feedback, narcissism, and fragile self-esteem, and refer to the relevant literature.

12. The paper lacks important sections such as "Limitations" and "Future directions". I would ask the authors to add them.
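For the power analysis requested in point 1, a rough first approximation can be sketched as follows. It uses the standard Fisher z approximation for detecting a simple correlation of size r with a two-sided test. The effect sizes are placeholders, and the approximation ignores the nested structure of the data (several observations per participant), so it is only a starting point. A simulation based on the fitted model would be needed for a definitive answer.

```python
from math import atanh, ceil
from statistics import NormalDist


def required_n(r, alpha=0.05, power=0.80):
    """Approximate sample size to detect correlation r (two-sided test).

    Uses the Fisher z approximation:
        n = ((z_{1-alpha/2} + z_{power}) / atanh(r))^2 + 3
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    return ceil(((z_alpha + z_power) / atanh(abs(r))) ** 2 + 3)


# Placeholder effect sizes (small to medium correlations).
for r in (0.1, 0.2, 0.3):
    print(r, required_n(r))
```

Smaller assumed effects require disproportionately larger samples, which is why the assumed effect size should be justified from prior studies rather than chosen after the fact.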

I see that the paper is moving in the right direction and would like to encourage the authors to make some more effort before publication. Fingers crossed!

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

PLoS One. 2024 Feb 8;19(2):e0296383. doi: 10.1371/journal.pone.0296383.r004

Author response to Decision Letter 1

25 Aug 2023

The complete response to the reviewers is in the file "answersToReviewers_R3.pdf".

Attachment

Submitted filename: answersToReviewers_R3.pdf


PLoS One. doi: 10.1371/journal.pone.0296383.r005

Decision Letter 2

Srebrenka Letina

8 Oct 2023

PONE-D-22-34148R2

A newly detected bias in self-evaluation

PLOS ONE

Dear Dr. DEFFUANT,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. A major revision is required. In the next revision it is expected that you will address the review #1 suggestions appropriately - namely, by using hierarchical multi-level modeling. Also, address the issue related with artificially imposed restriction by recognizing it explicitly in the section about limitation of the study. Finally, we require that you address all minor comments made by reviewer #2 (see below).

Please submit your revised manuscript by Nov 22 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

We look forward to receiving your revised manuscript.

Kind regards,

Srebrenka Letina, Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (if provided):

Major revision required.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

Reviewer #1: (No Response)

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Partly

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

Reviewer #2: I Don't Know

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Reviewer #1: This second revision is very similar in substance to the previous version. The theoretical point that differential sensitivity to feedback is a novel mechanism that can contribute to self-enhancement is sound. But I still have two significant concerns about the experimental demonstration reported in this paper.

First, as noted before: “The experiment forced participants to respond as theory specifies (making each self-assessment update fall between the previous self-assessment and the current feedback). The paper notes that in pilot testing, a majority of participants responded in this way without being restricted. However, the restriction still troubles me – it shapes the way participants think about the task, clearly pushing them in the direction of confirming the theoretical prediction.” The authors’ response emphasizes that this restriction was intended to reduce noise in the observations, and that participants remained free to choose any self-evaluation between the given limits. But that response misses my point that the restriction imposes on participants a specific way of thinking about the task. By forcing them to think of it as updating their previous response by moving some proportion of the way toward the current feedback, the researchers are in effect forcing them to use the theoretically assumed process. There are potentially other ways participants might approach the task if this restriction was not artificially imposed, for example deciding that the feedback jumps around too much to be useful or believable and therefore trying to disregard it. The authors’ response completely ignores this concern, that the restriction shapes the way participants think about the task.

Second, I remain concerned that the main analyses use ordinary linear regression on a set of “triples,” which include multiple observations from each participant. Those are obviously non-independent, violating a statistical assumption of the regression model. I recommended using a hierarchical regression (also termed multilevel or mixed model), in which participants are explicitly treated as a random effect, to correctly account for this non-independence.

The authors respond in two ways. One part of the authors’ response is to present regression analyses that include additional predictors (age, gender, trust, etc.). These are entirely irrelevant to the criticism I was making.

Second, they report “hierarchical” models in SM (2.1 and 2.2) that they argue show any non-independence can be ignored. These models are analyzed in two separate steps (between and within participants) rather than as a standard hierarchical model, for example using lmer* in R. I do not understand how or why the authors claim that the results from these models rule out non-independence. More important, experts recommend the use of hierarchical models in all cases when the data has a multilevel structure (here, observations nested within participants) regardless of the statistical level of non-independence. For example, Nezlek (2008)* writes “analysts should use multilevel modeling when they have a multilevel data structure – pure and simple. When I am asked for advice regarding whether or not multilevel modeling is appropriate, my first question concerns the nature of the data structure. If there is a meaningful nested hierarchy to the data, my advice is to use multilevel modeling, irrespective of distracting arguments about ICCs and so forth.” And “researchers sometimes use a low or zero ICC to justify a decision not to use multilevel modeling – on the grounds that because there is no (or very little) between-group variance in the dependent measure, the grouped (or nested) structure of the data can be ignored. This is a dangerous assumption that is not justifiable. Frequently (or almost invariably), researchers are interested in relationships between measures. The fact that there is little or no between-group variance in a measure does not mean that the relationship between this measure and another measure is the same across all groups, something that is assumed if one conducts an analysis that ignores the grouped structure of the data. By extension, even if there is no between-group variance for all of the measures of interest, it cannot be assumed that relationships between or among these measures do not vary across groups.”

To be absolutely clear, the model I view as appropriate, expressed as an lmer* formula, is something like:

Dependent.variable ~ 1 + independent.variables + (1 + independent.variables | Participant)

where the term in parentheses indicates random slopes and intercepts per participant.
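To make concrete what "random slopes and intercepts per participant" means, the following is a minimal Python sketch on simulated data. It is not the lmer fit itself and not the authors' analysis: all numbers (200 participants, 4 observations each, a mean slope of -0.5 with a between-participant spread of 0.3) and the helper names `ols` and `simulate` are hypothetical choices made purely for illustration. It fits a separate least-squares line per participant and compares the spread of those slopes with the single slope of a pooled regression. A mixed model such as the lmer formula above would estimate the mean slope and its between-participant variance jointly rather than in two steps.

```python
import random
import statistics


def ols(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def simulate(n_participants=200, n_obs=4, seed=0):
    """Hypothetical nested data: each participant has its own intercept and slope."""
    rng = random.Random(seed)
    data = {}
    for p in range(n_participants):
        a = rng.gauss(0.0, 1.0)    # participant-specific intercept (assumed values)
        b = rng.gauss(-0.5, 0.3)   # participant-specific slope (assumed values)
        xs = [rng.uniform(0, 10) for _ in range(n_obs)]
        ys = [a + b * x + rng.gauss(0.0, 0.2) for x in xs]
        data[p] = (xs, ys)
    return data


data = simulate()

# One line per participant: this exposes how much slopes differ between people.
slopes = [ols(xs, ys)[1] for xs, ys in data.values()]

# One line for all observations: this treats every observation as independent.
all_x = [x for xs, _ in data.values() for x in xs]
all_y = [y for _, ys in data.values() for y in ys]
_, pooled_slope = ols(all_x, all_y)

print("pooled slope:", round(pooled_slope, 3))
print("mean of per-participant slopes:", round(statistics.fmean(slopes), 3))
print("sd of per-participant slopes:", round(statistics.stdev(slopes), 3))
```

The pooled regression returns a single slope and says nothing about how much the relationship varies from one participant to the next, which is the quantity the random-slope term is meant to capture.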

*References

Nezlek, J. B. (2008). An introduction to multilevel modeling for social and personality psychology. Social and Personality Psychology Compass, 2(2), 842-860.

https://cran.r-project.org/web//packages/lme4/vignettes/lmer.pdf

Reviewer #2: I think that the manuscript is improved but it is still written in an unnecessarily complicated way, sometimes even a convoluted one. The literature review is still shallow at times, although the changes introduced in the review rounds have increased the quality and comprehensibility of the paper.

I feel that I cannot provide more suggestions for the authors and it is up to the editors to decide on the further fate of the manuscript. I just have a few remaining suggestions/questions (each number refers to a line number in the corrected manuscript):

a) 28-29 "in its various dimensions"

I think the paper relates to self-enhancement in the sense of self-exaggeration?

b) 109-112

Is it really completely different from other biases?

Please tone down the language here as the originality of the effects is not as high as you claim.

Take a look at feedback sensitivity research available, e.g. this article by Filip Gęsiarz:

Gesiarz, F., Cahill, D., & Sharot, T. (2019). Evidence accumulation is biased by motivation: A computational account. PLOS Computational Biology, 15(6), e1007089.

This topic was also researched by Stav Atir or Jennifer S. Beer, for example.

c) 303

Table 1 says "six series of 4 feedbacks" but I rather see four series of six different feedbacks?

d) 476

You do not have a sample size of 6000; you have 1500 participants whose data you use in, if I am right, 4 different analyses, n'est-ce pas?

e) 553

I think that treating continuous variables as dichotomous is very questionable. I advise redoing the analyses presented in Figure 6 with trust and self-esteem treated as continuous predictors.

f) 583

You quite often refer to differences as "more significant" or "very significant". How about using effect size measures to quantify such differences? Please provide effect size measures for your key effects.

g) Citations - I suggest using APA style, but it is up to the editors to decide on that.

h) spelling mistakes:

Table 1. heading contains a small mistake ("tow" instead of "two")
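
As an illustration of point f) above, one common effect size for a difference between two group means is Cohen's d with a pooled standard deviation. The sketch below is only an example of that measure: the two groups are made-up numbers, not data from the manuscript, and the function name `cohens_d` is a hypothetical helper.

```python
import math
import statistics


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances
    pooled_sd = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (statistics.fmean(a) - statistics.fmean(b)) / pooled_sd


# Made-up illustrative values, not taken from the experiment.
group_a = [2.0, 4.0, 6.0]
group_b = [1.0, 3.0, 5.0]

print("Cohen's d:", cohens_d(group_a, group_b))
```

Reporting such a standardized difference alongside the p-value conveys how large an effect is, rather than only how "significant" it is.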

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

PLoS One. 2024 Feb 8;19(2):e0296383. doi: 10.1371/journal.pone.0296383.r006

Author response to Decision Letter 2

27 Oct 2023

The response to the reviewer is provided in a separate file submitted via the system (answerToReviewersR4.pdf).

Attachment

Submitted filename: answerToReviewersR4.pdf


PLoS One. doi: 10.1371/journal.pone.0296383.r007

Decision Letter 3

Srebrenka Letina

24 Nov 2023

PONE-D-22-34148R3

A newly detected bias in self-evaluation

PLOS ONE

Dear Dr. DEFFUANT,

For the paper to be accepted, you are required to:

1. Address the remaining comments of reviewer #2 (see below).

2. Address the previous comments made by reviewer #1 (see previous reviews) that, in my opinion, have not yet been addressed in detail and in a convincing manner.

Please submit your revised manuscript by Jan 08 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

We look forward to receiving your revised manuscript.

Kind regards,

Srebrenka Letina, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #2: Yes

**********

6. Review Comments to the Author

Reviewer #2: I would like to thank the authors for all the reactions and comments to my suggestions. In general, I think that the responses satisfy my curiosity and adequately address all the caveats I had.

Some comments that remain:

a) 109-112:

I think that your discussion of the similarities and differences between your paper and the works of Gesiarz et al. (2019) and Atir et al. (2015) is very interesting and you should include it somewhere in the paper (maybe in the Discussion section?). One of the main tasks of the authors of any paper, especially one that claims to have found something new, is to comprehensively compare and contrast new findings with what was previously done in the field. It does not suffice to say “it’s new”, you have to show it. Sometimes it is really new and there are very few studies that directly link to your findings. Then you have to dig somewhat deeper and also bring forth the studies with more indirect links to what you have done.

In my opinion, your discussion does not cover the many studies indirectly related to your experiments; I would therefore ask you to improve it by addressing other studies about the interplay of feedback, self-enhancement, self-esteem, etc. This discussion should include studies like Gesiarz et al. (2019), as the task here is also to contrast the present findings with previous ones and to show where new knowledge was provided.

Some other studies you may find useful:

Haddara, N., & Rahnev, D. (2022). The impact of feedback on perceptual decision-making and metacognition: Reduction in bias but no change in sensitivity. Psychological Science, 33(2), 259-275.

Heck, P. R., & Krueger, J. I. (2015). Self-enhancement diminished. Journal of Experimental Psychology: General, 144(5), 1003–1020. https://doi.org/10.1037/xge0000105 (Study 3)

Jussim, L., Yen, H., & Aiello, J. R. (1995). Self-consistency, self-enhancement, and accuracy in reactions to feedback. Journal of experimental social psychology, 31(4), 322-356.

Schulz, L., Fleming, S. M., & Dayan, P. (2023). Metacognitive computations for information search: Confidence in control. Psychological Review.

I would also like to remind the authors that it is their task to discuss their findings in such a manner to convince the reviewers that the discussion is comprehensive and well-based in the extant literature.

b) 303

OK, now I see what you see and we see it eye to eye. Maybe adding a first column “Serie” or something like that would enhance this table further still, but now I think it is ok. Thank you.

c) 476

OK, this is rather a semantic discussion – “sample size” is typically used to indicate the number of observations, which, in this case, can be either participants or answers. My recommendation is to reframe this section to make it clear what you count as “sample size” in each of the instances/analyses, to avoid any confusion between the number of participants and the number of answers collected from them. I also think that the power calculation should be re-run now, to accommodate the new analytic model employed (hierarchical model with mixed effects).

d) 553

OK, so be it, please consider adding this point somewhere to the Limitations or Future Directions section then.

Otherwise, I have no further comments and I think that this article would be ready for publication after adequate addressing of the present recommendations.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

**********

PLoS One. 2024 Feb 8;19(2):e0296383. doi: 10.1371/journal.pone.0296383.r008

Author response to Decision Letter 3

3 Dec 2023

The response to the reviewers is in the file "answerToReviewersR5.pdf"

Attachment

Submitted filename: answerToReviewersR5.pdf


PLoS One. doi: 10.1371/journal.pone.0296383.r009

Decision Letter 4

Srebrenka Letina

12 Dec 2023

A newly detected bias in self-evaluation

PONE-D-22-34148R4

Dear Dr. DEFFUANT,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Srebrenka Letina, Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

PLoS One. doi: 10.1371/journal.pone.0296383.r010

Acceptance letter

Srebrenka Letina

24 Jan 2024

PONE-D-22-34148R4

PLOS ONE

Dear Dr. Deffuant,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission

* There are no issues that prevent the paper from being properly typeset

If revisions are needed, the production department will contact you directly to resolve them. If no revisions are needed, you will receive an email when the publication date has been set. At this time, we do not offer pre-publication proofs to authors during production of the accepted work. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few weeks to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Srebrenka Letina

Academic Editor

PLOS ONE

Associated Data


Supplementary Materials

S1 File

(PDF)

S1 Appendix. Linear mixed effect models.

The appendix explains how we computed linear mixed effect models for data structured in different levels [35] using the lme4 R package [34].

(PDF)

S1 Table. Sensitivity to positive or negative feedbacks.

(PDF)

S2 Table. Slope of sensitivity c for low (f₀ ≤ 40) and high (f₀ ≥ 60) anchor.

(PDF)

S3 Table. Bias from sensitivity S′ for low (f₀ ≤ 40) and high (f₀ ≥ 60) anchor.

(PDF)

S4 Table. Self-enhancement bias E for t ∈ (1 : 2).

(PDF)

S5 Table. Theoretical sensitivity bias S′ for t ∈ (1 : 2).

(PDF)

S6 Table. Self-enhancement bias E for t ∈ (1 : 3).

The table shows the variations of the measures of self-enhancement bias E computed for t ∈ (1 : 3) with scale, gender and self-esteem.

(PDF)

S7 Table. Theoretical sensitivity bias S′ for t ∈ (1 : 3).

The table shows the variations of the measures of theoretical sensitivity bias S′ for t ∈ (1 : 3) with scale, gender and self-esteem.

(PDF)

S8 Table. Self-enhancement bias E for t ∈ (1 : 3).

The table shows the variations of the measures of self-enhancement bias E computed for t ∈ (1 : 3) with scale, gender and self-esteem.

(PDF)

S9 Table. Theoretical sensitivity bias S′ for t ∈ (1 : 3).

The table shows the variations of the measures of theoretical sensitivity bias S′ for t ∈ (1 : 3) with scale, gender and self-esteem.

(PDF)

S10 Table. Slope c of sensitivity to feedback for interview time greater than 3 minutes.

(PDF)

S11 Table. Bias S′ from sensitivity to feedbacks for interview time greater than 3 minutes.

(PDF)

Attachment

Submitted filename: answerToReviewers.pdf

Attachment

Submitted filename: answersToReviewers_R3.pdf

Attachment

Submitted filename: answerToReviewersR4.pdf

Attachment

Submitted filename: answerToReviewersR5.pdf

Data Availability Statement

All data and programs are publicly available at: https://github.com/guillaumeDeffuant/sensitivityBias.

1. Dunning D, Heath C, Suls J. Flawed Self-Assessment. Implications for Health, Education and the Workplace. Psychological Science in the Public Interest. 2004;21:69–106. doi: 10.1111/j.1529-1006.2004.00018.x

2. Epley N, Dunning D. Feeling holier than thou: Are self-serving assessments produced by errors in self or social prediction? Journal of Personality and Social Psychology. 2000;79:861–875. doi: 10.1037/0022-3514.79.6.861

3. Dunning D, Story AJ. Exploring the planning fallacy: Why people underestimate their task completion times. Journal of Personality and Social Psychology. 1991;61:521–532.

4. Fischhoff B, Slovic P, Lichtenstein S. Knowing with certainty: The appropriateness of extreme confidence. Journal of Experimental Psychology: Human Perception and Performance. 1977;3:552–564.

5. Buchler R, Griffin D, Ross M. Depression, realism, and the over-confidence effect: Are the sadder wiser when predicting future actions and events? Journal of Personality and Social Psychology. 1991;67:366–361.

6. Vallone RP, Griffin D, Lin S, Ross L. Overconfident prediction of future actions and outcomes by self and others. Journal of Personality and Social Psychology. 1990;58:582–592. doi: 10.1037/0022-3514.58.4.582

7. Griffin D, Dunning D, Ross I. The role of construal processes in overconfident predictions about the self and others. Journal of Personality and Social Psychology. 1990;59:1128–1139. doi: 10.1037/0022-3514.59.6.1128

8. Sedikides C, Gregg AP. Portraits of the self. In: Hogg A, Cooper J, editors. The SAGE Handbook of Social Psychology. Sage; 2003. p. 93–122.

9. Campbell WK, Sedikides C. Self-Threat Magnifies the Self-Serving Bias: A Meta-Analytic Integration. Review of General Psychology. 1999;3:23–43. doi: 10.1037/1089-2680.3.1.23

10. Zuckerman M. Attribution of Success and Failure Revisited, or: The Motivational Bias Is Alive and Well in Attribution Theory. Journal of Personality. 1979;47:245–87. doi: 10.1111/j.1467-6494.1979.tb00202.x

11. Mischel W, Ebbesen EB, Zeiss AR. Determinants of Selective Memory About the Self. Journal of Consulting and Clinical Psychology. 1976;29:279–82.

12. Skowronski JJ, Betz AI, Thompson CP, Shannon L. Social Memory in Everyday Life: Recall of Self-Events and Other-Events. Journal of Personality and Social Psychology. 1991;68:247–60.

13. Ditto PH, Boardman AF. Perceived Accuracy of Favorable and Unfavorable Psychological Feedback. Basic and Applied Social Psychology. 1995;13:137–57.

14. Beer JS. Exaggerated Positivity in Self-Evaluation: A Social Neuroscience Approach to Reconciling the Role of Self-esteem Protection and Cognitive Bias. Social and Personality Psychology Compass. 2014;8(10):583–594. doi: 10.1111/spc3.12133

15. Swann WB, Schroeder DG. The search for beauty and truth: a framework for understanding reactions to evaluations. Personality and Social Psychology Bulletin. 1985;21(12). doi: 10.1177/01461672952112008

16. Sedikides C, Strube M. Self-evaluation: To Thine Own Self Be Good, To Thine Own Self Be Sure, To Thine Own Self Be True, And To Thine Own Self Be Better. Advances in Experimental Social Psychology. 1997;9:209–269.

17. Moore D, Small D. Error and bias in comparative social judgement: On being both better and worse than we think we are. Journal of Personality and Social Psychology. 2007;96(6):972–989. doi: 10.1037/0022-3514.92.6.972

18. Muller-Pinzler L, Czekalla N, Mayer AV, Stolz DS, Gazzola V, Keysers C, Paulus FM, et al. Negativity-bias in forming beliefs about own abilities. Scientific Reports. 2019;9(14416). doi: 10.1038/s41598-019-50821-w

19. Moreland RL, Sweeney PD. Self-expectancies and relations to evaluations of personal performance. Personality. 1984;52:156–176. doi: 10.1111/j.1467-6494.1984.tb00350.x

20. Deffuant G, Bertazzi I, Huet S. The dark side of gossips: hints from a simple opinion dynamics model. Advances in Complex Systems. 2018;21. doi: 10.1142/S0219525918500212

21. Deffuant G, Roubin T. Do interactions among unequal agents undermine those of low status? Physica A: Statistical Mechanics and its Applications. 2022;592:126780. doi: 10.1016/j.physa.2021.126780

22. Kobayashi C, Brown J. Self-Esteem and Self-Enhancement in Japan and America. Journal of Cross-Cultural Psychology. 2003;34(3).

23. Bosson J, Brown R, Zeigler-Hill V, Swann W. Self-Enhancement Tendencies Among People With High Explicit Self-Esteem: The Moderating Role of Implicit Self-Esteem. Self and Identity. 2010;2(3):169–187. doi: 10.1080/15298860309029

24. Atir S, Rosenzweig E, Dunning D. When Knowledge Knows No Bounds: Self-Perceived Expertise Predicts Claims of Impossible Knowledge. Psychological Science. 2015;26(8):1295–1303. doi: 10.1177/0956797615588195

25. Beyer S. Gender differences in the accuracy of self-evaluation of performance. Journal of Personality and Social Psychology. 1990;59:960–970. doi: 10.1037/0022-3514.59.5.960

26. Kurman J. Gender, Self-Enhancement, and Self-Regulation of Learning Behaviors in Junior High School. Sex Roles. 2004;50(9/10). doi: 10.1023/B:SERS.0000027573.36376.69

27. Gebauer JE, Wagner J, Sedikides C, Neberich W. Agency-communion and self-esteem relations are moderated by culture, religiosity, age, and sex: evidence for the “self-centrality breeds self-enhancement” principle. Journal of Personality. 2013;81(3):261–275. doi: 10.1111/j.1467-6494.2012.00807.x

28. Paulhus D, John O. Egoistic and Moralistic Biases in Self-Perception: The Interplay of Self-Deceptive Styles With Basic Traits and Motives. Journal of Personality. 1998;66(6).

29. Lalwani AK, Lee H, Shrum LJ, Viswanathan M. Men engage in self-deceptive enhancement, whereas women engage in impression management. Psychology and Marketing. 2023;40(7):1405–1416. doi: 10.1002/mar.21805

30. Deffuant G, Carletti T, Huet S. The Leviathan model: Absolute dominance, generalised distrust and other patterns emerging from combining vanity with opinion propagation. Journal of Artificial Societies and Social Simulation. 2013;16(23).

31. Hovland C, Sherif M. Social judgment: Assimilation and contrast effects in communication and attitude change. Greenwood; 1980.

32. Takacs K, Flache A, Mas M. Discrepancy and Disliking Do Not Induce Negative Opinion Shifts. PLoS ONE. 2016;11(6). doi: 10.1371/journal.pone.0157948

33. Vallières E, Vallerand R. Traduction et validation canadienne-française de l’échelle de l’estime de soi de Rosenberg. International Journal of Psychology. 1990;25(2):305–316. doi: 10.1080/00207599008247865

34. Bates D, Machler M, Bolker B, Walker S. Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software. 2015;67(1):1–48. doi: 10.18637/jss.v067.i01

35. Nezlek JB. An introduction to Multilevel Modeling for Social and Personality Psychology. Social and Personality Psychology Compass. 2008;2(2):842–860. doi: 10.1111/j.1751-9004.2007.00059.x

36. Efron B, Tibshirani R. An Introduction to the Bootstrap. Chapman and Hall/CRC; 1993.

37. Sullivan GM, Feinn R. Using Effect Size-or Why the P Value Is Not Enough. J Grad Med Educ. 2012;4(3):279–282. doi: 10.4300/JGME-D-12-00156.1

38. Kahneman D, Tversky A. Prospect Theory: An Analysis of Decision under Risk. Econometrica. 1979;47(2):263–291. doi: 10.2307/1914185

39. Baumeister R, Finkenauer C, Vohs K. Bad is stronger than good. Review of General Psychology. 2001;5(4):323–370. doi: 10.1037/1089-2680.5.4.323

40. Cherry T, Ellis L. Does Rank-Order Grading Improve Student Performance? Evidence from a Classroom Experiment. International Review of Economics Education. 2005;4(1):9–19. doi: 10.1016/S1477-3880(15)30140-7

41. Brown J, Collins R, Schmidt G. Self-esteem and direct versus indirect forms of self-enhancement. Journal of Personality and Social Psychology. 1985;55(3):445–453. doi: 10.1037/0022-3514.55.3.445

42. Shrauger JS. Responses to evaluation as a function of initial self-perceptions. Psychological Bulletin. 1975;82:581–596.

43. Cairns RB. Developmental epistemology and self-knowledge: Towards a reinterpretation of self-esteem. In: Greenberg G, Tobach E, editors. Theories of the evolution of knowing: The T. C. Schneirla conference series. Hillsdale, NJ: Erlbaum; 1990. p. 69–86.

44. Kernis MH. Toward a Conceptualization of Optimal Self-Esteem. Psychological Inquiry. 2003;14(1):1–26. doi: 10.1207/S15327965PLI1401_01

45. Szekely A, Lipari F, Antonioni A, Paolucci M, Tummolini L, Andrighetto G. Evidence from a long-term experiment that collective risk change social norms and promote cooperation. Nature Communications. 2021;12(5452). doi: 10.1038/s41467-021-25734-w
