Abstract
Previous studies have demonstrated that individuals can utilize the wisdom of crowds on their own, a phenomenon known as ‘the wisdom of the inner crowd’. This requires answering the same question multiple times and averaging the estimates. Although several methods have been proposed to achieve more accurate estimates, their efficacy remains relatively low. Therefore, this study proposes a method that assembles multiple independent methods to simulate the wisdom-of-inner-crowd effect. In particular, our method instructs participants to provide estimates five times. Through a behavioural experiment, we confirmed that our method can produce the wisdom-of-inner-crowd effect. Moreover, we found that our method produced more accurate estimates than a method that required participants to estimate five times without specific instructions. Furthermore, mathematical modelling demonstrated that the effectiveness of our method was greater than that of 1.5 persons. In sum, this study proposes a method to improve daily estimates.
Supplementary Information
The online version contains supplementary material available at 10.1038/s41598-025-14740-3.
Keywords: The wisdom of crowds, The wisdom of the inner crowd, Estimation
Subject terms: Psychology, Human behaviour
Introduction
In daily life, people often need to estimate unknown events (e.g., the number of people attending a conference or the price of a car). How can people make accurate estimates for these types of questions? One promising approach is that of ‘the wisdom of the crowds’1–12. In other words, the average estimate of a crowd of individuals yields surprisingly accurate estimates. Researchers have investigated this topic for over 100 years.
However, it is also well known that the wisdom of crowds has an underlying problem: the difficulty of collecting estimates from several people. Many studies have addressed this issue. Specifically, they have shown that even an individual can use the wisdom of the crowds (called ‘the wisdom of the inner crowd’13–33). In these studies, individuals were instructed to produce different estimates (mainly two) for a single question. In other words, they were expected to produce quasi-crowd estimates. The estimates were then averaged. The results showed that the average estimate was more accurate than the individual estimate (i.e., the first estimate).
Thus, the inner crowd’s wisdom can improve daily life estimates. However, it also has a fundamental problem in that its efficacy is relatively low. Specifically, previous studies13,16 reported that the average estimate’s accuracy corresponded to only 1.1–1.4 persons’ first estimates. Therefore, this study aims to propose a method that can ‘boost’33–37 the wisdom-of-inner-crowd effect.
To do so, we combined multiple methods proposed in previous studies. The first uses the power of forgetting. For a single question, a previous study13 gave participants a timespan (two weeks) between two estimates and found that the wisdom of the inner crowd emerged. The second approach uses the power of dialectics. Previous studies14,19,20 asked participants to consider the opposite in their second estimates and showed that people could utilise the wisdom of the inner crowd (called ‘dialectical bootstrapping’). The third method involves perspective-taking38,39. It is well known that taking others’ perspectives changes various forms of cognition (e.g., stereotypic biases40, preferential values28, and egocentric thinking41). Based on these findings, previous studies have asked participants to consider others’ perspectives in their second estimates. To access the perspectives of others, one study30 used general crowds, and others29,31,32 used a person who disagreed with the participants. Both lines of work reported that people can produce the wisdom of the inner crowd.
As illustrated above, these methods differ in instructing participants to produce estimates. Therefore, it seems possible that, by combining these methods, an individual can boost the wisdom-of-inner-crowd effect. The purpose of this study was to test this hypothesis.
Note that, to the best of our knowledge, only one previous study has shown that an ensemble of multiple methods can enhance the wisdom of the inner crowd42. The study can be summarised as follows: Participants performed correlation assessments (e.g., the height and weight of male students, SAT section scores). Six methods were used for the assessments. For example, one method focused on the Strength of Relationship (e.g., ‘How would you characterize the strength and nature of the relationship between height and weight for male MBA students?’), while another involved estimating the Correlation (e.g., ‘What would you estimate the correlation between height and weight for male MBA students to be?’). They found that adding both experts and methods could improve accuracy.
The important point for our study is that the previous study42 has shown that even a single person can improve accuracy by combining these methods—that is, through the wisdom-of-inner-crowd effect by an ensemble method. Thus, these two studies share similar findings. However, there are key differences between our study and the previous one: The task in the previous study42 was limited to correlation assessments. In contrast, our method can be applied to all types of estimation tasks in which participants respond in percentages.
Overview of the experiments
The primary objective of this research is to propose and validate an ensemble method designed to enhance the accuracy of individual estimates by leveraging ‘the wisdom of the crowds’. In a pilot study, we initially examined whether averaging intuitive and deliberative judgments from the same individual could improve estimation accuracy compared to using either judgment alone. This provided preliminary evidence that combining distinct cognitive approaches might be beneficial.
Building upon this pilot, the subsequent experiment systematically tested our proposed ensemble method in a controlled setting. This ensemble method integrates multiple cognitive strategies: intuitive judgment, deliberative reasoning, dialectical thinking, general crowd perspective, and the perspective of a disagreeing individual. Results from this experiment robustly supported the effectiveness of our ensemble method, demonstrating significantly greater accuracy compared to repeated judgments without specific instructions.
Pilot study
Along with the above methods, this study also proposes a new method. For a single question, the method makes people think intuitively in the first estimate (called ‘Intuition’; for the full instruction, see Table 1) and think deliberately in the second estimate (‘Deliberation’). The two estimates are then averaged (for optimal weighting, see S1 in the Supplementary Information).
Table 1.
Type of estimate and detailed instructions.
| Type of estimate | Instruction |
|---|---|
| Intuition | Please do not think deliberately. Answer quickly what comes to mind intuitively. Please answer within 8 s. |
| Deliberation | Please ignore your intuition. Think deliberately before answering. Utilize your knowledge and experience, while being aware of the basis of the estimate. There is no time limit. Please use enough time to think about the following question. |
| Dialectic | First, assume that your last estimate is off the mark. Second, think about a few reasons why that could be. Which assumptions and considerations could have been wrong? Third, what do these new considerations imply? Was the second estimate too high or too low? Fourth, based on this new perspective, make an alternative estimate. (The computer display shows the second estimate). |
| General crowd’s perspective | How do you think people, in general, estimate the following question? Think and answer how people, in general, estimate this. Please do not answer your own estimate. |
| Disagree-other’s perspective | Now, picture a friend whose views and opinions are very different from yours. To illustrate, when discussing politics, societies, and daily affairs, you often find yourself disagreeing on various issues. How would he or she answer the following six questions? Answer these questions now as this friend. Please do not answer your own estimates. |
This method is based on the findings related to rule selection in judgment40. Researchers discuss judgments that involve multiple rules and suggest that successful judgment requires sufficient processing capacity. For example, rule selection depends on factors such as processing motivation43,44 and mental fatigue45. Importantly, the need for cognition46 and cognitive closure47–49 also influence rule selection. In our context, intuitive and deliberate judgments may engage in different rule selection processes, resulting in non-redundant errors, although our estimation task did not involve explicit rules.
Only the Dialectic estimate required the second (i.e., Deliberation) estimate to be displayed, while the other estimates did not. All instructions were translated into Japanese.
In the Pilot study, we examined the effectiveness of this method for general knowledge questions (Table 2; see the Methods section for more detail). Participants made one Intuition estimate and one Deliberation estimate for each question. We calculated the MSE (Mean Squared Error) for Intuition, Deliberation, and the average of the two estimates (for the analysis of the Mean Absolute Error, see S4 in the Supplementary Information). We found that the average of the two estimates was significantly more accurate than the Intuition estimate (t = 2.97, p = 0.0084, Cohen’s d = 0.38 [0.73, 0.03]; Fig. 1). No significant difference was observed between Intuition and Deliberation (t = 0.67, p = 0.78, Cohen’s d = 0.06 [0.40, − 0.29]).
Table 2.
Questions and correct answers used in the experiments.
| Number | Question | Answer (%) |
|---|---|---|
| 1 | What percent of the world’s roads are in India? | 9.7 |
| 2 | What percent of the world’s telephone lines are in China, USA, or the European Union? | 52.0 |
| 3 | Saudi Arabia consumes what percentage of the oil it produces? | 72.1 |
| 4 | What percent of the world’s population is between 15 and 64 years old? | 65.2 |
| 5 | What percent of the world’s population is Christian? | 31.4 |
| 6 | What percent of the worldwide labour force works in the service sector? | 50.6 |
| 7 | What percent of the worldwide gross domestic product (GDP) is re-invested? | 25.2 |
| 8 | What percentage of Japanese adult males smoke? | 27.1 |
| 9 | What percentage of Japanese households have a fixed-line phone? | 69.0 |
| 10 | What percentage of the world’s countries have a higher life expectancy than the United States? | 22.6 |
Fig. 1.

Results of the pilot study. Average indicates the average of the estimates in Intuition and Deliberation. As the figure shows, the average estimate was (marginally) significantly more accurate than the estimates in Intuition and Deliberation.
Although the difference between the average of the two estimates and the Deliberation estimate did not reach significance (t = 3.75, p = 0.054), we found a small effect size favouring the accuracy of the average of the two estimates (Cohen’s d = 0.35 [0.70, 0.00]).
Consequently, we can consider this method as one that may lead to accurate estimates. Therefore, combining the estimates based on Intuition and Deliberation with other estimates (including one’s own estimate) could boost the wisdom-of-inner-crowd effect.
Thus, we found that the average of the Intuition and Deliberation estimates was significantly more accurate than the Intuition estimate. This experiment is related to the study by Keck and Tang10, wherein participants were instructed to provide estimates either intuitively or deliberately. Their findings revealed that the combination of intuition and deliberation enhances the wisdom-of-crowds effect. In particular, the group in which half of the participants thought intuitively and the other half thought deliberately recorded higher accuracy than the groups in which all the participants thought intuitively or all thought deliberately. Therefore, the Pilot study partially replicated the results of Keck and Tang10 at an individual level (but see also the Experiment section).
Note that we can also connect this study to previous studies on social circles50–54. These studies indicate that by using an individual’s knowledge of their social circles (e.g. friends), we can improve predictions such as those referring to the results of a political election53. We can assume that a prediction based on knowledge of the social circles is similar to the estimate from different perspectives. However, the aims of the studies were different. In contrast to previous research, our method aims to obtain an accurate average estimate. In other words, for our method, the estimates are not necessarily accurate.
We checked all the answers on 2022/10/27. Following previous studies19,30, we used the answers in The CIA World Factbook55, World Bank Open Data56, and the data from Japan’s Ministry of Internal Affairs and Communications57. The Pilot study used all the questions. The Experiment used Questions 2, 3, 4, 5, 6, and 10. All questions were translated into Japanese.
Experiment
Based on these findings, we propose a new method that combines the methods above. This method consists of making five estimates in response to a single question (Fig. 2; Table 1): (1) making an estimate intuitively, (2) making an estimate deliberately, (3) considering the opposite (i.e. dialectical bootstrapping), (4) taking the general crowd’s perspective, and (5) taking the disagree-other’s perspective. The estimates are then averaged (for optimal weighting, see S2 in the Supplementary Information). Because this study aimed to propose a method that could be used for everyday estimation, we excluded methods that required a two-week timespan13.
Fig. 2.

An illustration of our method. For a single question, a participant estimates five times. The order of the estimates is shown in this figure.
Importantly, previous studies have attempted to develop methods that consist of making five estimates for a single question. In Rauhut and Lorenz16, participants made five estimates without specific instructions. In Fujisaki et al.30, participants first provided their estimates and then four estimates from the perspective of general crowds. However, the results indicated that these methods were ineffective. In particular, the average of the five estimates was not more accurate than that of the first two estimates.
Therefore, the key question in this research is whether our method (hereafter, ‘Ensemble method’) could produce accurate averaged estimates. In the following sections, through a behavioural experiment, we confirm that the Ensemble method can produce estimates whose accuracy increases monotonically (i.e., over the two estimates). Moreover, we compared the Ensemble method condition with the control condition, which asked participants to estimate five times without specific instruction (‘Repeated condition’). We found that the Ensemble method was more effective than the method used in the control condition.
The experimental settings were similar to those of meta-prediction methods7,58–61 which instructed participants to predict what others would predict. For example, Prelec et al.7 examined this method. In particular, they proposed an algorithm that selected the answer that was more popular than people had predicted. The results showed that this method can correct the bias in crowds and enhance the wisdom-of-crowds effect. As illustrated above, the proposed method differs from meta-prediction methods in several respects. First, our method aims to improve an individual’s (not a crowd’s) estimate. Second, our method averages the estimates. Nevertheless, we assume that our method and the meta-prediction methods together demonstrate that we can enhance the wisdom of (the inner) crowd by regulating people’s thinking.
Note that we conducted mixed-effects analysis for the statistical test (including the analysis in the Supplementary Information). We selected the best model and computed all the statistical values using the step() function in R for the full model with random participants and stimulus (i.e., question) intercepts (see ‘mixed-effects analysis’ section in Methods).
Results
Main results
Figure 3 presents the results of this analysis. The mean squared error (MSE) was used as an index of estimate accuracy (for the Mean Absolute Error, see S4 in the Supplementary Information). We calculated the MSE for each participant across all questions after averaging that participant’s estimates. Let us define the correct answer as θ and the participant’s k-th estimate as $E_k$. For example, when the number of estimates was three, the squared error for a question was calculated as follows, and the MSE was its mean over questions:

$$\left(\frac{E_1 + E_2 + E_3}{3} - \theta\right)^2$$
Fig. 3.
Results of the experiment. When the number of estimates was 1, the Ensemble method condition had a larger MSE than the Repeated condition. However, the larger the number of estimates, the smaller the MSE in the Ensemble method condition. In contrast, the MSE in the Repeated condition remained flat. As a result, the Ensemble method condition had a smaller MSE than the Repeated condition when the number of estimates was 5 (p = 0.016; comparing the average estimates when the number of estimates was 5).
Subsequently, we manipulated the number of estimates from one to five. As shown in Fig. 3, the MSE in the Repeated condition remained flat, as reported in previous studies16,30. In other words, the accuracy did not tend to increase even if the number of estimates increased. In contrast, in the Ensemble method condition, the larger the number of estimates, the more the MSE decreased, although the slope tended to be slightly attenuated.
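The manipulation described above (averaging the first k estimates and scoring the average against the correct answer) can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function names and the toy data are hypothetical, and it assumes that the estimates for each question are stored in the order in which they were given.

```python
from statistics import mean

def squared_error_of_average(estimates, theta):
    """Squared error of the mean of `estimates` against the correct answer `theta`."""
    return (mean(estimates) - theta) ** 2

def mse_by_number_of_estimates(responses, answers):
    """
    responses: one list per question, holding one participant's estimates in order.
    answers:   the correct answers, aligned with `responses`.
    Returns a list whose (k-1)-th entry is the MSE when the first k estimates are averaged.
    """
    max_k = min(len(r) for r in responses)
    curve = []
    for k in range(1, max_k + 1):
        errors = [squared_error_of_average(r[:k], theta)
                  for r, theta in zip(responses, answers)]
        curve.append(mean(errors))
    return curve

# Hypothetical participant: two questions, three estimates each (in percent).
responses = [[40, 60, 50], [10, 30, 20]]
answers = [50.0, 20.0]
curve = mse_by_number_of_estimates(responses, answers)
print(curve)  # [100.0, 0.0, 0.0]
```

In this toy case the first estimate is off by 10 points on both questions, while the errors of the first two estimates cancel, so the MSE drops as soon as two estimates are averaged.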
First, we examined whether the Ensemble method condition reproduced the wisdom of the inner crowd. We could regard the first estimate in the Repeated condition as the participants’ ‘own estimate’. Therefore, we used this estimate in our analysis. We found that the MSE of the average of the five estimates in the Ensemble method condition was lower than that of the first estimate in the Repeated condition (t = 2.19, p = 0.029, Cohen’s d = 0.54 [1.00, 0.073]). The results suggest that the Ensemble method could elicit the wisdom of the inner crowd.
Second, and more importantly, the MSE of the average of the five estimates in the Ensemble method condition was lower than that of the average of the five estimates in the Repeated condition (t = 2.51, p = 0.012, Cohen’s d = 0.59 [1.06, 0.13]). Thus, the Ensemble method condition was more effective than the Repeated condition.
Moreover, the results showed that the Ensemble method produced estimates with monotonically increasing accuracy. We applied Page’s trend test and found that the MSE in the Ensemble method condition decreased monotonically as the number of estimates increased (z = -5.15, p < 0.001). By contrast, we confirmed that the MSE in the Repeated condition neither increased nor decreased monotonically (z = 1.09, p = 0.28).
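For readers unfamiliar with Page’s trend test, the sketch below computes Page’s L statistic and its standard normal approximation from a participants-by-conditions table. It is an illustration under stated assumptions rather than the procedure used for the values reported above: the function names are hypothetical, the variance formula does not correct for ties, and the sign convention (negative z when values fall across the columns) is a choice made here.

```python
import math

def average_ranks(values):
    """Rank values from 1 (smallest) upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = shared
        pos = end + 1
    return ranks

def page_trend_z(rows):
    """
    rows: one list per participant; columns are ordered by the number of estimates.
    Returns (L, z), where L is Page's statistic and z its normal approximation.
    """
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    L = sum((j + 1) * rank_sums[j] for j in range(k))
    expected = n * k * (k + 1) ** 2 / 4
    variance = n * k ** 2 * (k + 1) * (k ** 2 - 1) / 144
    return L, (L - expected) / math.sqrt(variance)

# Hypothetical data: 10 participants whose MSE falls with every added estimate.
falling = [[5, 4, 3, 2, 1] for _ in range(10)]
L_stat, z = page_trend_z(falling)  # z is negative for a falling trend
```

A production analysis would more likely rely on an established implementation (for example, SciPy provides `scipy.stats.page_trend_test`) than on this hand-rolled version.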
Comparison with the wisdom of (the outer) crowds
Further analyses were performed to address the effectiveness of the Ensemble method. In particular, we compared the Ensemble method condition (and the Repeated condition) with the wisdom of the crowds. In the Repeated condition, participants first provided their own estimates. Therefore, by pooling these first estimates across participants, we obtained the wisdom-of-crowds effect in the general sense (hereafter, the ‘Repeated-first group’).
We fitted a nonlinear mixed-effects model using Bayesian parameter estimation (for more details, see S10 in the Supplementary Information). We assumed that the relationship between the number of estimates (T) and MSE could be represented by a hyperbola as follows:

$$\mathrm{MSE} = \frac{a}{T} + b$$

where a represents the magnitude of the wisdom-of-inner-crowd effect and b represents the error when the number of estimates is infinite.
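To make the roles of a and b concrete, the following sketch fits a hyperbola of the form MSE = a / T + b by ordinary least squares, treating 1 / T as the predictor. Two assumptions should be stressed: the functional form is inferred from the stated meaning of a and b, and ordinary least squares is used here only as a simple stand-in for the Bayesian nonlinear mixed-effects model reported in the paper. The data points are hypothetical.

```python
def fit_hyperbola(ts, mses):
    """Least-squares fit of MSE = a / T + b, using x = 1 / T as the predictor."""
    xs = [1.0 / t for t in ts]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(mses) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, mses))
    a = sxy / sxx            # how quickly the error shrinks as estimates are added
    b = y_bar - a * x_bar    # error that remains as T grows without bound
    return a, b

# Hypothetical points generated exactly from a = 200, b = 100.
ts = [1, 2, 3, 4, 5]
mses = [200 / t + 100 for t in ts]
a, b = fit_hyperbola(ts, mses)
```

Because the toy points lie exactly on the curve, the fit recovers a and b up to floating-point error; real data would of course scatter around the fitted line.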
Figure 4 presents the results of this analysis. In the Repeated-first group, the larger the group size, the smaller the MSE. Thus, we confirmed that the wisdom-of-crowds effect emerged in the Repeated-first group. Compared to the Repeated-first group, the Ensemble method condition had a more gradual slope. In other words, the Ensemble method condition had a weaker wisdom-of-crowds effect than the Repeated-first group.
Fig. 4.
Results of the model fitting. The points indicate actual values and the solid lines represent the estimated hyperbolas. To compute $T_T$, we first projected the MSE in the Ensemble method condition onto the hyperbola of the Repeated-first group. Subsequently, we observed the x value and regarded it as $T_T$. To compute $T_\infty$, we used parameter b, since b represents the MSE when the number of estimates is infinite. We projected b onto the hyperbola of the Repeated-first group and observed $T_\infty$.
Subsequently, we performed a quantitative comparison. We used $T_T$, which represents the number of people in the Repeated-first group to which T estimates in the Ensemble method condition correspond. For example, when $T_T$ = 1.3, the Ensemble method condition corresponded to 1.3 persons in the Repeated-first group.
Table 3 presents the results of the analysis. When T was 5, $T_5$ was over 1.5 (i.e., 1.51). In other words, the effect of the Ensemble method was larger than that of 1.5 persons in the Repeated-first group. To the best of our knowledge, this value is larger than those of the methods tested in previous studies, especially those focusing on general knowledge questions13,16. However, even if the number of estimates was infinite, $T_T$ could not exceed two persons in the Repeated-first group (i.e., $T_\infty$ = 1.90). Thus, both the effectiveness and the limitations of the Ensemble method can be observed.
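The projection used to obtain the equivalent number of persons can be sketched as the inversion of the reference hyperbola. The snippet assumes that the Repeated-first curve has the form MSE = a / T + b; the function name and the parameter values are hypothetical and are not the fitted values from the study.

```python
def equivalent_crowd_size(mse, a, b):
    """
    Invert MSE = a / T + b for T.
    Returns None when `mse` is at or below the asymptote b, because no finite
    crowd on the reference curve reaches that error.
    """
    if mse <= b:
        return None
    return a / (mse - b)

# Hypothetical reference curve for the Repeated-first group.
a_ref, b_ref = 200.0, 100.0

one_person = equivalent_crowd_size(300.0, a_ref, b_ref)  # 1.0: same error as a single person
boosted = equivalent_crowd_size(250.0, a_ref, b_ref)     # about 1.33 persons
unreachable = equivalent_crowd_size(100.0, a_ref, b_ref) # None: at the asymptote
```

Under this reading, the value for an infinite number of estimates would be obtained by passing the asymptote of the Ensemble curve as `mse`.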
Table 3.
Results of $T_T$.
| | Repeated | Ensemble method |
|---|---|---|
| $T_2$ | 1.02 | 1.18 |
| $T_3$ | 0.99 | 1.33 |
| $T_4$ | 0.99 | 1.40 |
| $T_5$ | 0.96 | 1.51 |
| $T_\infty$ | 0.97 | 1.90 |
In the Ensemble method condition, $T_T$ exceeded 1.5 when T was 5. By contrast, in the Repeated condition, the values were around 1 irrespective of T.
Decomposition of the error
How does the Ensemble method condition produce more accurate estimates than the Repeated condition? It is well known that the wisdom-of-crowd effect can be represented mathematically5. Note that we translated the equation into our context (i.e., the wisdom of the inner crowd). Let us define $E_i$ as the estimate of group member i (here, an individual’s i-th estimate), $\langle E_i \rangle$ as its average over an individual’s estimates, and θ as the correct answer. Then, the equation is:

$$\left(\langle E_i \rangle - \theta\right)^2 = \left\langle (E_i - \theta)^2 \right\rangle - \left\langle (E_i - \langle E_i \rangle)^2 \right\rangle$$

Here, we refer to the left side of the equation as the collective error, which indicates the MSE. Therefore, a lower collective error indicates a more accurate averaged estimate. We refer to the first term on the right-hand side as the expected squared error. A lower expected squared error indicates that the individual estimates are more accurate. The second term on the right-hand side represents diversity. As the equation shows, because diversity is subtracted, higher diversity leads to better collective performance.
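The decomposition can be checked numerically on any set of estimates. The sketch below uses hypothetical numbers and a hypothetical function name; it simply confirms that the collective error equals the expected squared error minus the diversity.

```python
from statistics import mean

def decompose(estimates, theta):
    """Return (collective error, expected squared error, diversity) for one question."""
    avg = mean(estimates)
    collective = (avg - theta) ** 2
    expected = mean([(e - theta) ** 2 for e in estimates])
    diversity = mean([(e - avg) ** 2 for e in estimates])
    return collective, expected, diversity

# Hypothetical five estimates (in percent) for a question whose answer is 50.
collective, expected, diversity = decompose([20, 30, 40, 50, 60], 50)
print(collective, expected, diversity)  # 100 300 200
```

Here the individual estimates are poor on average (expected squared error of 300), but they are spread widely (diversity of 200), so the averaged estimate has an error of only 100.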
Table 4 presents the results of the analysis. As mentioned above, the Ensemble method condition showed a lower collective error than the Repeated condition. Importantly, the Ensemble method condition showed a higher expected squared error than the Repeated condition. In other words, the estimate in the Ensemble method condition tended to be less accurate than that in the Repeated condition (see also S6 in the Supplementary Information). However, the Ensemble method condition had significantly larger diversity than the Repeated condition (see S7 in the Supplementary Information). Consequently, the Ensemble method condition had a lower collective error (i.e., better performance) than the Repeated condition.
Table 4.
Results of the decomposition of the error. We conducted bootstrapping62 based on 10,000 resamples with replacement.
| | Collective error | Expected squared error | Diversity |
|---|---|---|---|
| Ensemble method | 225.8 [184.9, 270.4] | 438.8 [391.7, 486.7] | 212.9 [202.8, 223.3] |
| Repeated | 315.0 [259.9, 374.2] | 359.0 [303.1, 419.4] | 43.9 [39.5, 48.6] |
| Repeated-first | 108.2 [90.2, 127.0] | 300.3 [275.4, 325.8] | 192.0 [183.8, 200.2] |
Table 4 also displays the results of the Repeated-first group analysis. For this group, we defined $E_i$ as the first estimate of group member i and $\langle E_i \rangle$ as its average over all group members. As shown in the table, the Repeated-first group exhibited a lower expected squared error than the Ensemble method. In addition, the Repeated-first group had approximately the same diversity as the Ensemble method. As a result, the Ensemble method was not more effective than the Repeated-first group.
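The bracketed intervals in Table 4 come from bootstrapping with 10,000 resamples. A minimal percentile-bootstrap sketch is given below. It assumes that the resampled units are per-participant values and that a percentile interval of the mean is wanted; neither detail is specified in the text above, and the function name and data are hypothetical.

```python
import random
from statistics import mean

def bootstrap_ci(values, n_resamples=10_000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample with replacement, record the mean of each resample, then sort.
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_resamples))
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return means[lo_idx], means[hi_idx]

# Hypothetical per-participant collective errors.
errors = [225, 180, 260, 300, 150, 240, 210, 275, 190, 230]
low, high = bootstrap_ci(errors)
```

The interval brackets the sample mean of the hypothetical values (226) and narrows as the number of participants grows.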
Discussion
This study proposes a method that boosts the wisdom-of-inner-crowd effect. Our method requires participants to provide five estimates in response to a question: (1) making an estimate intuitively (Intuition); (2) making an estimate deliberately (Deliberation); (3) considering the opposite (Dialectic); (4) taking the general crowd’s perspective (General crowd’s perspective); and (5) taking the disagree-other’s perspective (Disagree-other’s perspective). The five estimates are then averaged.
We first confirmed that our method recorded higher accuracy than the first estimate and the average of the five estimates in the control condition (i.e. the Repeated condition). Moreover, the results show that our method produced estimates whose accuracy increased monotonically with the increasing number of estimates. Furthermore, through mathematical modelling, we found that the estimation accuracy of our method was higher than 1.5 persons’ estimates.
We do not claim that this is the only method that boosts the wisdom-of-inner-crowd effect by combining multiple methods.
We conducted a post-hoc analysis of the Intuition-Deliberation method. Specifically, we compared the reduction of MSE from the first to second guesses between the Ensemble and Repeated conditions. The results showed that the Ensemble condition had a larger reduction in MSE compared to the Repeated condition (t = 3.148, p = 0.002, Cohen’s d = 0.71 [1.19, − 0.23]). However, when comparing the MSE obtained from the second guess, we found no significant difference between the conditions (t = 0.85, p = 0.40, Cohen’s d = 0.1 [0.57, − 0.36]), although the average MSE was lower in the Ensemble condition (270.09) than in the Repeated condition (301.06). In other words, there is no clear evidence that combining Intuition and Deliberation yields more accurate results than simply having individuals provide two repeated estimates. Thus, regarding the replication of Keck and Tang’s10 findings, our results constitute only a partial replication, and further research on this issue is warranted.
Therefore, in the future, it would be important to explore other methods based on the power of combining. Note that all combinations of the five estimates (including instructions, order, and time gaps) should be explored, but here we will give one example, for five estimations: (1) first guess, (2) dialectical reasoning, (3) general crowd estimation, (4) disagree-other, and (5) a last guess. Between the initial guess and the final guess, a moderate timespan would be allowed, as tested by Vul and Pashler13. This approach can be regarded as a timespan-based ensemble method.
Another promising approach is to encourage participants to provide different estimates directly. For example, through the anchoring paradigm, a previous study63 successfully prompted participants to give lower and higher estimates, thereby increasing the diversity of estimates and enhancing the wisdom-of-crowds effect. Based on these findings, we could potentially improve the wisdom-of-inner-crowd effect using similar instructions.
Since Vul and Pashler13 and Herzog and Hertwig14 demonstrated that individuals could use the wisdom of crowds, many studies have focused on this. However, its shortcomings have been highlighted. Rauhut and Lorenz16 pointed out that people cannot produce accurate estimates by increasing the number of estimates. Nevertheless, for over a decade, effective methods to overcome this shortcoming have not been proposed. In this paper, we propose an effective method that combines multiple methods proposed in previous studies. By utilising this method (i.e. the Ensemble method), we can significantly enhance the accuracy of our estimates.
Limitations of the study
Finally, we describe the limitations of this study. We developed the Ensemble method, especially the fixed order of the five estimates, for the following reasons. First, we could not set the Dialectic as the first estimate method because it would make an individual ‘re-consider’ the previous estimate. Second, we considered that we should set the Intuition as the first estimate because it was the estimate by which an individual answered what came to mind first. However, we do not claim that the order of the estimates in our method is the only one which can boost the wisdom-of-inner-crowd effect. Therefore, further studies using this methodology should be conducted. For example, we excluded the timespan method proposed by Vul and Pashler13 from our method because it required participants to commit for a long time (two weeks). Thus, by adding the timespan method to our method, an individual may enhance the wisdom-of-inner-crowd effect.
Moreover, in the Experiment, we compared our method with a control condition (i.e. Repeated condition) that did not include specific instructions. However, we did not directly compare our method with those used in previous studies13,14,30. Therefore, further studies comparing these methods are warranted.
Another limitation of this study is the type of questions. Following previous studies13,17,18,22,29,30, we used questions whose correct answers are percentages. However, other types of questions exist. For example, some studies used the years of historical events14,17 whereas others used numerical estimation tasks16,17,21,29. A review study18 reported that the wisdom-of-inner-crowd effect was maintained across different types of questions. However, how well our method works for other types of problems remains unclear. Notably, our method may also enhance the accuracy of choice tasks. Because it requires participants to answer five times, it is possible that aggregation rules, such as the majority rule, work effectively. For example, in a binary choice task, even if Intuition and Deliberation were incorrect, we could make an accurate inference by the majority rule (i.e. when Dialectic, General crowd’s perspective, and Disagree-other’s perspective were correct). Further research should be conducted to generalise the findings of this study.
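The majority-rule idea for choice tasks can be illustrated with a few lines of code. This is a hypothetical sketch of an aggregation rule that was not tested in this study; the option labels and the function name are made up for the example.

```python
from collections import Counter

def majority_vote(choices):
    """Return the most frequent option; five binary answers can never tie."""
    return Counter(choices).most_common(1)[0][0]

# Hypothetical binary task whose correct option is "B".
# Intuition and Deliberation pick "A"; the other three estimates pick "B".
answers = ["A", "A", "B", "B", "B"]
decision = majority_vote(answers)  # "B"
```

With an odd number of estimates and two options, the rule always returns a unique answer, which is one reason the five-estimate design lends itself to this kind of aggregation.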
Moreover, there are several methodological limitations that must be acknowledged. First, participants were not provided with accuracy-based incentives, and the descriptions of these incentives were inconsistently applied across different conditions. Future research should explicitly test whether accuracy incentives influence the effectiveness of the Ensemble method, ensuring that such incentives are consistently described and uniformly implemented. Second, the Experiment introduced a 30-minute interval between two rounds of estimation in the Ensemble condition, whereas this interval was not included in the Repeated condition. This discrepancy introduces a potential confound and should be addressed in future studies by standardizing the timing of estimations across all experimental conditions.
Additionally, another promising approach to further enhance the wisdom-of-inner-crowd effect is to integrate new methods into our existing approach. In our experiment, the performance of our method did not surpass the accuracy achieved by averaging two separate individuals’ estimates. Specifically, even if an infinite number of estimates were combined using our approach, the accuracy ($T_T$) would not exceed that of two individuals in the Repeated-first group. Recently, new methods64 have been proposed that effectively boost the wisdom of the inner crowd. Incorporating these innovative approaches into our current method might enable us to achieve accuracy levels that surpass those obtained from averaging two distinct individuals’ estimates.
Methods
Two experiments were conducted using Qualtrics software. All participants provided informed consent before participating in the study. The experimental protocol was approved by the Research Ethics Committee of the University of Tokyo and was conducted in accordance with the 2013 version of the Declaration of Helsinki.
Details of the Pilot study
Participants: The participants were 64 undergraduate and graduate students (24 women, 39 men, and one who did not want to respond; Mage = 21.25 and SDage = 2.24). After the experiment, they received a flat fee of 1,000 Japanese yen (approximately $9.17 at the exchange rate at the time) for participation.
Stimulus: Based on previous studies18,30, we prepared ten questions about general knowledge (Table 2).
Procedure: We set only one condition in this experiment, in which all participants provided two estimates for each question (Sets 1–2).
In Set 1, participants answered all ten questions intuitively. In this set, we also set a time limit of eight seconds, based on previous studies65,66 that manipulated participants’ thinking styles. After answering each question, participants rated their confidence (see S3 in the Supplementary Information). Between Sets 1 and 2, we set a 30-minute interval because of the experimental design. During this period, the participants performed an irrelevant task in which they were instructed to make binary choices about consumer products.
In Set 2, participants answered all ten questions deliberately. In this set, we did not set any time limits. After answering all the questions, participants answered questions about their thinking styles in this set.
We randomised the order of the questions for each participant. Each participant’s randomised order remained constant across the two sets.
Details of the Experiment
Participants: The participants were 76 Japanese undergraduate students. In this experiment, we set two conditions: the Ensemble method condition and the Repeated condition. Participants were randomly assigned to one of the two conditions (Ensemble method condition: n = 38, Mage = 19.58, SDage = 0.86, 24 women, 14 men; Repeated condition: n = 38, Mage = 19.74, SDage = 0.92, 23 women, 14 men, and one who did not want to respond).
Stimulus: We prepared six questions on general knowledge (Table 2) based on those used in the Pilot study.
Procedure: In the Ensemble method condition, the participants answered each question five times based on the instructions. They produced the Intuition, Deliberation, Dialectic, General crowd’s perspective, and Disagree-others’ perspective estimates in this order. We randomised the order of the questions for each participant. Across the five estimates, the randomised order of the questions remained constant. Note that in the Ensemble method condition, we set a 30-minute interval between the Dialectic and General crowd’s perspective estimates. During this period, the participants performed an irrelevant task in which they were instructed to make binary choices about consumer products.
This experiment did not include a participation fee because it was conducted as part of a psychology class. In the Repeated condition, following a previous study, we asked participants to imagine that they would be rewarded according to the accuracy of each estimate (i.e., not the averaged estimate). Note that, at the start of the experiment, we explicitly informed participants that the payment was only hypothetical; they were not actually paid. In the Ensemble method condition, we did not give participants this description because they were expected to produce each estimate according to the corresponding instruction (e.g., Dialectic), as mentioned above.
In the Repeated condition, the participants also answered each question five times, following the instructions for that condition. We randomised the order of the questions for each participant. Across the five estimates, the randomised order of the questions remained constant. Note that in the Repeated condition, participants performed the same irrelevant task as in the Ensemble method condition for 30 minutes after completing the estimation task.
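As an illustration of how the five estimates for one question could be aggregated, the following minimal Python sketch compares the error of the first estimate with the error of the average of all five. It is not the authors' analysis code: the function name `inner_crowd_errors`, the use of absolute error as the accuracy measure, and the numbers are assumptions made only for this example.

```python
from statistics import mean

def inner_crowd_errors(estimates, truth):
    """Return (error of the first estimate, error of the averaged estimate).

    Absolute error is used here as an assumed accuracy measure.
    """
    first_error = abs(estimates[0] - truth)
    averaged_error = abs(mean(estimates) - truth)
    return first_error, averaged_error

# Made-up percentage estimates from one participant for one question.
estimates = [30.0, 45.0, 40.0, 55.0, 50.0]
truth = 48.0  # hypothetical correct answer

first_error, averaged_error = inner_crowd_errors(estimates, truth)
print(first_error, averaged_error)
```

With these made-up numbers the averaged estimate is closer to the assumed correct answer than the first estimate, which is the pattern the wisdom-of-inner-crowd effect refers to; real data need not show it for every participant or question.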
Mixed-effects analysis
All mixed-effects analyses (see also previous studies29) were conducted using the R (version 4.1.1) packages lme4 and lmerTest. Starting from the full model with random intercepts for participants and stimuli67, we selected the best model and computed all the statistical values using the step() function in R. All multiple comparisons were performed using the R packages lsmeans and pbkrtest68.
Supplementary Information
Below is the link to the electronic supplementary material.
Acknowledgements
This research was supported by JST CREST, Grant Number JPMJCR19A1 and Japan Society for the Promotion of Science KAKENHI Grant Number JP23K17010.
Author contributions
IF, LY, and KU developed the study concept and contributed to the study design. LY and YT collected data. IF, LY, and YK analysed the data. IF wrote the manuscript, with feedback from YK and KU. KU and YK won funding.
Data availability
The R code and the three datasets analysed during the current study are available at Mendeley Data: https://data.mendeley.com/datasets/yx7xscnxg8/1.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Itsuki Fujisaki, Email: bpmx3ngj@gmail.com.
Kazuhiro Ueda, Email: ueda@g.ecc.u-tokyo.ac.jp.
References
- 1. Surowiecki, J. The Wisdom of Crowds (Anchor, 2004).
- 2. Lorenz, J., Rauhut, H., Schweitzer, F. & Helbing, D. How social influence can undermine the wisdom of crowd effect. Proc. Natl. Acad. Sci. 108, 9020–9025. 10.1073/pnas.1008636108 (2011).
- 3. Hertwig, R. Tapping into the wisdom of the crowd–with confidence. Science 336, 303–304. 10.1126/science.1221403 (2012).
- 4. Becker, J., Brackbill, D. & Centola, D. Network dynamics of social influence in the wisdom of crowds. Proc. Natl. Acad. Sci. 114, E5070–E5076. 10.1073/pnas.1615978114 (2017).
- 5. Jayles, B. et al. How social information can improve estimation accuracy in human groups. Proc. Natl. Acad. Sci. 114, 12620–12625. 10.1073/pnas.1703695114 (2017).
- 6. Analytis, P. P., Barkoczi, D. & Herzog, S. M. Social learning strategies for matters of taste. Nat. Hum. Behav. 2, 415–424. 10.1038/s41562-018-0343-2 (2018).
- 7. Prelec, D., Seung, H. S. & McCoy, J. A solution to the single-question crowd wisdom problem. Nature 541, 532–535. 10.1038/nature21054 (2017).
- 8. Fujisaki, I., Honda, H. & Ueda, K. Diversity of inference strategies can enhance the ‘wisdom-of-crowds’ effect. Humanit. Soc. Sci. Commun. 4, 107. 10.1057/s41599-018-0161-1 (2018).
- 9. Almaatouq, A. et al. Adaptive social networks promote the wisdom of crowds. Proc. Natl. Acad. Sci. 117, 11379–11386 (2020).
- 10. Keck, S. & Tang, W. Enhancing the wisdom of the crowd with cognitive-process diversity: the benefits of aggregating intuitive and analytical judgments. Psychol. Sci. 31, 1272–1282. 10.1177/0956797620941840 (2020).
- 11. Tylén, K., Fusaroli, R., Østergaard, S. M., Smith, P. & Arnoldi, J. The social route to abstraction: interaction and diversity enhance performance and transfer in a rule-based categorization task. Cogn. Sci. 47, e13338. 10.1111/cogs.13338 (2023).
- 12. Collins, R. N., Mandel, D. R., Karvetski, C. W., Wu, C. M. & Nelson, J. D. The wisdom of the coherent: improving correspondence with coherence-weighted aggregation. Decision 11, 60–85. 10.1037/dec0000211 (2024).
- 13. Vul, E. & Pashler, H. Measuring the crowd within. Psychol. Sci. 19, 645–647. 10.1111/j.1467-9280.2008.02136.x (2008).
- 14. Herzog, S. M. & Hertwig, R. The wisdom of many in one mind. Psychol. Sci. 20, 231–237. 10.1111/j.1467-9280.2009.02271.x (2009).
- 15. Hourihan, K. L. & Benjamin, A. S. Smaller is better (when sampling from the crowd within): low memory-span individuals benefit more from multiple opportunities for estimation. J. Exp. Psychol. Learn. Mem. Cogn. 36, 1068–1074. 10.1037/a0019694 (2010).
- 16. Rauhut, H. & Lorenz, J. The wisdom of crowds in one mind: how individuals can simulate the knowledge of diverse societies to reach better decisions. J. Math. Psychol. 55, 191–197. 10.1016/j.jmp.2010.10.002 (2011).
- 17. Müller-Trede, J. Repeated judgment sampling: boundaries. Judgm. Decis. Mak. 6, 283–294 (2011).
- 18. Herzog, S. M. & Hertwig, R. Harnessing the wisdom of the inner crowd. Trends Cogn. Sci. 18, 504–506. 10.1016/j.tics.2014.06.009 (2014).
- 19. Herzog, S. M. & Hertwig, R. Think twice and then: combining or choosing in dialectical bootstrapping? J. Exp. Psychol. Learn. Mem. Cogn. 40, 218–232. 10.1037/a0034054 (2014).
- 20. Krueger, J. I. & Chen, L. J. The first cut is the deepest: effects of social projection and dialectical bootstrapping on judgmental accuracy. Soc. Cogn. 32, 315–336. 10.1521/soco.2014.32.4.315 (2014).
- 21. van Dolder, D. & van den Assem, M. J. The wisdom of the inner crowd in three large natural experiments. Nat. Hum. Behav. 2, 21–26. 10.1038/s41562-017-0247-6 (2018).
- 22. Steegen, S., Dewitte, L., Tuerlinckx, F. & Vanpaemel, W. Measuring the crowd within again: a pre-registered replication study. Front. Psychol. 5, 786. 10.3389/fpsyg.2014.00786 (2014).
- 23. van der Leer, L. & McKay, R. The optimist within? Selective sampling and self-deception. Conscious. Cogn. 50, 23–29. 10.1016/j.concog.2016.07.005 (2016).
- 24. Barneron, M., Allalouf, A. & Yaniv, I. Rate it again: using the wisdom of many to improve performance evaluations. J. Behav. Decis. Mak. 32, 485–492. 10.1002/bdm.2127 (2019).
- 25. Litvinova, A., Herzog, S. M., Kall, A. A., Pleskac, T. J. & Hertwig, R. How the wisdom of the inner crowd can boost accuracy of confidence judgments. Decision 7, 183–211. 10.1037/dec0000119 (2020).
- 26. Fiechter, J. L. & Kornell, N. How the wisdom of crowds, and of the crowd within, are affected by expertise. Cogn. Res. Princ. Implic. 6, 5. 10.1186/s41235-021-00273-6 (2021).
- 27. Gaertig, C. & Simmons, J. P. The psychology of second guesses: implications for the wisdom of the inner crowd. Manag. Sci. 67, 5921–5942. 10.1287/mnsc.2020.3781 (2021).
- 28. Fujisaki, I., Honda, H. & Ueda, K. A simple cognitive method to improve the prediction of matters of taste by exploiting the within-person wisdom-of-crowd effect. Sci. Rep. 12, 12413. 10.1038/s41598-022-16584-7 (2022).
- 29. Van de Calseyde, P. P. & Efendić, E. Taking a disagreeing perspective improves the accuracy of people’s quantitative estimates. Psychol. Sci. 33, 971–983. 10.1177/09567976211061321 (2022).
- 30. Fujisaki, I., Yang, K. & Ueda, K. On an effective and efficient method for exploiting the wisdom of the inner crowd. Sci. Rep. 13, 3608. 10.1038/s41598-023-30599-8 (2023).
- 31. Fiechter, J. L. Drawing generalizable conclusions from multilevel models: Commentary on Van de Calseyde and Efendić (2022). Psychol. Sci. 35, 694–699. 10.1177/09567976241245411 (2024).
- 32. Van de Calseyde, P. & Efendić, E. Taking a disagreeing perspective benefits the wisdom of inner crowds when people answer difficult (vs. easy) questions. Psychol. Sci. (in press).
- 33. Grüne-Yanoff, T. & Hertwig, R. Nudge versus boost: how coherent are policy and theory? Minds Mach. 26, 149–183. 10.1007/s11023-015-9367-9 (2016).
- 34. Hertwig, R. & Grüne-Yanoff, T. Nudging and boosting: steering or empowering good decisions. Perspect. Psychol. Sci. 12, 973–986. 10.1177/1745691617702496 (2017).
- 35. Hertwig, R. & Ryall, M. D. Nudge versus boost: agency dynamics under libertarian paternalism. Econ. J. 130, 1384–1415 (2020).
- 36. Kozyreva, A., Lewandowsky, S. & Hertwig, R. Citizens versus the internet: confronting digital challenges with cognitive tools. Psychol. Sci. Public Interest 21, 103–156 (2020).
- 37. Lorenz-Spreen, P. et al. Boosting people’s ability to detect microtargeted advertising. Sci. Rep. 11, 15541. 10.1038/s41598-021-94796-z (2021).
- 38. Epley, N., Keysar, B., Van Boven, L. & Gilovich, T. Perspective taking as egocentric anchoring and adjustment. J. Pers. Soc. Psychol. 87, 327–339. 10.1037/0022-3514.87.3.327 (2004).
- 39. Adida, C. L., Lo, A. & Platas, M. R. Perspective taking can promote short-term inclusionary behavior toward Syrian refugees. Proc. Natl. Acad. Sci. 115, 9521–9526. 10.1073/pnas.1804002115 (2018).
- 40. Galinsky, A. D. & Moskowitz, G. B. Perspective-taking: decreasing stereotype expression, stereotype accessibility, and in-group favoritism. J. Pers. Soc. Psychol. 78, 708–724. 10.1037/0022-3514.78.4.708 (2000).
- 41. Yaniv, I. & Choshen-Hillel, S. When guessing what another person would say is better than giving your own opinion: using perspective-taking to improve advice-taking. J. Exp. Soc. Psychol. 48, 1022–1028. 10.1111/j.1467-9280.2006.01704.x (2012).
- 42. Winkler, R. L. & Clemen, R. T. Multiple experts vs. multiple methods: combining correlation assessments. Decis. Anal. 1, 167–176. 10.1287/deca.1030.0008 (2004).
- 43. Kruglanski, A. W. & Gigerenzer, G. Intuitive and deliberate judgments are based on common principles. Psychol. Rev. 118, 97–109. 10.1037/a0020762 (2011).
- 44. Petty, R. E. & Cacioppo, J. T. The elaboration likelihood model of persuasion. In Advances in Experimental Social Psychology (ed. Berkowitz, L.) Vol. 19, 123–205 (Academic, 1986).
- 45. Tetlock, P. E. Accountability: A social check on the fundamental attribution error. Soc. Psychol. Quart. 48, 227–236. 10.2307/3033683 (1985).
- 46. Webster, D. M., Richter, L. & Kruglanski, A. W. On leaping to conclusions when feeling tired: mental fatigue effects on impression formation. J. Exp. Soc. Psychol. 32, 181–195. 10.1006/jesp.1996.0009 (1996).
- 47. Cacioppo, J. T. & Petty, R. E. The need for cognition. J. Pers. Soc. Psychol. 42, 116–131. 10.1037/0022-3514.42.1.116 (1982).
- 48. Kruglanski, A. W. The Psychology of Closed Mindedness (Psychology, 2004).
- 49. Kruglanski, A. W., Erb, H. P., Pierro, A., Mannetti, L. & Chun, W. Y. On parametric continuities in the world of binary either Ors. Psychol. Inq. 17, 153–165. 10.1207/s15327965pli1703_1 (2006).
- 50. Kruglanski, A. W. & Webster, D. M. Motivated closing of the mind: seizing and freezing. Psychol. Rev. 103, 263–283. 10.1037/0033-295X.103.2.263 (1996).
- 51. Galesic, M. et al. Asking about social circles improves election predictions. Nat. Hum. Behav. 2, 187–193. 10.1038/s41586-021-03649-2 (2018).
- 52. Bruine de Bruin, W., Parker, A. M., Galesic, M. & Vardavas, R. Reports of social circles’ and own vaccination behavior: a national longitudinal survey. Health Psychol. 38, 975–983. 10.1037/hea0000771 (2019).
- 53. Bruine de Bruin, W., Galesic, M., Parker, A. M. & Vardavas, R. The role of social circle perceptions in false consensus about population statistics: evidence from a national flu survey. Med. Decis. Mak. 40, 235–241. 10.1177/0272989X20904960 (2020).
- 54. Bruine de Bruin, W. et al. Asking about social circles improves election predictions even with many political parties. Int. J. Public Opin. Res. 34, edac006. 10.1093/ijpor/edac006 (2022).
- 55. Central Intelligence Agency. The CIA World Factbook 2020–2021. (2020).
- 56. World Bank. Agricultural land (% of land area). World Bank Data. https://data.worldbank.org/indicator/AG.LND.AGRI.ZS (accessed 10 Aug 2025).
- 57. Ministry of Internal Affairs and Communications. Internet usage by age group. White Paper 2020 on Information and Communications in Japan. https://www.soumu.go.jp/johotsusintokei/whitepaper/ja/r02/html/nd252110.html (accessed 10 Aug 2025).
- 58. Palley, A. B. & Soll, J. B. Extracting the wisdom of crowds when information is shared. Manag. Sci. 65, 2291–2309. 10.1287/mnsc.2018.3047 (2019).
- 59. Himmelstein, M., Budescu, D. V. & Ho, E. H. The wisdom of many in few: finding individuals who are as wise as the crowd. J. Exp. Psychol. Gen. 152, 1223–1244. 10.1037/xge0001340 (2023).
- 60. Wilkening, T., Martinie, M. & Howe, P. D. Hidden experts in the crowd: using meta-predictions to leverage expertise in single-question prediction problems. Manage. Sci. 68, 487–508. 10.1287/mnsc.2020.3919 (2022).
- 61. Palley, A. B. & Satopää, V. A. Boosting the wisdom of crowds within a single judgment problem: weighted averaging based on peer predictions. Manage. Sci. 69, 5128–5146. 10.1287/mnsc.2022.4648 (2023).
- 62. Efron, B. & Narasimhan, B. The automatic construction of bootstrap confidence intervals. J. Comput. Graph. Stat. 29, 608–619. 10.1080/10618600.2020.1714633 (2020).
- 63. Barrera-Lemarchand, F., Balenzuela, P., Bahrami, B., Deroy, O. & Navajas, J. Promoting erroneous divergent opinions increases the wisdom of crowds. Psychol. Sci. 35, 872–886. 10.1177/09567976241252138 (2024).
- 64. Gomilsek, T., Hoffrage, U. & Marewski, J. N. Fermian guesstimation can boost the wisdom-of-the-inner-crowd. Sci. Rep. 14, 5014 (2024).
- 65. Dane, E., Rockmann, K. W. & Pratt, M. G. When should I trust my gut? Linking domain expertise to intuitive decision-making effectiveness. Organ. Behav. Hum. Decis. Process. 119, 187–194. 10.1016/j.obhdp.2012.07.009 (2012).
- 66. Evans, A. M., Dillon, K. D. & Rand, D. G. Fast but not intuitive, slow but not reflective: decision conflict drives reaction times in social dilemmas. J. Exp. Psychol. Gen. 144, 951–966. 10.1037/xge0000107 (2015).
- 67. Bates, D., Mächler, M., Bolker, B. M. & Walker, S. C. Fitting linear mixed-effects models using lme4. J. Stat. Softw. 67, 1–48. 10.18637/jss.v067.i01 (2015).
- 68. Halekoh, U. & Højsgaard, S. pbkrtest: parametric bootstrap, Kenward-Roger and Satterthwaite based methods for test in mixed models. See https://CRAN.R-project.org/package=pbkrtest (2023).



