Abstract
Insight represents a sudden and profound understanding, offering a new perspective that can yield the solution to a previously intractable problem. Insight is tightly associated with an “Aha” experience. Although various theories have attempted to explain how insight emerges, the dynamic search process leading to insight remains poorly understood, primarily due to the complex nature of creative problem-solving. In this study, we employ two versions of the Japanese remote associates test (RAT) (n = 349 and n = 105 participants, respectively), alongside a simulation model. This allows us to quantitatively manipulate the constraints imposed on the problem and to track the search process within the solution space. Our findings indicate that insight and the accompanying “Aha” moment are characterized by exploration that spans greater distances within the solution space, thereby increasing the number of potential solutions available for evaluation.
Subject terms: Human behaviour, Cognitive neuroscience
Combining human data and a simulation, this study shows that problem-solving relies on a dynamic interplay between de-fixation and exploration, where broader exploration marks more insightful searches.
Introduction
Insight is a sudden and often unexpected realization or understanding of a problem’s solution. This phenomenon is closely linked to the concept of the “Aha” moment, which describes the moment of clarity when an individual experiences a breakthrough in thinking that leads to a novel solution1. The phenomenon of insight significantly contributes to many pivotal discoveries2, and exploring its underlying mechanisms can help devise techniques to enhance creativity and advance artificial intelligence development3. Insightful problem-solving, as opposed to the step-by-step analytical approach of non-insight solutions, is marked by the sudden realization of an answer, a strong conviction in its correctness, and the positive emotions that accompany it1,4–7. However, despite these distinct phenomenological characteristics, the mechanisms underlying the insightful search process during problem-solving remain a subject of debate.
Several mechanistic theories have been proposed to elucidate how a problem solver navigates the solution space to reach an initially elusive solution. The constraint relaxation theory proposes that an effective search process for insightful solutions may involve the relaxation or elimination of certain counterproductive constraints, subsequently opening up new areas of the solution space for exploration8–10. Furthermore, the progress monitoring theory suggests that the progress of the search is constantly assessed, and insight involves the detection of insufficient progress and a consequent strategic shift11,12. The former theory describes a de-fixation process aimed at dissolving mental blockages, while the latter theory suggests an exploration process that allows a variety of alternative strategies to be considered. To avoid confusion, it is important to clarify that exploration is a key implication of the progress monitoring theory, but it is not equivalent to monitoring itself.
In this study, we theorize that the processes of de-fixation and exploration interplay dynamically during the search for solutions, and this interplay unfolds distinctively during insightful searches compared to non-insightful searches. To test this hypothesis, we aim to quantitatively control and measure the constraints of the problem and to track the navigation of ideas within the solution space. For this purpose, we employ a Japanese remote associates test (RAT)13, a creativity assessment that challenges participants to find a common association among three seemingly unrelated kanji. This task is actually a compound remote associates (CRA) test14 rather than the original RAT15, as it has a narrower focus on compound word generation. Although the RAT and CRA are often categorized as a convergent thinking task, divergent thinking also plays a key role, particularly when the solutions are not already known or cannot be approached analytically16. Furthermore, this test is commonly utilized to evaluate individuals’ insight problem-solving abilities17, and to investigate the internal cognitive processes and physiological mechanisms involved18,19. Importantly, the RAT enables us to control and monitor the search process in the word space20–22.
We conduct two experiments using variations of the RAT. In Experiment 1, we control the level of fixation in a fixation-controlled RAT (FC-RAT) by providing additional cues to influence fixation13 and examine how this affects performance. In Experiment 2, we quantify step-by-step exploration in a thought-tracing RAT (TT-RAT), where participants are instructed to record each thought as it occurs. To further explore the influence of fixation on the search process, we introduce a simulation model that dynamically navigates through the word space outlined by a Japanese language corpus to solve RAT problems. This model is driven by two key factors: de-fixation and exploration capacity. De-fixation represents the ability to set aside incorrect thoughts to allow for the pursuit of alternative possibilities, while exploration capacity represents the range of alternative solutions available for consideration.
Methods
Experiment 1: FC-RAT
Participants
Behavioral data were collected from 372 participants. We screened out participants who provided no correct answers or had an average reaction time of less than 5 seconds (suggesting random button pushing). This resulted in a final sample of 349 participants (222 male participants and 127 female participants, sex determined by self-report, age = 33.9 ± 12.7 years, mean ± standard deviation). Of these, 244 participants completed the experiment online, and 105 participants completed the experiment on-site. Participants in the online experiment checked an anonymous informed consent form before the task, whereas participants in the on-site experiment provided signed informed consent before the task. All study protocols were approved by the Research Ethics Committee, and there was no preregistration for this study.
Task and procedure
In FC-RAT, the participants were shown three kanji simultaneously (denoted as Q1, Q2, and Q3) and asked to think of a fourth kanji (denoted as A) that forms a compound word with each of the given kanji (see Fig. 1). Notably, unlike the English RAT where the answer can be positioned before or after the question words, the answer kanji in the Japanese version must come after the given kanji to construct correct compounds. A “Fixation” condition was implemented to anchor the search process, thereby increasing the difficulty of the exploration. In this setup, participants were presented with three additional kanji, termed fixation cues (labeled C1, C2, and C3 in Fig. 1). Each cue was paired with one of the question kanji to form a meaningful word. Specifically, Q1-C1, Q2-C2, and Q3-C3 formed meaningful two-kanji words and were always presented as pairs in the Fixation condition. Crucially, these fixation cues were not the solution. As a control, a “Neutral” condition was introduced. In this condition, the participants were also presented with three additional kanji, known as neutral cues (labeled C1’, C2’, and C3’ in Fig. 1), but these cues had no relation to the question kanji (Q1-C1’, Q2-C2’, and Q3-C3’ were not meaningful 2-kanji words). To further clarify the concept of fixation cues, we provide an example in English. The question words are Q1 = “Board,” Q2 = “Magic,” and Q3 = “Death,” with the correct answer being “Black.” In this scenario, the fixation cues could be C1 = “Game,” C2 = “Show,” and C3 = “Match.” Although none of these cues are the correct answer, they might distract participants and impede them from arriving at the correct solution.
Fig. 1. Japanese FC-RAT.
The sequence of events within a single trial in the Fixation condition (top row) and the Neutral condition (bottom row). In both conditions, participants receive a question and are instructed to press a button once they have come up with an answer. Following this, they provide their answer; if it is correct, they will rate their Aha experience. Examples of the question and the corresponding cues are shown on the right. All the questions and cues are listed in Supplementary Table 1. The icon used in this figure was obtained from Iconfinder (https://www.iconfinder.com/) and is available under a free-to-use license.
Participants took the FC-RAT via a custom-designed website. First, they submitted basic personal information such as age, gender, and education level. A practice question was provided to familiarize them with the test format and procedure. Subsequently, they proceeded to answer 16 RAT questions, with 8 questions assigned to the Fixation condition and the remaining 8 to the Neutral condition. To ensure comparable question difficulty across both conditions, the 16 questions were selected from a pool of 79 questions by Terai et al.13, utilizing the accuracy data from the Neutral condition. From 8 distinct difficulty levels, matching question pairs were chosen, each pair reflecting similar levels of difficulty. For each participant, the pairs of questions within the same difficulty level were randomly distributed between the Fixation and Neutral conditions, thus maintaining consistent difficulty across conditions. Details of the 16 RAT questions, along with the specific fixation and neutral cues, are provided in Supplementary Table 1.
For each question, participants were given a 45-second time limit to respond. They were instructed to click a button when they believed they had the answer and then type and submit their response in a provided text box. A timer was displayed next to the answer button to remind participants of the remaining time. If the response was correct, they proceeded to rate the intensity of their Aha experience using a 7-point Likert scale, ranging from 1 for “no Aha” to 7 for “very strong Aha.” The Aha experience was defined as a sudden comprehension of the correct answer, using Japanese instructions equivalent to those described by Webb et al.7. An “Aha” trial was defined as a rating higher than 4 on the 7-point Likert scale, while a “No-Aha” trial was defined as a rating of 4 or lower. If they provided an incorrect answer or failed to respond within the time limit, the correct answer was displayed. Afterward, participants clicked a transition button to proceed to the next question. To prevent participants from searching for answers on other websites, the FC-RAT site blanked out if the participant switched windows.
Word distance measures
We retrieved the usage frequency of 2-kanji words from the Tsukuba Web Corpus (TWC), an extensive database of around 1.1 billion words collected from Japanese websites. Utilizing the NINJAL-LWP (National Institute for Japanese Language and Linguistics – Lago Word Profiler, https://tsukubawebcorpus.jp/en/), a search system co-developed by National Institute for Japanese Language and Linguistics and Lago Institute of Language, we extracted relevant data from the TWC. This resulted in a 2-kanji word corpus, which describes the occurrences of 32,688 2-kanji words composed of 3,819 unique kanji characters. We normalized these frequencies against the occurrence of the most frequent word (場合, “case” or “situation”), establishing a relative frequency scale from 0 to 1 (as shown in Fig. 2A).
Fig. 2. Word distance measures.
A The 2-kanji word corpus. The first and second kanji correspond to the y and x axes, respectively. The size of the circle represents the normalized frequency W ranging from 0 to 1 (the scale is shown at the upper right corner). The two most frequent words are highlighted in red. B The corpus locations of the normalized frequencies for WQ1, WQ2, WQ3, WC1, WC2, and WC3. C The visualization of the measurement of danswer. Each circle represents a kanji, with question kanji displayed in black and the answer kanji in red. The arrows from the question kanji to the answer kanji, along with their normalized frequencies, are indicated in red. D The visualization of the measurement of dcues. The arrows from the question kanji to the cue kanji, along with their normalized frequencies, are indicated in blue.
For each RAT question, we calculated the normalized frequencies from the question kanji Q1, Q2, and Q3 to the solution A, denoted as WQ1, WQ2, and WQ3, respectively (as illustrated in Fig. 2B). Additionally, we calculated the normalized frequencies from the question kanji Q1, Q2, and Q3 to their corresponding fixation cues C1, C2, and C3, denoted as WC1, WC2, and WC3, respectively. The values of WQ1, WQ2, and WQ3 describe how frequently the question kanji are associated with the answer, which can be used to quantify the difficulty of finding the answer. From this, we formulated a metric for the distance to answer, denoted as danswer (also see Fig. 2C):
$$d_{\mathrm{answer}} = -\frac{1}{3}\left[\log W_{Q1} + \log W_{Q2} + \log W_{Q3}\right] \qquad (1)$$
Essentially, it is the mean of the logarithmic values of WQ1, WQ2, and WQ3, negated so that less frequent associations yield larger distances. The logarithm was applied to account for the distribution’s skewness towards more commonly occurring words (see Supplementary Fig. 1). Similarly, the values of WC1, WC2, and WC3 describe how frequently the fixation cues are associated with the question kanji, which can be used to quantify the cues’ fixation effect. Therefore, we formulated a metric for the distance to cues, denoted as dcues (Fig. 2D):
$$d_{\mathrm{cues}} = -\frac{1}{3}\left[\log W_{C1} + \log W_{C2} + \log W_{C3}\right] \qquad (2)$$
Note that dcues is not applicable to neutral cues, as their frequencies are zero.
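The two distance metrics can be illustrated with a minimal sketch. It assumes the natural logarithm and a negated mean (so that rarer associations give larger distances); the function name `distance_from_frequencies` and all frequency values are hypothetical and are not taken from the TWC corpus.

```python
import math

def distance_from_frequencies(weights):
    """Negated mean of the log normalized frequencies (larger = more remote).

    Assumes natural log; the base is not specified in the text.
    """
    if not weights or any(w <= 0 or w > 1 for w in weights):
        raise ValueError("weights must be non-empty and lie in (0, 1]")
    return -sum(math.log(w) for w in weights) / len(weights)

# Hypothetical normalized frequencies, for illustration only.
w_answer = [0.02, 0.005, 0.01]  # W_Q1, W_Q2, W_Q3 (question kanji -> answer)
w_cues = [0.3, 0.1, 0.2]        # W_C1, W_C2, W_C3 (question kanji -> fixation cue)

d_answer = distance_from_frequencies(w_answer)
d_cues = distance_from_frequencies(w_cues)
print(f"d_answer = {d_answer:.3f}, d_cues = {d_cues:.3f}")
```

In this toy case the cues are much more frequent partners of the question kanji than the answer is, so d_cues comes out smaller than d_answer. A neutral cue with zero frequency is rejected by the function, mirroring the fact that dcues is undefined in the Neutral condition.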
For each of the 16 questions, characterized by distinct danswer and dcues values, we conducted a regression analysis. This analysis evaluated the relationship between these values and the participants’ average accuracy and average reaction times. Note that the regression analysis included both danswer and dcues for the Fixation condition, but was limited to danswer for the Neutral condition. This regression was repeated 1000 times, employing bootstrapping for resampling each time, to generate 1000 sets of regression coefficients. The mean and the 95% confidence intervals were then calculated from the 1000 values, and the significant coefficients were identified.
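A bootstrapped regression of this kind can be sketched as follows. This is a simplified illustration, not the authors' code: it uses a single predictor (danswer, as in the Neutral condition), assumes that the 16 questions are the resampled units (the text does not state what was resampled), and uses percentile confidence intervals. The data are synthetic and the function names are hypothetical.

```python
import random

def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def bootstrap_slope(x, y, n_boot=1000, seed=0):
    """Mean slope and 95% percentile interval over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:  # degenerate resample: slope undefined, redraw
            continue
        slopes.append(ols_slope(xs, [y[i] for i in idx]))
    slopes.sort()
    lo = slopes[n_boot * 25 // 1000]
    hi = slopes[n_boot * 975 // 1000 - 1]
    return sum(slopes) / n_boot, (lo, hi)

# Synthetic data: 16 questions whose accuracy falls as d_answer grows.
data_rng = random.Random(1)
d_answer = [3.0 + 0.25 * i for i in range(16)]
accuracy = [0.9 - 0.12 * (d - 3.0) + data_rng.gauss(0, 0.03) for d in d_answer]

mean_slope, (ci_lo, ci_hi) = bootstrap_slope(d_answer, accuracy)
print(f"slope = {mean_slope:.3f}, 95% CI [{ci_lo:.3f}, {ci_hi:.3f}]")
```

A coefficient is treated as significant here when its 95% interval excludes zero, which is one common reading of the procedure described above.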
Contrast analysis
In all analyses comparing sample means, we performed two-sample t-tests, reporting the t-statistics and p-values. We also calculated Bayes factors (BF10) using the MATLAB function bf.m (bayesFactor toolbox: https://klabhub.github.io/bayesFactor), with a Cauchy prior scale of sqrt(2)/2 for the two-sample t-test. The effect size measure (Cohen’s d) was computed using the MATLAB function meaneffectsize.m.
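For readers without MATLAB, the t-statistic and Cohen's d can be sketched in plain Python. This sketch assumes an equal-variance (pooled) two-sample t-test and a pooled-standard-deviation Cohen's d; whether these match the exact options used in the MATLAB functions is an assumption. It omits the p-value and the Bayes factor, and the data are hypothetical.

```python
import math
from statistics import mean, variance

def two_sample_t_and_d(a, b):
    """Pooled-variance two-sample t-statistic, degrees of freedom, and Cohen's d."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    diff = mean(a) - mean(b)
    d = diff / math.sqrt(pooled_var)
    t = diff / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2, d

# Hypothetical per-participant accuracies (proportion correct out of 8 questions).
fixation = [0.25, 0.375, 0.5, 0.375, 0.25, 0.5]
neutral = [0.5, 0.625, 0.5, 0.75, 0.375, 0.625]

t, df, d = two_sample_t_and_d(fixation, neutral)
print(f"t({df}) = {t:.2f}, Cohen's d = {d:.2f}")
```

Under these assumptions t and d are linked by t = d·sqrt(n1·n2/(n1 + n2)). With two groups of 349 this factor is about 13.2, so a d of about –0.80 corresponds to a t of roughly –10.6.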
Regression analysis
To identify the distinct factors that contribute to the search process resulting in an Aha moment, we used a mixed-effects regression model to analyze the effects of danswer and dcues values on RAT performance, focusing on answer accuracy, reaction time, and Aha rate. For accuracy, we coded correct trials as 1 and incorrect trials as 0, then regressed accuracy against danswer and dcues values in the Fixation condition, and only danswer values in the Neutral condition. This analysis used a logistic regression model (via MATLAB function fitglme.m), incorporating random intercepts for participant labels. From this, we obtained coefficient estimates for danswer and dcues (in the Fixation condition) and for danswer (in the Neutral condition), along with corresponding 95% confidence intervals and p-values. For reaction time, analyzed only in correct trials, we regressed reaction time for each trial using a linear regression model (via fitlme.m), also including random intercepts for participant labels. For Aha rate, also analyzed only in correct trials, we coded Aha trials as 1 and No-Aha trials as 0, and then performed the same logistic regression analysis as was done for accuracy.
Simulation-based power analysis
To determine the minimum number of trials needed to achieve sufficient power for our regression analysis using a mixed-effects model, we conducted a simulation-based power analysis. The specific power we aimed to achieve was 80% with the smallest effect size of theoretical interest (SESOI) of 0.2, which has been identified as a commonly used threshold for a small but theoretically meaningful effect23.
For each simulation, we varied the number of trials from 100 to 1000, in increments of 100, and considered effect sizes ranging from 0.1 to 0.6, increasing by 0.05 each time. In each simulation, participants and question labels were randomly assigned. For each question, values for danswer and dcues were also assigned randomly, ranging from 0 to 1. The reaction time was then generated as a linear combination of danswer and dcues values, weighted by the set effect size, with added Gaussian noise (mean = 0, standard deviation = 0.5). We fitted a mixed-effects model, including random intercepts for participants as described in the earlier regression analysis. The significance of the fixed effects (i.e., the coefficients for danswer and dcues) was assessed at an alpha level of 0.05. We repeated this process for 100 simulations, and calculated empirical power as the proportion of simulations in which each coefficient was found to be significant. This sensitivity analysis resulted in two power plots, one for danswer and one for dcues. Supplementary Fig. 2 shows the power achieved at various trial numbers and effect sizes based on the simulation. For both danswer and dcues coefficients, detecting the SESOI of 0.2 with a power greater than 80% required a minimum of 600 trials.
In our regression analysis, we focused on two comparisons: the difference between the Fixation and Neutral conditions and the difference between Aha and No-Aha trials. Across all 349 participants, the total numbers of trials for each category were as follows: Fixation with Aha (674 trials), Fixation with No-Aha (2118 trials), Neutral with Aha (909 trials), and Neutral with No-Aha (1883 trials). Each of these counts exceeded the 600-trial requirement, indicating that our data was sufficiently powered for the regression analysis.
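The logic of a simulation-based power estimate can be sketched in a reduced form. This is a deliberately simplified illustration and not the procedure reported above: it uses a single predictor drawn uniformly from 0 to 1, ordinary least squares instead of a mixed-effects model (no random intercepts), and a normal approximation (|z| > 1.96) for significance. All function names are hypothetical.

```python
import math
import random

def slope_is_significant(n_trials, effect, rng, noise_sd=0.5, z_crit=1.96):
    """Simulate one dataset and test the OLS slope (normal approximation)."""
    x = [rng.random() for _ in range(n_trials)]               # predictor in [0, 1)
    y = [effect * xi + rng.gauss(0.0, noise_sd) for xi in x]  # simulated outcome
    mx, my = sum(x) / n_trials, sum(y) / n_trials
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n_trials - 2) / sxx)                # standard error of slope
    return abs(slope / se) > z_crit

def empirical_power(n_trials, effect, n_sims=100, seed=0):
    """Proportion of simulated datasets in which the slope is significant."""
    rng = random.Random(seed)
    hits = sum(slope_is_significant(n_trials, effect, rng) for _ in range(n_sims))
    return hits / n_sims

for n in (200, 600, 1000):
    print(n, empirical_power(n, effect=0.2))
```

With only 100 simulations per cell, each power estimate carries sampling error of several percentage points, so values near the 80% threshold should be read with that in mind.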
Experiment 2: TT-RAT
Participants
Behavioral data were collected on-site from 105 participants (64 male participants and 41 female participants, age = 22.5 ± 2.4 years, mean ± standard deviation). Note that these were the same 105 on-site participants as in Experiment 1. All participants first completed the FC-RAT followed by the TT-RAT on the same day, with a break in between.
Task and procedure
In TT-RAT, the participants were instructed to sequentially enter any candidate kanji that came to mind, regardless of whether it was correct (see Fig. 3). This procedure has been used to examine participants’ strategies to solve RAT20,24. The test was conducted through another custom-designed website. This version presented a distinct set of 16 RAT questions with smaller danswer values (see Supplementary Table 2). This modification was designed to increase the chances of finding a solution, making the exploration phase more meaningful. As this test aimed to investigate fixation effects during search, it provided only the fixation cues and excluded the neutral cues. The response time limit remained at 45 seconds, and participants were introduced to the procedure with a practice question. Candidates were entered one at a time into a text box. Incorrect entries would fade from the screen after 0.5 seconds, to avoid fixation on past attempts. A small counter next to the input box tracked the number of wrong candidate kanji provided for each question. Additionally, a timer below the input box reminded participants of the remaining time for solving the current question. Upon entering the correct kanji, the screen would update to notify success, and participants would then evaluate the intensity of their Aha experience using a 7-point Likert scale as in FC-RAT. As in FC-RAT, the TT-RAT webpage display was automatically cleared if a participant navigated away to another window.
Fig. 3. TT-RAT and the measurement of the exploration distance.
The sequence of events within a single trial where participants are given a question with fixation cues and prompted to enter any kanji that occurs to them, as depicted by thought processes T1, T2, T3, and T4. In this example, T4 represents the correct answer (A), after which participants will rate their Aha experience. The normalized frequency between T3 and its connected question kanji is exemplified and marked in green. The calculation of the average exploration distance Explore_dist is shown in the lower right corner. See the detailed description in Methods. The icon used in this figure was obtained from Iconfinder (https://www.iconfinder.com/) and is available under a free-to-use license.
Explore distance analysis
To characterize the search process before reaching the answer, we quantified the distance between each thought (Ti, where i = 1, 2,…) and the question kanji. For example, as depicted in Fig. 3, if the third thought, T3, is linked to Q1 and Q3 with respective weights in the corpus of WT3(1) and WT3(2), then the distance between T3 and the question, denoted as dT3, is determined as:
$$d_{T3} = -\frac{1}{2}\left[\log W_{T3}(1) + \log W_{T3}(2)\right] \qquad (3)$$
Or in a more generalized expression:
$$d_{Ti} = \mathrm{mean}\left\{-\log W_{Ti}(j)\right\}_{j} \qquad (4)$$
where mean{}j denotes the average over existing connections between Ti and Qj. Consequently, the average exploration distance for a search is defined as:
$$\mathrm{Explore\_dist} = \mathrm{mean}\left\{d_{Ti}\right\}_{i} \qquad (5)$$
where mean{}i represents the average across all thoughts during the search.
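The exploration distance can be sketched in a few lines. The sketch assumes the natural logarithm and a negated mean, consistent with the distance-to-answer metric. It also assumes that a thought with no corpus connection to any question kanji is skipped, a case the definition above does not cover. Function names and frequency values are hypothetical.

```python
import math

def thought_distance(weights):
    """Distance of one thought: negated mean log frequency over existing links.

    `weights` holds the normalized frequency from each question kanji to the
    thought; a value of 0 means no such 2-kanji word exists in the corpus.
    """
    linked = [w for w in weights if w > 0]
    if not linked:
        return None  # assumption: unconnected thoughts are left out
    return -sum(math.log(w) for w in linked) / len(linked)

def explore_dist(thought_weights):
    """Average distance across all thoughts in one search."""
    distances = [thought_distance(w) for w in thought_weights]
    distances = [d for d in distances if d is not None]
    return sum(distances) / len(distances) if distances else None

# Hypothetical search with three thoughts; columns are links to Q1, Q2, Q3.
search = [
    [0.10, 0.0, 0.01],   # T1 linked to Q1 and Q3
    [0.0, 0.001, 0.0],   # T2 linked to Q2 only
    [0.05, 0.02, 0.0],   # T3 linked to Q1 and Q2
]
print(f"Explore_dist = {explore_dist(search):.3f}")
```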
Simulation model
Model design and simulation
We proposed a simulation model that mimics the search process involved in solving RAT with fixation cues, utilizing the established corpus (see Fig. 4). The model is composed of three modules: Input, Search, and Evaluation, with three parameters: (1) the fixation factor (Fix_factor), which represents the likelihood of being fixated by incorrect thoughts, (2) the de-fixation factor (Defix_factor), which represents the ease of suppressing incorrect thoughts, and (3) the exploration capacity (Explore_cap), which represents the number of potential solutions available for selection.
Fig. 4. Simulation model to solve RAT with fixation.
The simulation proposed to mimic the search process involved in solving RAT with fixation cues. See the details of its operation in Methods.
For each simulation, the Input module receives two types of input: Cue and Question. Each input is represented by a vector of 3819 dimensions (matching the number of kanji in the corpus shown in Fig. 2A). The Cue vector is predominantly composed of zeros, with the exception that the elements corresponding to the fixation cues C1, C2, and C3 are assigned values of 1. Similarly, the Question vector consists mainly of zeros, except for the elements representing the question kanji Q1, Q2, and Q3, which are set to 1. The Search module receives the Question as input and generates Activity, which is another vector of 3819 dimensions. This Activity vector is the result of multiplying the Corpus matrix (3819 by 3819) with the Question vector (3819 by 1), which represents how much each kanji is activated by Q1, Q2, and Q3. The elements in the Activity vector that correspond to C1, C2, and C3 are then scaled by Fix_factor, which represents the additional activation attributed to the fixation cues. Next, the kanji with top activation values will be selected, where the number of top candidates is determined by Explore_cap. If Explore_cap equals 1, the kanji with the highest activation value is selected as the Thought, which will be evaluated by the Evaluation module. If Explore_cap is set to 5, the top 5 kanji in terms of activation values will each have an equal probability of being chosen as the Thought. The Evaluation module then evaluates the Thought based on the corpus. If the Thought is the answer, the simulation will end. If not, the corresponding activation value in the Activity vector is reduced by Defix_factor, thereby reducing the likelihood that the incorrect Thought is activated in subsequent searches. The search will continue, with each step involving the suppression of an incorrect Thought, until the correct Thought is ultimately selected.
We considered 250 model configurations, combining 5 Fix_factor values (ranging from 10 to 50 in increments of 10), 5 Defix_factor values (ranging from 0.2 to 1 in increments of 0.2), and 10 Explore_cap values (incrementing from 1 to 10). For each configuration, we conducted 7900 simulations: 100 runs for each of the 79 sets of RAT questions and fixation cues from Terai et al.13, with each run incorporating a stochastic thought selection process.
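The search loop of the model can be sketched on a toy corpus. This is an illustrative reading of the description, not the authors' implementation, and it rests on three assumptions. First, "reduced by Defix_factor" is taken to mean multiplying the activation by (1 − Defix_factor). Second, the Evaluation module is reduced to a comparison with the known answer. Third, only kanji with non-zero activation are candidates. The corpus uses placeholder labels and made-up frequencies rather than real kanji.

```python
import random

def solve_rat(corpus, questions, cues, answer, fix_factor=10.0,
              defix_factor=0.4, explore_cap=3, rng=None, max_steps=10_000):
    """Return the sequence of Thoughts produced until the answer is selected."""
    rng = rng or random.Random()
    # Search module: activation of each second kanji, summed over Q1..Q3.
    activity = {}
    for q in questions:
        for second, w in corpus.get(q, {}).items():
            activity[second] = activity.get(second, 0.0) + w
    # Fixation cues receive additional activation.
    for c in cues:
        if c in activity:
            activity[c] *= fix_factor
    thoughts = []
    for _ in range(max_steps):
        ranked = sorted(activity, key=activity.get, reverse=True)
        thought = rng.choice(ranked[:explore_cap])  # uniform among top candidates
        thoughts.append(thought)
        if thought == answer:                       # Evaluation module (simplified)
            break
        activity[thought] *= 1.0 - defix_factor     # suppress the incorrect Thought
    return thoughts

# Toy corpus: corpus[first_kanji][second_kanji] = normalized frequency.
toy_corpus = {
    "Q1": {"C1": 0.30, "A": 0.02, "X1": 0.05},
    "Q2": {"C2": 0.20, "A": 0.01, "X2": 0.04},
    "Q3": {"C3": 0.25, "A": 0.03, "X1": 0.02},
}
path = solve_rat(toy_corpus, ["Q1", "Q2", "Q3"], ["C1", "C2", "C3"], "A",
                 rng=random.Random(0))
print(len(path), "thoughts, ending with", path[-1])
```

With Explore_cap set to 1 and Defix_factor set to 1 the sketch becomes deterministic: it visits the three cues in order of activation, then the stronger distractor, and only then the answer. This illustrates how fixation cues lengthen the search.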
Exploration capacity vs distance
It is important to note that in the model, exploration capacity represents the number of potential solutions available, whereas exploration distance in Experiment 2 (as defined in Eq. 5) reflects the range of exploration within the corpus space. This modification was made to simplify model implementation, as these two measures are effectively equivalent. Intuitively, access to a broader set of potential solutions increases the likelihood of selecting a thought that is not directly triggered by the question, resulting in a greater measured distance between the thought and the question. Conversely, if solutions distant from the question are reachable within the solution space, it is also likely that closer solutions are accessible.
To verify this, we examined the relationship between exploration distance and capacity for the 79 RAT questions and fixation cues. For each question, we first calculated the Activity vector (within the Search module shown in Fig. 4). Next, we measured dTi (from Eq. 4) for each thought available under various combinations of Explore_cap value (1 to 30) and Fix_factor value (ranging from 10 to 50 in increments of 10). Under each Fix_factor value, we then plotted the average dTi value against the Explore_cap value for each question. The results are shown in Supplementary Fig. 3, confirming a linear relationship between exploration capacity and distance. Moreover, in our model, the time required to reach the solution is determined by the number of non-solutions that are closer to the question than the solution itself, denoted as Nnonsol, rather than being directly determined by the value of danswer. To verify their equivalence, we examined the relationship between log(Nnonsol) and danswer (see Supplementary Fig. 4). Again, a one-to-one relationship between capacity and distance was confirmed.
Results
Experiment 1: FC-RAT
Fixation effects in FC-RAT
The accuracies of all 349 participants under the Fixation and Neutral conditions are shown in Fig. 5A. The accuracy in the Fixation condition (39.5 ± 1.8%, mean ± 95% confidence interval, n = 349 participants) was significantly lower than that in the Neutral condition (53.8 ± 2.0%) (t(696)= –10.61, p = 1.74e-24, Bayes factor BF10 = 1.87e21, two-sample t-test; Cohen’s d = –0.80, 95% CI [–0.96, –0.65]) (Fig. 5B). Among the incorrect trials, the rate of no answer (failure to respond within the time limit) was significantly higher in the Fixation condition (33.1 ± 2.3%) compared to the Neutral condition (21.8 ± 1.9%) (t(696) = 7.54, p = 1.45e−13, BF10 = 4.02e10; d = 0.57, 95% CI [0.42, 0.72]), while no statistically significant difference was observed in the rate of wrong answer between the Fixation condition (27.2 ± 2.2%) and the Neutral condition (24.3 ± 2.1%) (t(696) = 1.88, p = 0.06, BF10 = 0.47; d = 0.14, 95% CI [–6.53e-4, 0.29]) (Fig. 5C). This indicates that the fixation cues disrupted the search for the solution by making it harder to find.
Fig. 5. Fixation effects in FC-RAT.
A The mean accuracy for all participants. Each participant’s average accuracies in the Neutral and Fixation conditions are represented by a cross and a circle, respectively. If the accuracy is lower in the Fixation condition, this fixation-induced decrease is highlighted by a red vertical line, while a blue vertical line denotes the opposite effect. These data are from n = 349 participants. B The fixation effects on the accuracy. The means and the corresponding 95% confidence intervals and a violin plot, with individual points (black circles) and outliers (black crosses), are shown. For the comparison, the p-value, Bayes factor BF10, and Cohen’s d are shown on top. C The fixation effects on the no answer rate (left) and wrong answer rate (right). D The mean reaction time (RT) for all participants, following the same presentation as in panel A. E The fixation effects on RT. F RT under trials with correct answers and wrong answers. G RT under trials with Aha and no Aha. H The fixation effect on the Aha rate. I The regression coefficients of accuracy for danswer and dcues are shown separately for the Fixation condition (blue) and the Neutral condition (red). The means and the corresponding 95% confidence intervals are shown. The p-value for each coefficient is displayed next to it, using the corresponding color. Note that no regression was conducted for dcues in the Neutral condition, where dcues is not applicable. J The regression coefficients of RT for danswer and dcues are shown for the Aha trials (left) and the No-Aha trials (right). The same presentation as in panel I is used. K The regression coefficients of the Aha rate for danswer and dcues. The same presentation as in panel I is used.
The reaction times of all 349 participants are shown in Fig. 5D. Reaction time in the Fixation condition (18.5 ± 0.6 sec) was significantly longer than in the Neutral condition (14.7 ± 0.6 sec) (t(695) = 8.65, p = 3.61e-17, BF10 = 1.29e14; d = 0.65, 95% CI [0.50, 0.81]) (Fig. 5E), confirming that fixation cues made finding the solution more challenging. Additionally, reaction times were significantly shorter for trials with correct answers (12.4 ± 0.4 sec) compared to those with incorrect answers (24.8 ± 1.0 sec) (t(664) = 24.34, p = 5.30e−94, BF10 = 1.80e91; d = 1.89, 95% CI [1.70, 2.07]) (Fig. 5F). This suggests that incorrect answers were often provided close to the time limit, implying forced guessing under time pressure. Furthermore, among the correct trials, reaction times were significantly shorter for those with an Aha moment (Aha) (12.5 ± 0.59 sec) compared to those without an Aha moment (No-Aha) (14.1 ± 1.0 sec) (t(586) = 3.05, p = 0.0025, BF10 = 7.94; d = 0.25, 95% CI [0.09, 0.42]) (Fig. 5G). This suggests that the Aha moment was associated with a distinct type of search process.
Surprisingly, no statistically significant difference in Aha rate was found between the Fixation condition (59.8 ± 4.0%) and the Neutral condition (61.8 ± 3.8%) (t(685)= –0.57, p = 0.57, BF10 = 0.10; d= –0.04, 95% CI [–0.19, 0.11]) (Fig. 5H). This indicates that while fixation cues reduced both the overall likelihood and speed of finding a solution, they did not affect the probability of arriving at a solution through insight. This further suggests that the degree of fixation overcome to arrive at a solution is not a determining factor for the occurrence of insights.
Average accuracy and reaction time across the 16 RAT questions under different conditions are shown in Supplementary Fig. 5. As designed (see Methods), increasing difficulty was observed, indicated by lower accuracy and longer reaction times. Additionally, fixation effects were not limited to specific questions.
Unexpected impacts of fixation cues
Our next step was to conduct regression analysis to identify specific factors that contributed to the search process resulting in an Aha moment. For accuracy, the regression coefficient for danswer was significantly smaller than zero in both the Fixation condition (Beta = –0.48, 95% CI [–0.56, –0.40], t(2789) = –11.35, p = 3.06e−29, logistic regression analysis, see details in Methods) and the Neutral condition (Beta = –0.34, 95% CI [–0.42, –0.26], t(2790) = –8.66, p = 7.78e−18) (Fig. 5I). As expected, this indicated that a lower accuracy was associated with a longer distance between the question and answer. On the other hand, the regression coefficient for dcues was significantly smaller than zero for the Fixation condition (Beta = –0.17, 95% CI [–0.24, –0.10], t(2789) = –4.75, p = 2.11e−6), indicating that a lower accuracy was associated with a longer distance between the question and cues. This was somewhat unexpected, as we had anticipated that closer cues would exert a stronger fixation effect and lead to reduced accuracy, which would be reflected by a positive coefficient.
For reaction time during the correct trials, we examined its relationship with danswer and dcues separately for Aha and No-Aha trials. The regression coefficient for danswer was significantly greater than zero in the Aha trials (Fixation: Beta = 1.62, 95% CI [0.99, 2.25], t(671) = 5.01, p = 6.91e−7; Neutral: Beta = 1.27, 95% CI [0.78, 1.76], t(907) = 5.07, p = 4.89e−7; linear regression analysis, see details in Methods) and No-Aha trials (Fixation: Beta = 1.83, 95% CI [1.21, 2.44], t(2115) = 5.80, p = 7.41e−9; Neutral: Beta = 1.85, 95% CI [1.16, 2.53], t(1881) = 5.29, p = 1.35e−7) (Fig. 5J). As expected, this indicated that a slower reaction was associated with a longer distance between the question and answer. On the other hand, the reaction time was significantly and positively related to dcues for Aha trials (Beta = 1.25, 95% CI [0.73, 1.78], t(671) = 4.67, p = 3.71e−6), but significantly and negatively related to dcues for No-Aha trials (Beta = –1.22, 95% CI [–1.76, –0.68], t(2115) = –4.44, p = 9.49e−6). While the negative coefficient in No-Aha trials was expected, that is, closer cues had stronger fixation effects, which led to longer reaction times, the positive coefficient observed in Aha trials was unexpected.
For the Aha rate during correct trials, the regression coefficient for danswer was significantly less than zero in both the Fixation condition (Beta = –0.40, 95% CI [–0.49, –0.30], t(2789) = –8.26, p = 2.18e−16, logistic regression analysis; see details in Methods) and the Neutral condition (Beta = –0.25, 95% CI [–0.34, –0.17], t(2790) = –6.04, p = 1.77e−9) (Fig. 5K). This indicated that Aha moments occurred more frequently with easier questions, suggesting that an Aha moment is likely to arise when the solution is reached quickly. In addition, the regression coefficient for dcues was significantly less than zero in the Fixation condition (Beta = –0.13, 95% CI [–0.21, –0.05], t(2789) = –3.24, p = 1.20e−3). This indicated that Aha moments occurred more frequently when closer fixation cues were present, suggesting that an Aha moment is also likely to arise when a strong impasse is resolved (see Discussion).
We also tested regressions with different combinations and forms of danswer and dcues (see Supplementary Table 3). For example, we considered the ratio danswer/dcues as a factor in determining question difficulty, and the ratio dcues/danswer as a factor for the fixation effect. We also considered the interaction between danswer and dcues (denoted as danswer*dcues). The fitting results indicated that danswer and dcues best explained accuracy and Aha rate, while danswer/dcues and dcues better explained reaction time for both Aha and No-Aha conditions. This suggests that, for the Fixation condition, the relative distance between the answer and the cues is a better indicator of question difficulty. However, this does not affect the unexpected effect observed regarding dcues.
In summary, our analysis revealed two unexpected results: (1) dcues was negatively correlated with accuracy, and (2) dcues was correlated with reaction times in opposite ways for Aha and No-Aha trials. These results suggest that closer fixation cues not only help in solution discovery but also accelerate the insightful search process involving an Aha moment.
Experiment 2: TT-RAT
Exploration distance vs Aha moment
We next analyzed the TT-RAT to examine the relationship between the exploration distance and the Aha moment. The average accuracy was 51.8 ± 14.3% (mean ± std, n = 105 participants) (Fig. 6A), exceeding the 39.1% observed in the original setup (as shown in Fig. 5B). This improvement is deliberate, resulting from the use of new questions with smaller danswer values (see details in Methods). Additionally, for the trials answered correctly, participants recorded an average of 1.82 ± 0.59 thoughts per question, not including the answer itself (see Fig. 6B). For trials with no answers or wrong answers, participants reported an average of 3.61 ± 1.12 thoughts per question, which was significantly higher than the number reported for trials with correct answers (t(104) = 20.90, p = 5.13e−39, paired t-test, n = 105 participants). This suggests that participants experienced greater difficulty while searching for solutions in these trials.
Fig. 6. Exploration distance vs Aha moment in TT-RAT.
A The mean accuracy for all participants. These data are from n = 105 participants. B The number of thoughts for all participants. The mean and the standard deviation are shown. Gray circles represent the data from individual trials. C The correlation between Explore_dist and the Aha scale (left). Each circle represents a trial from a participant. For visualization purposes, a random jitter is added to the Aha scale. The Spearman’s rank correlation coefficient and the corresponding p-value are shown, and a fitting red line is shown for illustrative purposes. The Explore_dist in No-Aha and Aha trials (right). The mean and the corresponding 95% confidence interval and a violin plot, with individual points (black circles) and outliers (black crosses), are shown. For the comparison, the p-value, Bayes factor BF10, and Cohen’s d are shown on top. D The correlation between the number of thoughts taken to reach the solution and the Aha scale (left), and the number of thoughts taken in No-Aha and Aha trials (right). The same presentation is used as in panel C.
Subsequently, we performed a correlation analysis between the Explore_dist (derived from Eq. 5) for each correct trial and the corresponding Aha ratings on a scale from 1 to 7. The result showed a significant positive correlation (r = 0.139, 95% CI [0.070, 0.205], p = 3.84e−5, Spearman’s rank correlation coefficient), suggesting that stronger Aha experiences were linked to more extensive exploration distances (Fig. 6C, left). We also classified trials with an Aha rating of 5 or higher as Aha trials (yielding 481 trials), and those with a rating of 4 or lower as No-Aha trials (yielding 389 trials), and showed that the Explore_dist for Aha trials was 5.7 ± 0.1 (mean ± 95% confidence interval), significantly greater than 5.4 ± 0.1 for No-Aha trials (t(868) = 3.53, p = 0.0004, BF10 = 34.22, two-sample t-test; d = 0.24, 95% CI [0.11, 0.37]) (see Fig. 6C, right). Furthermore, we analyzed the correlation between the number of thoughts per correct trial and the corresponding Aha ratings, finding no statistically significant relationship (r = –0.002, 95% CI [–0.066, 0.065], p = 0.95). Similarly, no statistically significant difference in the number of thoughts was observed between Aha and No-Aha trials (t(868) = 0.17, p = 0.87, BF10 = 0.08; d = 0.01, 95% CI [–0.12, 0.15]) (see Fig. 6D). These findings suggest that the Aha experience is linked with greater exploration distances rather than with the quantity of thoughts.
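The two analysis steps in this paragraph, a rank correlation between Explore_dist and the Aha rating and a split of trials at a rating of 5, can be sketched as follows. This is an illustrative sketch rather than the study's scripts: the helper names are hypothetical, Explore_dist values are taken as given (Eq. 5 is not reproduced here), and the example numbers are made up.

```python
from statistics import mean

AHA_THRESHOLD = 5  # ratings of 5 or higher count as Aha trials (as stated in the text)


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = tied_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def split_by_aha(explore_dist, aha_ratings, threshold=AHA_THRESHOLD):
    """Split Explore_dist values into (Aha, No-Aha) lists by rating."""
    aha = [d for d, r in zip(explore_dist, aha_ratings) if r >= threshold]
    no_aha = [d for d, r in zip(explore_dist, aha_ratings) if r < threshold]
    return aha, no_aha


# Made-up illustrative values, not data from the study.
explore_dist = [4.8, 5.1, 5.3, 5.6, 5.9, 6.2]
aha_ratings = [2, 3, 4, 5, 6, 7]
rho = spearman(explore_dist, aha_ratings)
aha, no_aha = split_by_aha(explore_dist, aha_ratings)
```

Because the Aha scale has only seven levels, ties are frequent, which is why the sketch assigns average ranks to tied values before correlating.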
Simulation model
Model explanation of distinctive fixation effect in Aha
An example of the search process is shown in Fig. 7A. With Explore_cap set to 1 (depicted in the left column), the search is deterministic, with the activation of incorrect Thoughts being progressively reduced through de-fixation until the correct answer emerges as the most activated kanji. When Explore_cap is increased to 5, the selection process for the Thought becomes stochastic, drawing from the top 5 activated kanji. The middle and right columns of Fig. 7A depict examples from two separate runs, in which the correct answer emerged as a candidate for several steps prior to its final selection in both instances. From the 1,975,000 simulations (the product of 250 model variations and 7900 runs), we analyzed the effects of the three parameters on reaction time, defined as the number of steps needed to arrive at the answer. As expected, reaction time lengthened with increased fixation, and shortened with stronger de-fixation and a greater exploration capacity (see Fig. 7B).
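The search loop described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the published model: fixation is assumed to multiply the cue activations by Fix_factor once at the start, de-fixation is assumed to scale a rejected Thought's activation by (1 − Defix_factor), and the Thought is assumed to be drawn uniformly from the top Explore_cap items. The function name `run_search` and the toy activation values are hypothetical.

```python
import random


def run_search(activity, answer, cues, fix_factor, defix_factor, explore_cap,
               rng, max_steps=1000):
    """Return the number of steps needed to select `answer`, or None if not found."""
    act = list(activity)
    # Fixation (assumed form): constant amplification of the cue activations.
    for c in cues:
        act[c] *= fix_factor
    for step in range(1, max_steps + 1):
        # Candidate thoughts: the explore_cap most activated items.
        ranked = sorted(range(len(act)), key=lambda i: act[i], reverse=True)
        candidates = ranked[:explore_cap]
        # With explore_cap == 1 there is a single candidate, so the search is deterministic.
        thought = rng.choice(candidates)
        if thought == answer:
            return step
        # De-fixation (assumed form): suppress the rejected thought.
        # defix_factor == 1 removes it completely.
        act[thought] *= (1.0 - defix_factor)
    return None


# Toy example with five items; index 0 plays the role of a fixation cue, index 2 the answer.
toy_activity = [0.9, 0.8, 0.5, 0.4, 0.3]
steps_deterministic = run_search(toy_activity, answer=2, cues=[0], fix_factor=20,
                                 defix_factor=0.4, explore_cap=1, rng=random.Random(0))
steps_stochastic = run_search(toy_activity, answer=2, cues=[0], fix_factor=20,
                              defix_factor=0.4, explore_cap=3, rng=random.Random(0))
```

In this toy setting the three parameters play the roles described in the text: a larger `fix_factor` keeps the cue on top for more steps, a larger `defix_factor` removes rejected thoughts faster, and a larger `explore_cap` lets the answer be sampled before it becomes the single most activated item.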
Fig. 7. Simulation model behavior.
A Example behavior for a specific question (Question #75 in Supplementary Table 1) and parameters (Fix_factor = 20 and Defix_factor = 0.4) is shown when Explore_cap = 1 (left column) and when Explore_cap = 5 (one run shown in the middle column and another run shown in the right column). For Activity (top row), the activation of each kanji over simulation steps is shown as a curve. For Thought (middle row), the candidate thoughts at each simulation step are shown as circles, and the selected thought is marked as a red asterisk. The question kanji, the cue kanji, and the answer kanji are indicated by horizontal black, blue, and red lines, respectively. The y-axis corresponds to the indices of kanji in the corpus. The step-by-step progression of thought toward reaching the answer is shown through the activations of Thought vs Answer (bottom row). The activations of Thought and Answer are represented in blue and red, respectively. B The relationships between log(RT) and parameters Fix_factor (top), Defix_factor (middle), and Explore_cap (bottom). The means and the corresponding 95% confidence intervals are shown. These data are from n = 1,975,000 simulations (250 model variations, each with 7900 runs).
Next, we examined whether the model can explain the two unexpected results from FC-RAT. First, we examined the correlations between dcues and the reaction time under different parameters. To achieve this, for each model configuration, we first computed the correlation coefficient between the dcues of the 79 RAT sets and their associated average reaction time (derived from 100 runs). This yielded 250 correlation coefficients, one for each model configuration. We then analyzed these coefficients across different Fix_factor values, disregarding the other two parameters. No statistically significant correlation was found across Fix_factor (Fig. 8A). Similarly, no statistically significant correlation was found across Defix_factor, except for Defix_factor = 1 (Fig. 8B). At this setting, any incorrect Thought is completely suppressed and will not be selected in subsequent steps.
Fig. 8. Model explanation of distinctive fixation effects in Aha and accuracy.
A Effects of Fix_factor on the correlation between dcues and log(RT). For each Fix_factor value, the correlation coefficient from each model configuration is marked by a cross, and their mean and the corresponding 95% confidence interval are shown in blue. These data are from n = 1,975,000 simulations. B Effects of Defix_factor on the correlation between dcues and log(RT). The same presentation is used as in (A). Correlation coefficients significantly different from zero are marked by a red asterisk. C Effects of Explore_cap on the correlation between dcues and log(RT). The same presentation is used as in (B). D Examples of the negative correlation between dcues and log(RT) found when Explore_cap = 1 (left) and the positive correlation found when Explore_cap = 8 (right). Each cross represents the dcues for a question and the corresponding log(RT) from a model configuration. The correlation, as shown in (C), was calculated from 79 questions for each model configuration and is illustrated by the red fitting line. Note that we calculated the correlation coefficient using Spearman’s rank correlation coefficient, while the fitting line is derived from linear regression and is included solely for illustrative purposes. E The mean reaction time for each question under each Explore_cap (left). The color represents the log(RT) value. For each question, the shortest reaction time across Explore_cap is indicated by a black circle. The optimal Explore_cap for each question is plotted against the question’s difficulty, danswer (right). Each circle represents a question. The Spearman’s rank correlation coefficient and the corresponding p-value are shown, and a fitting red line is shown for illustrative purposes. F Effects of the step threshold on the correlation between dcues and accuracy. For each threshold value, the correlation coefficient from each model configuration is marked by a cross, and their mean and the corresponding 95% confidence interval are shown in blue. Correlation coefficients significantly different from zero are marked by a red asterisk. G Effects of the time limit (an analytically determined threshold) on the correlation between dcues and accuracy from crowdsourced behavioral data from Experiment 1 (n = 349 participants).
For Explore_cap, significant negative correlations were found for smaller values (1 to 3), while significant positive correlations were observed for larger values (4 and above) (Fig. 8C). The actual reaction times and their corresponding dcues values for Explore_cap = 1, which showed a negative correlation, and for Explore_cap = 8, which showed a positive correlation, are presented in Fig. 8D. These contrasting correlations are similar to the opposing coefficients observed in Fig. 5J, suggesting that No-Aha trials, which show a negative correlation between dcues and reaction time, are associated with a smaller exploration capacity, whereas Aha trials, which show a positive correlation, correspond to a greater exploration capacity.
Our model not only suggests that Aha moments are associated with a greater exploration capacity, allowing for a broader range of potential solutions to be considered, but it also suggests the existence of an optimal exploration capacity for each problem. Specifically, a greater exploration capacity is not always preferable. Intuitively, if the problem is simple and the answer lies among the highly activated thoughts, an excessively large exploration capacity can introduce distractions and prolong the search. Figure 8E depicts the optimal exploration capacity, which is the value of Explore_cap that resulted in the shortest reaction time, across all 79 question and cue sets. We observed that the optimal exploration capacity had a significant correlation with the difficulty of the problem (as measured by danswer) (r = 0.637, 95% CI [0.464, 0.773], p = 4.63e−10, Spearman’s rank correlation coefficient).
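As a small illustration of how an optimal exploration capacity can be read off from simulation output, the sketch below picks, for each question, the Explore_cap with the shortest mean reaction time. The function name, the tie-breaking rule, and the step counts are assumptions made for illustration and are not taken from the study.

```python
def optimal_explore_cap(mean_rt_by_cap):
    """Return the Explore_cap value with the shortest mean reaction time.

    Ties are resolved in favor of the smallest Explore_cap (an assumption).
    """
    return min(sorted(mean_rt_by_cap), key=lambda cap: mean_rt_by_cap[cap])


# Made-up mean step counts per Explore_cap for two hypothetical questions.
mean_rt = {
    "easy_question": {1: 3.0, 2: 4.5, 4: 7.0, 8: 12.0},
    "hard_question": {1: 60.0, 2: 41.0, 4: 25.0, 8: 30.0},
}
best_cap = {q: optimal_explore_cap(rts) for q, rts in mean_rt.items()}
```

With these invented numbers the easy question is solved fastest with the narrowest search and the hard question with a wider one, which is the qualitative pattern the paragraph describes.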
Model explanation of distinctive fixation effect in accuracy
We then turned our focus to the other unexpected behavioral finding, which was the negative correlation observed between dcues and accuracy (shown in Fig. 5I). To explore this, we set various thresholds for the number of steps required in the model, ranging from 10 to 200 steps. Runs that required steps exceeding the threshold were categorized as incorrect trials. We then computed the correlation coefficient between the dcues of the 79 RAT sets and their associated accuracy (derived from 100 runs). Significant positive correlations were found for smaller thresholds (below 50 steps), while significant negative correlations were observed for larger thresholds (above 100 steps) (Fig. 8F). This result suggests that the negative correlation observed in the crowdsourced behavioral data emerged when sufficient time was available for problem-solving, corresponding to scenarios with higher threshold values. Furthermore, this finding predicts that with a more restricted time limit (smaller threshold values), the correlation between dcues and accuracy would turn positive.
This model prediction was supported by the crowdsourced behavioral data from FC-RAT. By setting thresholds ranging from 5 seconds to the actual time limit of 45 seconds, we re-categorized trials with reaction times exceeding the threshold as analytically incorrect. The accuracy for each question was then calculated across participants, and Pearson’s correlation coefficient was computed between the accuracies across questions and their corresponding dcues values. The results are shown in Fig. 8G.
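The re-categorization step can be sketched as follows: correct trials slower than an analytic time limit are counted as incorrect, per-question accuracy is recomputed, and the accuracies are correlated with dcues across questions. This is a minimal sketch with hypothetical function names and made-up trials, not the study's analysis script.

```python
from statistics import mean


def accuracy_under_limit(trials, time_limit):
    """Per-question accuracy when correct trials slower than time_limit count as incorrect.

    trials: iterable of (question_id, is_correct, reaction_time_seconds).
    """
    outcomes = {}
    for question_id, is_correct, rt in trials:
        solved_in_time = is_correct and rt <= time_limit
        outcomes.setdefault(question_id, []).append(1.0 if solved_in_time else 0.0)
    return {q: mean(v) for q, v in outcomes.items()}


def pearson(x, y):
    """Pearson correlation coefficient; raises if either input has zero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / (var_x * var_y) ** 0.5


# Made-up trials and cue distances, for illustration only.
trials = [
    ("q1", True, 4.0), ("q1", True, 30.0),
    ("q2", True, 12.0), ("q2", False, 45.0),
    ("q3", True, 40.0), ("q3", True, 8.0),
]
d_cues = {"q1": 3.0, "q2": 5.0, "q3": 6.0}

acc_short = accuracy_under_limit(trials, time_limit=10.0)
acc_full = accuracy_under_limit(trials, time_limit=45.0)
questions = sorted(d_cues)
r_full = pearson([d_cues[q] for q in questions], [acc_full[q] for q in questions])
```

Repeating the last two lines for each threshold between the shortest limit and the actual time limit would give one correlation coefficient per threshold, which is the quantity plotted against the time limit.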
These findings suggest that under a short time constraint, where only easy questions are solvable, nearby fixation cues diminish accuracy. This seems logical, as cues clustered around the answer may sidetrack the search. In contrast, with a longer time limit, which allows difficult questions to be tackled, distant fixation cues are more likely to reduce accuracy. This makes sense because the answers to difficult questions are located farther away, which requires a wider exploration range and consequently increases the likelihood of encountering distant fixation cues.
Discussion
We theorize that the process of arriving at insightful solutions involves a distinctive dynamic between de-fixation and exploration. By integrating two variations of RAT with a simulation model, we demonstrate that while de-fixation is critical for problem-solving, it is not a determining factor in generating insights. Instead, the signature of an insightful search process is a broader scope of exploration, enabling access to a larger number of potential solutions or more distant associations. While our model does not seek to precisely optimize and match behavioral data, it effectively elucidates three key aspects of the crowdsourced behavior: the contrasting roles of fixation in insightful versus non-insightful experiences (Fig. 5J), the impact of fixation on problem-solving accuracy (Fig. 5I), and the effects of fixation under varying time constraints (Fig. 8G).
Insightful search process versus Aha experience
Our study focuses on the unique search process leading to insightful solutions, without explicitly modeling the generation of the Aha experience. Our model aligns with the associationistic view that knowledge is encoded as a network of interconnected concepts (a knowledge graph), and problems are solved by retrieving the correct associative links10. However, rather than the perspective that insightful search is not fundamentally distinct but simply the retrieval of an unlikely association9,25, we argue that insight involves retrieving an unlikely association via an optimal exploration distance, which we demonstrate tends to be longer for difficult questions. In contrast, identifying an unlikely association through numerous small-range explorations is less likely to represent insight and more indicative of an analytical approach. We use the subjective Aha experience as a behavioral indicator for insight, since retrieving an improbable association with more jumpy long-range exploration could enable faster problem-solving than initially expected. This interpretation is supported by findings that Aha experiences correlate with a sudden decrease in the problem’s perceived difficulty26,27. However, faster solving times are not consistently linked to the Aha experience4,28. Moreover, subjective Aha experiences can also accompany incorrect solutions, known as false insights29, and may even bias the evaluation of solutions30. Therefore, an Aha experience alone may not definitively indicate true insight. Other phenomenological and behavioral variables, such as the suddenness, certainty, and pleasure associated with the experience, should also be considered29.
Beyond tasks with definite solutions, such as the RAT, a few studies have investigated insight in the context of divergent thinking tasks, where solutions lack specific criteria for correctness16,31. However, the search for insight in these open-ended problems is still defined primarily by the Aha experience rather than the attributes of the solutions themselves (e.g., fluency, novelty, feasibility). In other words, insight research tends to prioritize the subjective experience during problem-solving over evaluating the insightfulness of the generated solutions.
Key elements in our modeling approach
In the Input module of our model, all the question kanji are delivered simultaneously. Although the brain is capable of processing multiple information streams simultaneously, this parallel search process, despite producing one thought at a time, differs from the conscious approach we typically take, where potential solutions are often linked to only one of the question words24. Furthermore, although our model does not explicitly focus on a single question kanji at a time, it still initiates searches with words strongly linked to just one of the question kanji. To model a more realistic search process, it is necessary to understand the mechanisms that determine the selection of the question kanji (or English word) for the initial search and the criteria for switching to the others in subsequent search iterations. One potential mechanism is the spreading activation within semantic memory, where a word in working memory activates its corresponding representation in semantic memory and propagates activation along associative links to related concepts32. This process can represent a parallel and unconscious search mechanism.
In the Search module, the fixation process is represented as a constant amplification on the Activity vector. This design choice stems from the fact that the fixation cues were continuously visible on the screen throughout the questions in our FC-RAT and TT-RAT. This contrasts with the approach used in other studies, such as those involving misleading RAT problems, where fixation was induced by embedding it within two of the three question words33. Even when fixation is not directly delivered through stimuli, the fixation effect can still be modeled as a constant influence, as it is embedded in long-term memory. Fixation has also been introduced through recently-learned word pair associations21,22. In such cases, the fixation effect would need to incorporate a forgetting factor and dissipate over time.
Regarding de-fixation, another key element in the Search module, one potential underlying mechanism is retrieval-induced forgetting, an inhibitory process that suppresses non-target items to reduce competition and facilitate the retrieval of the target item34. Studies have demonstrated that individuals exhibiting more pronounced retrieval-induced forgetting tend to outperform others in overcoming fixation when solving RAT21,22. In computational models of RAT solving, de-fixation has been modeled as complete suppression of incorrect solutions (equivalent to Defix_factor = 1), and the capacity to solve RAT is attributed to the permitted number of retrieval attempts32. However, our findings indicate that the number of intermediate thoughts prior to arriving at the correct solution is not associated with insight (as shown in Fig. 6D). This introduces an additional variable, exploration capacity, into the equation for understanding proficiency in solving RAT problems. It is important to note that exploration capacity may be supported by the flexibility of brain dynamics, such as the number of brain states and state transitions. Moreover, it can serve as a predictive factor for insight, even before the problem is presented35. Another potential neural mechanism for adjusting exploration capacity or distance involves the relative engagement of the right and left brain hemispheres in coarse and fine semantic coding, respectively. The right hemisphere supports broader, higher-level concepts, and its activation facilitates insightful problem-solving14,36.
De-fixation has been found to be closely related to ideational fluency, a core component of creative thinking characterized by the ability to generate numerous ideas in response to open-ended prompts37. This suggests that the ability to suppress dominant or repetitive responses enables individuals to explore more novel and diverse ideas, emphasizing the strong (and potentially causal) link between de-fixation and exploration capacity. Furthermore, while we model de-fixation and exploration as alternating processes, it is important to acknowledge that other possibilities exist. For instance, in a computational framework that models creative processes through neurodynamics, these two processes can occur not only in series but also in parallel. In such cases, multiple attractors (representing stable brain states associated with specific thoughts) can be activated or destabilized simultaneously38.
Lastly, our model is designed to emulate the collective performance across individuals rather than individual variances. Previous research suggests that the RAT performance depends primarily on problem difficulty, particularly the degree of association remoteness, rather than differences in individual linguistic capabilities39. Contradictory findings, however, indicate that higher intelligence40 and advanced literacy skills41 are associated with better performance on RAT problems. The impact of these individual traits on the Aha experience, however, remains unclear and largely unexplored. Understanding the factors that contribute to individual differences in achieving insight is essential for understanding creativity, an inherently individual trait. To achieve this, in-depth behavioral and neural investigations, with questions and cues customized for individuals or by incorporating an individual’s vocabulary capacity into the model for data fitting, are required to shed light on the intricate dynamics of the insightful problem-solving process.
Fixation vs biased search
The fixation cues in our study aim to trap participants’ search processes; however, this might only bias the search path and does not directly imply true fixation or functional fixedness, which typically refer to mental rigidity or an inability to shift perspectives42. In other words, the cues may steer participants toward incorrect mental paths rather than completely trapping them in a rigid mental state.
If we assume that fixation cues only bias the search process, where distant cues drag the starting point further away from the solution, this could explain the observed negative correlation with accuracy (Fig. 5I). Specifically, the smaller the bias, the easier it is to find the solution. Similarly, a smaller bias correlates with a higher Aha rate (Fig. 5K) and shorter reaction times in Aha trials (Fig. 5J, left). These results straightforwardly indicate that the Aha experience is associated with faster solutions. However, it remains challenging to explain why smaller bias leads to longer reaction times in No-Aha trials (Fig. 5J, right). Furthermore, modeling this bias effect is complex. If we assume the bias alters the starting point of the search, what mechanism determines the subsequent search path? Does the search process simply take more time to start (e.g., to overcome a strong bias) compared to typical searches starting from more frequently linked words?
Aligned with this perspective, another potential theory is that the de-fixation process serves to rule out biases and resume searching. If we assume that closer cues are prioritized for elimination, reaction times would be shorter for closer cues (as shown in Fig. 5J, left). This suggests that Aha experiences are linked to the ability to effectively rule out cues (de-fixation). In contrast, this ability could be absent in No-Aha trials, where closer cues are not effectively removed, leading to a stronger delay in the search process (Fig. 5J, right).
Nonetheless, it is important to distinguish between bias and fixation as potential factors. To address this, future studies could measure whether participants experience difficulty breaking away from the influence of fixation cues even when new cues are provided. If their thought processes remain trapped and resistant to alternative paths, this would support the interpretation of true fixation rather than mere bias. Another approach to test fixation strength would involve conditions where participants are explicitly encouraged to ignore the cues or provided hints to break fixation. If participants still struggle under these conditions, this would further support the idea that the cues induce genuine fixation.
Beyond RAT
Insight has been demonstrated across a diverse range of tasks and domains, suggesting that it may involve domain-general cognitive processes43. This phenomenon is not limited to verbal challenges, such as RAT, but extends to visual and arithmetic puzzles like the nine-dot problem44, the two-string problem45, the matchstick problem46, and the hidden-rule discovery task47,48. These examples all share a common underlying mechanism: the need for a solver to restructure or reinterpret the problem space in novel ways, illuminating new paths to a solution, an idea central to the representational change theory46,49. While our model highlights the unique search processes involved, it does not account for the aspect of representational change. Although RAT problems typically require straightforward strategies to solve, they often involve interpreting the question words in different ways to broaden the search for a solution.
Limitations and directions for future research
One clear limitation of our study is the small number of questions used, as the tasks were designed to be bundled with others. Increasing both the number and variety of questions and cues would certainly strengthen the study. On a more conceptual note, a key limitation is whether relying on the Aha experience alone is sufficient to identify insightful problem-solving. Insight can occur when the solution is discovered quickly, often resulting in a short reaction time. This explains why easier questions are associated with more Aha experiences (reflected in the negative coefficient between Aha rate and danswer in Fig. 5K). However, insight can also arise after a prolonged search, where resolving an impasse leads to the solution. This suggests that stronger fixation cues may result in more Aha experiences (reflected in the negative coefficient between Aha rate and dcues in Fig. 5K). To evaluate this interplay and gain a more comprehensive understanding of insight, it will be essential to incorporate self-reports of impasse experiences alongside Aha ratings. Furthermore, this approach will also help test the theories of fixation versus biased search and de-fixation versus exploration (as discussed earlier).
To develop a more comprehensive model of insight, future work should explore tasks allowing multiple strategies and quantifiable solution spaces. Capturing how people transition between different representations of a problem could elucidate the mechanisms driving insightful solutions. Additionally, individual differences in insight ability remain poorly understood. Investigating how factors like cognitive flexibility, working memory, and previous knowledge influence insightful problem-solving could identify the key cognitive capacities involved. Finally, combining computational modeling with neuroimaging techniques could reveal the neural underpinnings of insight and how they interact with the postulated cognitive processes.
Acknowledgements
We thank Dr. Felix B. Kern for valuable suggestions and proofreading. We also thank Junko Taniai, Mika Matsuo, and Dr. Miyoko Street for experimental support and Japanese translation. This work was supported by the World Premier International Research Center Initiative (WPI), MEXT, Japan (to Z.C.C.), IRCN-Daikin SCP (to Z.C.C.), and JSPS KAKENHI (23K10397) (to C.W.). The funders had no role in study design, data collection and analysis, the decision to publish, or the preparation of the manuscript.
Author contributions
Z.C.C. conceptualized the study, designed and conducted the analysis and modeling, designed the behavioral tasks, designed the websites, and wrote the first draft of the paper. F.H. designed the websites, implemented the websites, and helped with the editing. C.W. designed the behavioral tasks, designed the websites, and helped with the editing. All authors contributed to and have approved the final paper.
Peer review
Peer review information
Communications Psychology thanks Melissa Schilling, Edward Bowden and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Jixing Li and Marike Schiffer. A peer review file is available.
Data availability
All the study materials and behavioral data are publicly available (https://osf.io/zbv9t/).
Code availability
All the analysis and modeling scripts are publicly available (https://osf.io/zbv9t/).
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
The online version contains supplementary material available at 10.1038/s44271-025-00235-4.