ABSTRACT
Objective:
With the rise in popularity of the theory of mind (ToM), defined as the ability to understand that others’ beliefs, desires, and intentions may differ from one’s own, numerous measurement tools have been developed since the 1990s. However, the use of disparate tasks to measure the same construct, the lack of a standardized task battery, and the inadequate validity/reliability of existing ToM measures have contributed to inconsistent research findings. This study developed the HACETTEPE-Computer Based Theory of Mind Battery (HACETTEPE-CBToM), which utilizes three-dimensional colored animations, focuses on social interactions, and integrates cognitive/affective dimensions. Comprehensive validity and reliability studies were conducted.
Method:
The validity and reliability studies of the battery, which consists of eight scenarios (four second-order false belief tasks: two cognitive/two affective, and four irony tasks: two cognitive/two affective), were carried out with 214 healthy adults aged 18-36.
Results:
Construct validity was evaluated through confirmatory factor analysis, and the fit indices indicated an excellent model fit [χ2(19, N=214)=26.14, p>0.05, χ2/df=1.38, RMSEA=0.042, SRMR=0.05, GFI=0.97, AGFI=0.95, CFI=0.98, TLI(NNFI)=0.97]. For criterion validity, a positive and significant correlation was found between the scores of the HACETTEPE-CBToM Battery and the Dokuz Eylül Theory of Mind Scale (r=0.32, p<0.05). The inter-rater reliability coefficient was rs=0.94, and the internal consistency coefficient was α=0.72.
Conclusion:
The HACETTEPE-CBToM Battery is a culturally appropriate, ecologically valid, and psychometrically robust tool for detailed assessment of ToM.
Keywords: Cognitive sciences, neuropsychologic test, reliability and validity, theory of mind
INTRODUCTION
Theory of mind (ToM) is defined as the ability to understand that the knowledge, emotions, desires, intentions, and wishes of others may differ from one’s own. It is considered one of the fundamental cognitive skills at the core of social interaction (Wellman 1990). According to Frith (1992), ToM is an internal or mental meta-representation of oneself and others. In this context, ToM, generally described as the ability to make inferences about the mental states of others, is a fundamental mechanism that enables individuals to gain insight into their own minds, including their psychiatric issues (Wiffen et al. 2013). For this reason, the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) (American Psychiatric Association 2013) identifies ToM as one of the core cognitive functions that can be affected by neurocognitive disorders.
Various tasks are used to measure ToM, categorized as cognitive and affective, including first-order true and false belief, second-order true and false belief, metaphor comprehension, irony comprehension, faux pas recognition, white lie detection, and bluff understanding (Happé 1994, Corcoran et al. 1995, Happé et al. 1998, Stone et al. 1998, Rowe et al. 2001, Gregory et al. 2002, German and Hehman 2006, Sprong et al. 2007, Değirmencioğlu 2008, Bora 2009).
Cognitive ToM is defined as the ability to infer others’ thoughts and beliefs, while affective ToM refers to the ability to infer others’ emotions and feelings. The most valid method for measuring cognitive ToM is through true and false belief tasks (Wimmer and Perner 1983). First-order true belief tasks involve scenarios where both the protagonist and the participant share the same accurate knowledge about reality, requiring the participant to understand what the protagonist knows. In contrast, first-order false belief tasks involve situations where the knowledge of the protagonist and the participant about reality diverges, requiring the participant to reference the false belief of the protagonist. In this case, a participant capable of understanding that others may hold false beliefs can predict what the person knows accordingly (Stone et al. 1998). Second-order true and false belief tasks, on the other hand, involve scenarios where one person reflects on what a second person knows about a third person’s belief. In essence, the primary difference between first-order and second-order false belief tasks is the introduction of a third person, making the task cognitively more demanding. False belief tasks are crucial for ToM as they assess the ability to disregard one’s own knowledge and understand that another person may possess information different from one’s own. This ability is a key indicator of ToM competence.
Affective ToM is often assessed using tasks that involve pictorial (visual) stimuli representing complex emotional states or narrative (verbal) tasks describing the emotions of a protagonist (Baron-Cohen et al. 2001, Shamay-Tsoory et al. 2006, Shamay-Tsoory et al. 2007, Bottiroli et al. 2016). On the other hand, tasks involving indirect linguistic expressions, such as social faux pas, irony, metaphor, bluff, and white lies, evaluate individuals’ ability to understand the true message underlying the words (Happé 1994, Brüne and Brüne-Cohrs 2006, Sprong et al. 2007). According to Youmans (2004, as cited in Değirmencioğlu 2008), these tasks represent more complex and nuanced ToM evaluations, requiring the interpretation of nonliteral speech. Harrington et al. (2005) suggest that these tasks reflect the pragmatic understanding of speech and require comprehension of what the speaker knows, believes, or intends (Baron-Cohen et al. 1999).
Irony, one of the indirect linguistic expression tasks that can have cognitive and emotional types, is associated with the second-order false-belief tasks (Herold et al. 2002, Shamay-Tsoory et al. 2005, Shamay-Tsoory and Aharon-Peretz 2007). Irony is often used to implicitly or subtly criticize a person or situation or to express negativity, frequently accompanied by disdain, contempt, or disapproval. The individual employing irony creates a contextually incongruent situation, assuming that the recipient recognizes this discrepancy and understands the underlying meaning (i.e., the opposite of what is explicitly stated is intended). If the recipient accurately interprets the statement, they do not perceive the irony as a mistake or a lie.
The lack of comprehensive and methodologically controlled psychometric evaluations (particularly construct and criterion validity) of existing ToM tasks in the literature (Harrington et al. 2005, Sprong et al. 2007, Bora 2009, Bora et al. 2009), the limited number of studies addressing their face and ecological (external) validity (Corcoran and Frith, 2003), and the lack of comparability among findings obtained from different ToM tasks have led to inconsistent results and raised questions about the robustness of the measurement tools. Consequently, the need for a universally agreed-upon, objective, and reliable ToM assessment has been frequently emphasized in the literature (Harrington et al. 2005).
ToM tasks, initially developed for children and later applied to adults without any structural or content modifications, often result in ceiling effects due to their simplicity (Saltzman et al. 2000, Henry et al. 2013). It is suggested that this issue could be resolved by developing more challenging ToM tasks. In Turkey, ToM has been predominantly studied in child samples (Sarı 2011, Karakelle and Ertuğrul 2012, Kaysılı 2013, Şahin et al. 2020, Tülü and Ergül 2022), with a growing emphasis on the need for its investigation in adult samples as well (Küçük 2018, Ertuğrul Yaşar 2022).
In recent years, the increasing demand for ToM measurement in both clinical and healthy adult populations has led to a rise in the adaptation of scales developed for Western cultures into Turkish (Yıldırım et al. 2011, Tanrıverdi 2022, Törenli Kaya et al. 2023). However, existing ToM tasks, often composed of comic-strip stories with hand-drawn illustrations, have been criticized for low ecological validity, making it difficult to generalize findings to real-life contexts (Henry et al. 2013). Tests such as the Movie for the Assessment of Social Cognition (MASC) (Dziobek et al. 2006), which use video stimuli and claim to address external validity issues, still face limitations in psychometric evaluations, task details, diversity, and the number of videos included. Similarly, the Edinburgh Social Cognition Test (ESCoT) (Baksh et al. 2018), which comprises 11 videos featuring animations with social interactions and distinguishes between cognitive and affective ToM, has been criticized for cultural inappropriateness in some visuals (e.g., right-hand drive vehicles and left-hand traffic) and for leaving the scoring of emotions to the discretion of the test administrator, complicating standardization (Tanrıverdi 2022). Additionally, ESCoT’s psychometric evaluation in Turkish samples is limited by its small sample size (only 37 participants) and the absence of construct validity analysis.
The aim of this study is: a) to develop the HACETTEPE-Computer-Based Theory of Mind (HACETTEPE-CBToM) Battery, which addresses the issues in existing ToM tasks, demonstrates high external validity, avoids ceiling effects, includes different types of ToM, and is psychometrically and methodologically robust, and b) to experimentally measure the cognitive and affective dimensions of ToM.
METHOD
The ethical approval required for the applications was obtained from the Hacettepe University Non-Interventional Clinical Research Ethics Committee with the decision dated 13.06.2017 and numbered 16969557-868.
Stages of the HACETTEPE-CBToM Battery Development Process
Stage I: Development of Second-Order False Belief and Irony Comprehension Scenarios
Within the scope of the battery, 14 cognitive second-order false belief (2FB_C), 14 affective second-order false belief (2FB_A), 14 cognitive irony comprehension (I_C), and 14 affective irony comprehension (I_A) tasks were developed, resulting in 28 second-order false belief (2FB) and 28 irony comprehension (I) tasks. The task scenarios were created by a seven-member expert group consisting of academic psychologists. The expert group divided the tasks into 28 cognitive (2FB and I scenarios) and 28 affective (2FB and I scenarios), producing a total of 56 scenarios. Subsequently, adjustments were made to ensure equivalence in terms of the number of characters and whether the characters had knowledge of reality (for details, see Aslankara 2019). After these adjustments, evaluation forms designed to assess the scenarios’ ability to represent ToM and its dimensions (cognitive/affective) were administered to 126 university students (x̄=20.19, SD=3.73). The forms used a 5-point Likert scale (Does Not Represent at All, Does Not Represent, Neutral, Represents, Completely Represents).
The cognitive and affective scenarios were presented in a randomized order, ensuring that the same type of scenario (e.g., cognitive/cognitive/cognitive) did not appear consecutively. To reduce systematic error, four different versions of the evaluation forms (Forms A, B, C, and D) with varied scenario sequences were used.
Stage II: Item Difficulty Analyses
Based on the item difficulty analysis for 2FB and I scenarios, one 2FB scenario (Scenario 6) was excluded as it was classified as “difficult but not distinctive.” Another scenario (Scenario 4), while categorized as “difficult but distinctive,” was also eliminated because its average ability to represent second-order false belief tasks, as assessed using a 5-point Likert scale, was below 3 (x̄=2.98, SD=1.25).
After this elimination process, 26 2FB scenarios remained. Among these, the 4 cognitive and 4 affective 2FB scenarios with the highest representational strength and categorized as “difficult but distinctive” were selected, totaling 8 2FB scenarios. Similarly, from the I scenarios, all of which were deemed “difficult but distinctive,” the 4 cognitive and 4 affective I scenarios with the highest representational strength were chosen.
In total, 16 ToM scenarios were selected: 8 2FB scenarios (4 cognitive and 4 affective) and 8 I scenarios (4 cognitive and 4 affective).
Stage III: Expert Evaluation
The 16 ToM scenarios selected in Stage II, along with the referee evaluation form used previously, were presented to a group of 23 experts (2 psychiatrists and 21 psychologists) for their assessment. Based on expert recommendations, the 8 scenarios with the highest representational strength were selected, consisting of 4 2FB (2 cognitive and 2 affective) and 4 I scenarios (2 cognitive and 2 affective).
The selected I scenarios were balanced in terms of the gender of the characters, while the 2FB scenarios were balanced both in terms of character gender and the number of individuals in the scenario aware of the “final reality.” For example, in one cognitive 2FB scenario, only one person knows the “final reality,” while in the other cognitive scenario, two people know it.
The means and standard deviations of the selected 8 scenarios in terms of their representational strength for cognitive/affective and 2FB/I tasks are provided in Table 1.
Table 1.
The Means and Standard Deviations of the Representational Power of Selected 8 ToM Scenarios for Cognitive/Affective and Second-Order False Belief/Irony Comprehension Tasks Assessed by a 5-Point Likert Scale
| Scenario No | Dimension of Task | x̄ | SD | ToM Task | x̄ | SD |
|---|---|---|---|---|---|---|
| 12 | Affective | 4.67 | 0.48 | Second-Order False Belief | 4.43 | 0.60 |
| 22 | Affective | 3.76 | 1.64 | Second-Order False Belief | 4.24 | 1.00 |
| 9 | Cognitive | 3.71 | 1.71 | Second-Order False Belief | 4.29 | 0.85 |
| 11 | Cognitive | 4.62 | 0.50 | Second-Order False Belief | 4.71 | 0.46 |
| 22 | Affective | 4.10 | 1.14 | Irony | 4.71 | 0.56 |
| 26 | Affective | 4.14 | 0.79 | Irony | 4.67 | 0.48 |
| 5 | Cognitive | 4.43 | 0.60 | Irony | 4.67 | 0.48 |
| 25 | Cognitive | 4.57 | 0.51 | Irony | 4.62 | 0.50 |
x̄ = Mean, SD = Standard Deviation, ToM Task = Theory of Mind Task
As a result of the three stages outlined above, 8 ToM scenarios were selected from the initial pool of 56 scenarios to form the HACETTEPE-CBToM Battery. These 8 scenarios were converted into animated, colorful videos, and an original software program for the ToM Battery was developed.
Stage IV: Development of Questions for ToM Scenarios
For the 8 selected ToM scenarios, a total of 53 questions were created for 2FB scenarios, considering all possible question combinations based on the temporal precedence and succession of each character’s perspective. However, due to concerns that the number of questions might be excessive for the potential sample (healthy adults and psychiatric/neurological patient groups), the number of questions was reduced to 14.
For I scenarios, a total of 10 questions were developed to determine whether the irony was understood, along with logic and reality-based questions. (For detailed information on the criteria used to determine the questions for 2FB and I scenarios, see Aslankara 2019.)
Stage V: Expert Evaluation of Questions for ToM Scenarios
The questions updated based on the recommendations of psychiatrists specializing in Theory of Mind were answered by 10 doctoral students studying in the field of psychology. Separate answer keys were then created for each ToM scenario.
Stage VI: Refinements to the Answer Key Pool for ToM Tasks
The created answer keys were revised to ensure they were more detailed, clear, and understandable for scorers. Updates were made in terms of format (language, expression, and Turkish grammar rules) and content (sentence and word restrictions, and specific informational notes added to each scenario’s answer key for practitioners). (For detailed information, see Aslankara 2019).
Stage VII: Validity Study
Content and Face Validity
The content and face validity of the ToM scenarios were assessed through the feedback and revisions provided by 23 experts, excluding the researchers (For detailed information, see Stage III: Expert Evaluation).
Construct Validity
A review of the literature reveals that the psychometric properties (particularly construct and criterion validity) of many existing ToM tasks have not been thoroughly examined (Sprong et al. 2007). Similarly, studies on the face and ecological validity of these tasks are limited (Corcoran and Frith 2003).
In this study, a hypothesis was developed regarding the structure of the latent variables (2FB and I) and their relationships, based on a theoretical framework. The validity of these variables was examined using confirmatory factor analysis (CFA).
Data for the CFA were collected from 214 healthy university students (mean age=20.56, SD=3.19) studying in various departments of universities in Ankara. None of the participants had participated in the earlier stages of the study, and they reported no psychiatric diagnoses. Of the participants, 77.1% were female and 22.9% were male.
Participants were excluded from the study if they had untreated hearing or vision impairments, physical disabilities preventing them from completing the computer-based application, color blindness, a history of psychiatric, neurological, or psychological disorders requiring hospitalization, ongoing psychiatric medication use, recent cessation of such medication, or the use of other drugs that could cause cognitive impairment.
Since the scores from the developed ToM tasks were ordinal (0, 0.5, and 1 point), z-scores (±3.29) typically used for continuous variables to detect outliers (Field 2009) could not be applied during data cleaning. Normality tests indicated that the dataset did not exhibit skewness or kurtosis and followed a normal distribution.
The ToM battery tasks were composed of experimentally developed scenarios and scenario-specific, equivalent questions. For instance, Question 3 in the 2FB_C1 scenario is equivalent to Question 3 in the 2FB_C2 scenario in terms of the timing of the event in the scenario, the number of characters, and thus the level of difficulty. The same equivalence applies to the four scenarios under each latent variable (2FB and I), making it necessary to test whether the scenarios indeed belong under the targeted latent variables. To achieve this, a model was constructed using the mean scores obtained from the 8 ToM scenarios (10 questions for 2FB and 6 questions for I) and subjected to CFA.
In CFA, it is first necessary to define the model to test the proposed hypotheses. A model is considered identified if a unique numerical solution or value can be assigned to each parameter (Eroğlu 2003). For testing hypotheses generated based on relationships between variables, the model must be overidentified, meaning it has more known data points than estimated parameters (Khine 2013). Using the T-rule (Kenny 2011) to calculate the degrees of freedom, it was determined that the proposed model is overidentified and achieves good model-data fit.
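The overidentification check described above can be illustrated with a short calculation. The sketch below applies the t-rule to a two-factor model with eight observed scenario scores. The parameter count assumes that one loading per factor is fixed to 1 for scaling, which is an assumption about the parameterization rather than a detail reported here, and the function name `t_rule` is hypothetical.

```python
def t_rule(n_observed: int, n_free_params: int) -> tuple[int, int]:
    """Return (number of known variances/covariances, degrees of freedom)."""
    known = n_observed * (n_observed + 1) // 2
    return known, known - n_free_params


# Eight observed scenario scores, four per latent factor (2FB and I).
N_OBSERVED = 8

# Assumed parameterization: one loading per factor fixed to 1 for scaling.
free_parameters = {
    "factor_loadings": 6,
    "error_variances": 8,
    "factor_variances": 2,
    "factor_covariance": 1,
}

known, df = t_rule(N_OBSERVED, sum(free_parameters.values()))
print(known, df)  # 36 19
```

Under these assumptions the model has 36 known values and 17 free parameters, so df=19, which agrees with the degrees of freedom reported for the CFA. Fixing the two factor variances to 1 and freeing all eight loadings yields the same count. A positive df is a necessary condition for overidentification; it does not by itself say anything about model fit, which is assessed separately through the fit indices.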
Another assumption of factor analysis is that the data should exhibit multivariate normality. Using the AMOS 24 software package, the Mardia standardized multivariate kurtosis value was calculated as 5.86. A comparison (5.86 < 8) indicated that the dataset satisfies the assumption of multivariate normality.
Based on these preliminary evaluations, a first-order CFA model was constructed to test the latent factors in the structure of the ToM battery and their interdependent effects using the Maximum Likelihood (ML) estimation method in the AMOS 24 software. The latent variables 2FB and I were represented as ellipses (Figure 1). These two factors were interconnected and shown with a bidirectional arrow. The eight observed variables representing the factors were depicted as rectangles. 2FB_C1, 2FB_C2, 2FB_A1, and 2FB_A2 loaded onto the 2FB factor; I_C1, I_C2, I_A1, and I_A2 loaded onto the I factor.
Figure 1.

Standardized Results and First-Order CFA Results of HACETTEPE-CBToM (Two-Factor Model)
Note: The values shown in the figure (from right to left) are: error variances (unique/error variances) for ToM subtests, standardized factor loadings, and the correlation coefficient between the two factors.
The results of the CFA are illustrated in Figure 1, and the fit indices for the proposed model are presented in Table 2. Since there is no definitive consensus on which fit indices should be evaluated or accepted as standard (cited in Çapık 2014, Karagöz and Ağbektaş 2016), commonly used fit indices, including χ², df, χ²/df, RMSEA, SRMR, GFI, AGFI, CFI, and TLI (NNFI), were reported. The CFA results showed that the model had an excellent fit: χ²(19, N=214)=26.14, p>0.05, χ²/df=1.38, RMSEA=0.042 (90% CI: 0.000–0.078), SRMR=0.05, GFI=0.97, AGFI=0.95, CFI=0.98, TLI (NNFI)=0.97.
Table 2.
CFA Fit Index Values for HACETTEPE-CBToM
| Model | χ2 | df | χ2/df | RMSEA | SRMR | GFI | AGFI | CFI | TLI (NNFI) |
|---|---|---|---|---|---|---|---|---|---|
| CFA | 26.14 | 19 | 1.38 | 0.042 (0.000 – 0.078)* | 0.047 | 0.971 | 0.946 | 0.978 | 0.967 |
N=214, p>0.05.
χ2 = Chi-Square, df = Degrees of Freedom, RMSEA = Root Mean Square Error of Approximation, SRMR = Standardized Root Mean Square Residual, GFI = Goodness of Fit Index, AGFI = Adjusted Goodness of Fit Index, CFI = Comparative Fit Index, TLI (NNFI) = Tucker-Lewis Index (Bentler-Bonett Non-normed Fit Index)
*The values given in parentheses are the 90% confidence interval values.
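Two of the reported indices can be recomputed directly from the χ² statistic, the degrees of freedom, and the sample size. The following is a minimal sketch that assumes the common point-estimate formula RMSEA = sqrt(max(χ² − df, 0) / (df · (N − 1))); some programs use N instead of N − 1, which changes the result only marginally at this sample size. The function name `rmsea` is illustrative.

```python
import math


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Point estimate of RMSEA; a negative numerator is truncated at zero."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Values reported for the two-factor CFA model.
chi_square, df, n = 26.14, 19, 214

ratio = chi_square / df
rmsea_point = rmsea(chi_square, df, n)

print(round(ratio, 2))        # 1.38
print(round(rmsea_point, 3))  # 0.042
```

Both values agree with those reported in Table 2. The confidence interval for RMSEA requires the noncentral χ² distribution and is not reproduced in this sketch.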
The subtest factor loadings for the model ranged from 0.39 to 0.83 for 2FB scenarios and 0.35 to 0.69 for I scenarios. All subtests loaded onto the latent factors were found to be statistically significant (p<0.001). Additionally, all factor loadings exceeded 0.32, the acceptable threshold for factor loadings suggested by Tabachnick and Fidell (2014). Although this threshold primarily applies to exploratory factor analysis (EFA), these and other loading ranges (≥0.70 excellent, 0.63 very good, 0.55 good, 0.45 moderate, and 0.32 weak) are also considered useful guidelines for CFA coefficients, which account for measurement error (DiStefano and Hess 2005). Finally, the explanatory power of the model was evaluated by calculating the determination coefficient (R²) for the constructs. According to Falk and Miller (1992), R² values should be ≥0.10. The findings showed that all R² values exceeded this threshold. The CFA results are presented in Table 3.
Table 3.
CFA Results of HACETTEPE-CBToM
| Latent Variable | Observed Variable | β | t | SE | R2 |
|---|---|---|---|---|---|
| 2FB | 2FB_C1 | 0.50 | 7.09* | 0.02 | 0.25 |
| | 2FB_C2 | 0.39 | 5.37* | 0.02 | 0.15 |
| | 2FB_A1 | 0.83 | 12.29* | 0.02 | 0.70 |
| | 2FB_A2 | 0.76 | 11.06* | 0.02 | 0.57 |
| I | I_C1 | 0.35 | 4.44* | 0.01 | 0.12 |
| | I_C2 | 0.49 | 6.34* | 0.01 | 0.24 |
| | I_A1 | 0.69 | 8.84* | 0.01 | 0.47 |
| | I_A2 | 0.68 | 8.77* | 0.01 | 0.46 |
β = Factor Loadings, t = Test Statistic, SE = Standard Error, R2 = Coefficient of Determination
*p<0.001.
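The interpretation of the loadings can be made explicit with a small sketch. It labels each standardized loading using the ranges quoted above (≥0.70 excellent, 0.63 very good, 0.55 good, 0.45 moderate, 0.32 weak) and checks that each R² is approximately the squared standardized loading, which is expected when every indicator loads on a single factor. The three-decimal loadings are those listed in Table 6; the helper name `classify` is hypothetical.

```python
# (standardized loading from Table 6, R² from Table 3) for each scenario.
results = {
    "2FB_C1": (0.505, 0.25),
    "2FB_C2": (0.392, 0.15),
    "2FB_A1": (0.835, 0.70),
    "2FB_A2": (0.757, 0.57),
    "I_C1": (0.352, 0.12),
    "I_C2": (0.491, 0.24),
    "I_A1": (0.687, 0.47),
    "I_A2": (0.680, 0.46),
}

# Lower bounds of the loading ranges quoted in the text.
RANGES = [
    (0.70, "excellent"),
    (0.63, "very good"),
    (0.55, "good"),
    (0.45, "moderate"),
    (0.32, "weak"),
]


def classify(loading: float) -> str:
    """Return the label of the highest range whose lower bound is met."""
    for lower_bound, label in RANGES:
        if loading >= lower_bound:
            return label
    return "below threshold"


labels = {name: classify(beta) for name, (beta, _) in results.items()}

# Largest absolute gap between a squared loading and its reported R².
max_gap = max(abs(beta ** 2 - r2) for beta, r2 in results.values())

for name, (beta, r2) in results.items():
    print(f"{name}: {labels[name]}, beta^2={beta ** 2:.3f}, R2={r2:.2f}")
```

In this sketch the squared loadings and the reported R² values differ by less than 0.01, and every R² exceeds the 0.10 threshold of Falk and Miller (1992).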
The factor score weights of the scenarios are presented in Table 4. Upon examining the table, it is evident that each of the developed scenarios falls under its respective latent variable as expected. Additionally, the standardized correlation value between the latent variables was found to be 0.49, which is statistically significant (p<0.001) (see Table 5). This result suggests that 2FB and I represent distinct structures within ToM tasks. According to DiStefano and Hess (2005), the correlation between latent variables can serve as evidence of construct validity. Therefore, the statistically significant correlation (r=0.49, p<0.001) provides support for the construct validity of the developed battery.
Table 4.
Factor Score Weights of HACETTEPE-CBToM Scenarios
| | 2FB_C1 | 2FB_C2 | 2FB_A1 | 2FB_A2 | I_C1 | I_C2 | I_A1 | I_A2 |
|---|---|---|---|---|---|---|---|---|
| 2FB | 0.074 | 0.070 | 0.296 | 0.250 | 0.013 | 0.027 | 0.056 | 0.047 |
| I | 0.008 | 0.008 | 0.033 | 0.028 | 0.067 | 0.144 | 0.293 | 0.250 |
Table 5.
Standardized Correlation and t-Value Between Latent Variables of HACETTEPE-CBToM
| | | | β | SE | t | p |
|---|---|---|---|---|---|---|
| 2FB | <--> | I | 0.49 | 0.08 | 6.29 | 0.000 |
p<0.001; β = Standardized Correlation Coefficient, SE = Standard Error, t = Test Statistic
In summary, CFA, which is used to determine the construct validity of a measurement tool—that is, how accurately it measures abstract constructs (Çokluk et al. 2014)—evaluates the consistency between the proposed theoretical model and the observed covariance, as well as whether similar constructs are measured with the same level of accuracy across different samples (measurement invariance) (Kline 1998). Within this framework, the findings of the CFA demonstrate that the proposed theoretical construct, ToM, was successfully measured, and construct validity was established.
Based on the CFA results, convergent and discriminant validity, which are methods of construct validity, were also tested for the proposed model. These methods involve evaluating the measurements in relation to each other rather than using an external standard (Kline 2011). For assessing convergent validity, which indicates the extent to which measurements within the same construct are related, standardized factor loadings can be utilized. While factor loadings of ≥0.70 indicate good convergent validity, loadings of ≥0.50 are considered acceptable (Hair et al. 2013, Ekinci Demirelli 2024, Yaşlıoğlu 2017). In this context, the Composite/Construct Reliability (ComR/CR) values for both 2FB and I exceed both the corresponding Average Variance Extracted (AVE) values and the 0.70 threshold. The AVE values for 2FB and I do not exceed 0.50, although the AVE value for 2FB is close to 0.50. Moreover, when the AVE value is below 0.50 but the ComR value exceeds 0.70, the latent factors are assumed to be reliable (Hair et al. 2021).
To assess discriminant validity, which indicates the distinctiveness of constructs, two additional values need to be calculated: the Maximum Shared Squared Variance (MSV) and the Average Shared Squared Variance (ASV). The MSV represents the square of the highest shared variance (the square of the correlation between two constructs) between one factor and any other factor (Hair et al. 2013, Yaşlıoğlu 2017). Since there are only two latent variables in this study, the ASV could not be computed, and discriminant validity was evaluated based on MSV and AVE. The MSV value (0.24) is smaller than the AVE values for both latent variables (2FB: 0.42; I: 0.32). Additionally, the square roots of the AVE values (2FB: 0.65; I: 0.57) are greater than the interfactor correlation value (0.49). Beyond these measurements, the Heterotrait-Monotrait ratio of correlations (HTMT), proposed by Henseler et al. (2015) and frequently used in recent years, was also calculated. An HTMT value below 0.90 indicates that discriminant validity is established (Henseler et al. 2015). The HTMT value for this study was calculated as 0.53, providing further evidence for discriminant validity.
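The discriminant validity comparisons can be retraced with a short sketch. It computes AVE as the mean of the squared standardized loadings, MSV as the squared interfactor correlation, and HTMT as the mean between-factor correlation divided by the geometric mean of the average within-factor correlations. The loadings are those of Table 6 and the scenario correlations are transcribed from Table 8; whether the reported HTMT was computed from exactly these scenario-level correlations is an assumption, and all helper names are hypothetical.

```python
from itertools import combinations
from math import sqrt

# Standardized loadings (Table 6) and the latent correlation (Table 5).
loadings = {
    "2FB": [0.505, 0.392, 0.835, 0.757],
    "I": [0.352, 0.491, 0.687, 0.680],
}
factor_corr = 0.49


def ave(values):
    """Average variance extracted: mean of squared standardized loadings."""
    return sum(v * v for v in values) / len(values)


msv = factor_corr ** 2  # with two factors there is a single shared variance
ave_by_factor = {name: ave(vals) for name, vals in loadings.items()}

# Lower triangle of the scenario correlation matrix (Table 8), in the order
# 2FB_C1, 2FB_C2, 2FB_A1, 2FB_A2, I_C1, I_C2, I_A1, I_A2.
lower = [
    [],
    [0.30],
    [0.41, 0.32],
    [0.36, 0.26, 0.64],
    [0.08, 0.08, 0.06, 0.12],
    [0.25, 0.19, 0.20, 0.22, 0.24],
    [0.25, 0.20, 0.30, 0.29, 0.24, 0.28],
    [0.15, 0.08, 0.25, 0.18, 0.23, 0.36, 0.48],
]


def corr(i, j):
    """Correlation between scenarios i and j (i != j)."""
    return lower[max(i, j)][min(i, j)]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


fb, irony = range(0, 4), range(4, 8)
hetero = mean(corr(i, j) for i in fb for j in irony)
mono_fb = mean(corr(i, j) for i, j in combinations(fb, 2))
mono_irony = mean(corr(i, j) for i, j in combinations(irony, 2))
htmt = hetero / sqrt(mono_fb * mono_irony)

for name, value in ave_by_factor.items():
    # name, AVE, sqrt(AVE), AVE > MSV
    print(name, round(value, 2), round(sqrt(value), 2), value > msv)
print(round(msv, 2), round(htmt, 2))
# 2FB 0.42 0.65 True
# I 0.32 0.57 True
# 0.24 0.53
```

Under these assumptions the sketch returns the AVE, square-root AVE, MSV, and HTMT values reported in the text.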
In summary, both convergent and discriminant validity have been established for the two latent factors. The β, ComR, AVE, and HTMT values for the latent variables are presented in Table 6.
Table 6.
β, ComR, AVE, and HTMT Values of Latent Variables of HACETTEPE-CBToM
| Latent Variable | | Observed Variable | β | ComR | AVE | HTMT |
|---|---|---|---|---|---|---|
| 2FB | ← | 2FB_C1 | 0.505 | 0.94 | 0.42 | 0.53 |
| 2FB | ← | 2FB_C2 | 0.392 | | | |
| 2FB | ← | 2FB_A1 | 0.835 | | | |
| 2FB | ← | 2FB_A2 | 0.757 | | | |
| I | ← | I_C1 | 0.352 | 0.90 | 0.32 | |
| I | ← | I_C2 | 0.491 | | | |
| I | ← | I_A1 | 0.687 | | | |
| I | ← | I_A2 | 0.680 | | | |
β = Factor Loadings, ComR = Composite Reliability, AVE = Average Variance Extracted, HTMT = Heterotrait-Monotrait Ratio of Correlations
Criterion Validity
To assess criterion validity, the correlation between the developed ToM battery and the Dokuz Eylül Theory of Mind Scale (DEZTÖ) (Değirmencioğlu 2008) was examined. While DEZTÖ has been shown to be a valid and reliable scale for use in patients diagnosed with schizophrenia, there is no evidence supporting its validity in healthy adult populations (for details, see Değirmencioğlu 2008). However, the scale was revised and updated in 2018 by Değirmencioğlu and colleagues using the same dataset, under the new name DEZİKÖ. Validity and reliability analyses were conducted again for the updated scale (for details, see Değirmencioğlu et al. 2018).
In the current study, data were collected from 55 volunteer university students who had not participated in the main study to determine criterion validity using DEZTÖ/DEZİKÖ, which has psychometric properties deemed acceptable. A Pearson Product-Moment Correlation Analysis revealed a positive but weak correlation (r=0.32, p<0.05) between the scores from the 8 developed ToM scenarios and the DEZTÖ/DEZİKÖ scores. This pattern is thought to stem from differences in the number of items and the difficulty levels of the items in the two scales.
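The significance of the criterion correlation can be checked by converting r to a t statistic with n − 2 degrees of freedom. The sketch below assumes a two-tailed test at the 0.05 level and takes the critical value for df=53 (approximately 2.006) from a standard t table; that constant and the function name `t_from_r` are assumptions introduced for illustration.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t statistic for a Pearson correlation r based on n paired scores."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


r, n = 0.32, 55
t_value = t_from_r(r, n)

# Assumed two-tailed critical value at alpha = 0.05 for df = 53 (t table).
T_CRITICAL = 2.006

print(round(t_value, 2))     # 2.46
print(t_value > T_CRITICAL)  # True
print(round(r * r, 2))       # 0.1
```

The t statistic exceeds the assumed critical value, consistent with p<0.05, while the squared correlation shows that the two measures share roughly 10% of their variance, which is in line with describing the relationship as weak.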
Stage VIII: Reliability Study
For inter-rater reliability, Spearman’s rho values were used, and for internal consistency reliability, the Cronbach’s α coefficient was calculated. Since the scoring of the battery is ordinal, Cronbach’s α was calculated as the ordinal α (Gadermann et al. 2012, Zumbo et al. 2007). For this purpose, the Product-Moment Correlation Matrix, which is valid for linear relationships, was converted into a Polychoric Correlation (PCC) matrix. This conversion was performed using the POLYMAT-C program, and the ordinal α coefficient was subsequently calculated using SPSS code written for the PCC matrix. The ordinal α coefficients for each scenario are presented in Table 7. For the total internal consistency reliability of the battery, calculated based on average scores, the Cronbach’s α coefficient was found to be 0.72. The correlation matrix between scenarios/items, based on average scores, is provided in Table 8.
Table 7.
Ordinal α Coefficients of HACETTEPE-CBToM
| Scenario | Number of Items | α |
|---|---|---|
| 2FB_C1 | 10 | 0.931 |
| 2FB_C2 | 10 | 0.827 |
| 2FB_A1 | 10 | 0.925 |
| 2FB_A2 | 10 | 0.918 |
| I_C1 | 6 | 0.496 |
| I_C2 | 6 | 0.417 |
| I_A1 | 6 | 0.488 |
| I_A2 | 6 | 0.478 |
α = Alpha Coefficient
Table 8.
Correlation Matrix for HACETTEPE-CBToM Scenario Items
| | 2FB_C1 | 2FB_C2 | 2FB_A1 | 2FB_A2 | I_C1 | I_C2 | I_A1 | I_A2 |
|---|---|---|---|---|---|---|---|---|
| 2FB_C1 | 1 | | | | | | | |
| 2FB_C2 | 0.30** | 1 | | | | | | |
| 2FB_A1 | 0.41** | 0.32** | 1 | | | | | |
| 2FB_A2 | 0.36** | 0.26** | 0.64** | 1 | | | | |
| I_C1 | 0.08 | 0.08 | 0.06 | 0.12 | 1 | | | |
| I_C2 | 0.25** | 0.19** | 0.20** | 0.22** | 0.24** | 1 | | |
| I_A1 | 0.25** | 0.20** | 0.30** | 0.29** | 0.24** | 0.28** | 1 | |
| I_A2 | 0.15* | 0.08 | 0.25** | 0.18** | 0.23** | 0.36** | 0.48** | 1 |
*p<0.05; **p<0.01.
The correlations between scenarios belonging to the same latent variable (2FB with 2FB, I with I) represent the correlations of tasks related to each other. As expected, scenarios under the same latent variable exhibit higher correlation values, while scenarios under different latent variables show lower correlation values. The correlation value between the latent variables is 0.49 and statistically significant.
An α coefficient of around 0.70 is considered “adequate,” around 0.80 is deemed “very good,” and 0.90 or higher is evaluated as “excellent” (Kline 2011). These same criteria apply to the evaluation of ordinal α coefficients as well (Lorenzo-Seva and Ferrando 2015). In this study, the ordinal α coefficients range between 0.83 and 0.93 for 2FB scenarios, suggesting very good to excellent internal consistency, while they range between 0.42 and 0.50 for I scenarios, indicating poor internal consistency. For the total battery, the α coefficient is 0.72, which is considered adequate. An α coefficient below 0.50 suggests that most of the variance in the observed scores is due to random error. However, it is noted that in cases where sample sizes are sufficiently large, lower reliability levels may still be acceptable when latent variable methods are employed (Kline 2011). Additionally, the ComR values for both 2FB and I scenarios exceed 0.70, and the AVE values support reliability. Furthermore, given that the questions for the scenarios were developed through experimental methods, no questions were removed from the I scenarios despite their lower reliability scores.
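As a rough plausibility check on the total-battery coefficient, the standardized form of Cronbach’s α can be computed from the mean inter-scenario correlation, α = k·r̄ / (1 + (k − 1)·r̄). The sketch below uses the correlations transcribed from Table 8. It is an approximation under the assumption of standardized scenario scores; the reported coefficient may have been computed from raw scores, so a small difference is to be expected. The function name `standardized_alpha` is hypothetical.

```python
# Lower triangle of the scenario correlation matrix (Table 8), in the order
# 2FB_C1, 2FB_C2, 2FB_A1, 2FB_A2, I_C1, I_C2, I_A1, I_A2.
lower = [
    [],
    [0.30],
    [0.41, 0.32],
    [0.36, 0.26, 0.64],
    [0.08, 0.08, 0.06, 0.12],
    [0.25, 0.19, 0.20, 0.22, 0.24],
    [0.25, 0.20, 0.30, 0.29, 0.24, 0.28],
    [0.15, 0.08, 0.25, 0.18, 0.23, 0.36, 0.48],
]


def standardized_alpha(k: int, mean_r: float) -> float:
    """Standardized Cronbach's alpha from k components and their mean correlation."""
    return k * mean_r / (1 + (k - 1) * mean_r)


pairwise = [r for row in lower for r in row]  # 28 unique scenario pairs
k = len(lower)                                # 8 scenarios
mean_r = sum(pairwise) / len(pairwise)
alpha_std = standardized_alpha(k, mean_r)

print(len(pairwise), round(mean_r, 2), round(alpha_std, 2))  # 28 0.25 0.73
```

The result of about 0.73 is close to the reported total-battery coefficient of 0.72 and falls in the same “adequate” range.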
The inter-rater reliability study was conducted using the data from 55 volunteer university students collected for the criterion validity study. Two psychologists, who received detailed training on how to score the responses for each scenario in the battery, independently evaluated the data of the 55 participants. The analysis, performed using Spearman’s rho, revealed that the inter-rater reliability coefficient for the total ToM score (sum of 8 scenarios) was rs=0.94 (p<0.001). For individual scenarios, reliability coefficients ranged from 0.57 (I_C1) to 0.997 (2FB_C2). The correlation matrix, along with the significance (p) values for each scenario and the total ToM score, is presented in Table 9.
Table 9.
Inter-Rater Reliability Coefficients Obtained for HACETTEPE-CBToM Scenarios and the Total HACETTEPE-CBToM Score
| N = 55 | Spearman’s rho | p |
|---|---|---|
| 2FB_C1 | 0.907 | <0.001 |
| 2FB_C2 | 0.997 | <0.001 |
| 2FB_A1 | 0.931 | <0.001 |
| 2FB_A2 | 0.808 | <0.001 |
| I_C1 | 0.573 | <0.001 |
| I_C2 | 0.870 | <0.001 |
| I_A1 | 0.778 | <0.001 |
| I_A2 | 0.750 | <0.001 |
| ToM Total | 0.935 | <0.001 |
Spearman’s rho = Spearman’s Rank Correlation Coefficient
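The inter-rater coefficients in Table 9 are rank correlations between the two raters’ scores. A minimal sketch of that computation, assuming tied scores receive their average rank, is given below. The helper names and the two rater vectors are hypothetical and are not taken from the study’s data.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical total ToM scores from two raters for six participants.
rater_a = [6.5, 4.0, 7.5, 3.0, 5.5, 8.0]
rater_b = [6.0, 5.5, 7.5, 3.0, 5.0, 8.0]

print(round(spearman_rho(rater_a, rater_b), 3))
```

Computing rho as a Pearson correlation on average ranks keeps the result valid when raters assign identical scores to several participants, which is likely with a 0, 0.5, and 1 scoring scheme.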
The flowchart summarizing the development stages of the HACETTEPE-CBToM Battery is presented in Figure 2.
Figure 2.

Flowchart Depicting the Development Stages of the Tasks in the HACETTEPE-CBToM Battery
DISCUSSION
The literature frequently emphasizes the need for a test that addresses the methodological issues in ToM measurement outlined in the introduction, provides an objective and reliable assessment of ToM, and demonstrates robust psychometric properties (Harrington et al. 2005).
In studies conducted with adults, ToM is often examined using a single task, which makes convergent validity difficult to evaluate. For example, a review of 30 articles that examined ToM in adults diagnosed with schizophrenia using multiple ToM tasks (first- and second-order false belief, irony, and metaphor tasks) found that only five reported convergent validity (Harrington et al. 2005). Findings on the discriminant validity of the frequently used first- and second-order false belief tasks are also inconsistent. Tasks with discriminant validity are crucial for determining which ToM tasks are independent of intelligence or, in clinical studies, which symptoms are associated with impaired ToM performance (Harrington et al. 2005). As highlighted by Harrington et al. (2005), validity and reliability studies for ToM tasks are urgently needed, in part because of the growing confusion over which of the increasing number of ToM tasks is the most reliable measure of ToM.
The DEZTÖ, developed in Turkey by adapting tasks used in the international literature and addressing various aspects of ToM, has been shown to be a valid and reliable tool for use in patients diagnosed with schizophrenia (Değirmencioğlu 2008). In a subsequent study, the updated version, renamed DEZİKÖ, was also found to be valid for use in healthy adult populations (Değirmencioğlu et al. 2018).
In this study, unlike DEZTÖ/DEZİKÖ and other ToM tasks in the literature, the HACETTEPE-CBToM Battery was developed specifically for healthy adults. Its tasks have high difficulty levels and ecological validity (reflecting real-life events) and are presented through three-dimensional, colorful, and vivid animations. A comprehensive series of validity and reliability studies was conducted for this battery. Additionally, to address the disadvantages of the binary scoring system (e.g., 0-1 or Yes-No) used in existing ToM tests, a graded scoring system (0, 0.5, and 1 points) was preferred.
The CFA conducted for construct validity demonstrated excellent model fit [χ²(19, N=214)=26.14, p>0.05, χ²/df=1.38, RMSEA=0.042, SRMR=0.05, GFI=0.97, AGFI=0.95, CFI=0.98, TLI (NNFI)=0.97]. The subtest factor loadings ranged between 0.39 and 0.83 for the 2FB scenarios and between 0.35 and 0.69 for the I scenarios, with all subtests loading significantly onto their respective latent factors (p<0.001). Furthermore, all factor loadings exceeded the acceptable threshold of 0.32, and all R² values, which reflect the model’s explanatory power, were above 0.10. Based on the CFA results, convergent and discriminant validity, which assess the relationship of measures to one another rather than to an external standard, were tested, and both were supported for both latent factors. For criterion validity, a positive but weak correlation (r=0.32, p<0.05) was found between scores on the HACETTEPE-CBToM Battery and the DEZTÖ/DEZİKÖ scale. The inter-rater reliability coefficient was rs=0.94, and the internal consistency coefficient (Cronbach’s α) was 0.72. Taken together, these psychometric parameters indicate that the HACETTEPE-CBToM Battery is a psychometrically robust tool for assessing ToM. Compared with other international and national tests and scales measuring ToM, it is one of the few that determines subtest factor loadings and the relationship between latent factors and subtests through factor analysis. It is also an ambitious ToM battery developed for a healthy adult sample, featuring high ecological validity and unique, scenario-based tasks.
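Two of the reported fit indices can be checked directly from the χ² statistic, its degrees of freedom, and the sample size. The sketch below does this arithmetic, assuming the common RMSEA point estimate with N−1 in the denominator (some software uses N instead). The function names are hypothetical, and the study’s own software may compute these values differently.

```python
from math import sqrt


def chi_square_ratio(chi2, df):
    """Normed chi-square: chi-square divided by its degrees of freedom."""
    return chi2 / df


def rmsea(chi2, df, n):
    """RMSEA point estimate, assuming the N - 1 convention."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# Values reported for the CFA: chi-square = 26.14, df = 19, N = 214.
chi2, df, n = 26.14, 19, 214

print(round(chi_square_ratio(chi2, df), 2))  # 1.38
print(round(rmsea(chi2, df, n), 3))          # 0.042
```

Both results agree with the reported χ²/df=1.38 and RMSEA=0.042.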
In conclusion, the HACETTEPE-CBToM Battery has been developed as a psychometrically robust, computer-based tool featuring items of varying difficulty levels. It can be applied either modularly or as a whole and is characterized by high ecological validity, social interaction-based tasks, and cognitive and affective dimensions. It is recommended that future studies explore the usability of this battery in various clinical samples.
Acknowledgement:
We would like to thank psychologist Özlem Durmaz for her contribution to the data collection and the inter-rater reliability study in the pilot study.
Footnotes
Grant and Support Information: The service required for the animations was procured from a private software company. The fee was covered by the Faculty Member Training Program (ÖYP) budget, and the payment was made directly to the company by the university. Apart from this, no support was received for the study.
*This article was produced from the doctoral thesis of the first author.
REFERENCES
- 1. American Psychiatric Association (2013) Diagnostic and Statistical Manual of Mental Disorders. 5th Edition (DSM-5). Washington DC: American Psychiatric Association.
- 2. Aslankara M. Ankara: Hacettepe University; 2019. Investigation of Theory of Mind in the context of cognitive psychology: the study of discrimination from executive functions and computer based battery development (Unpublished doctoral dissertation).
- 3. Baksh RA, Abrahams S, Auyeung B, et al. The Edinburgh Social Cognition Test (ESCoT): examining the effects of age on a new measure of theory of mind and social norm understanding. PLoS One. 2018;13:1–16. doi: 10.1371/journal.pone.0195818.
- 4. Baron-Cohen S, O'Riordan M, Stone V, et al. Recognition of faux pas by normally developing children and children with Asperger syndrome or high-functioning autism. J Autism Dev Disord. 1999;29:407–18. doi: 10.1023/a:1023035012436.
- 5. Baron-Cohen S, Wheelwright S, Hill J, et al. The “Reading the Mind in the Eyes” test revised version: a study with normal adults, and adults with Asperger syndrome or high-functioning autism. J Child Psychol Psychiatry. 2001;42:241–51.
- 6. Bora E. Theory of mind in schizophrenia spectrum disorders. Turkish Journal of Psychiatry. 2009;20:269–81.
- 7. Bora E, Yücel M, Pantelis C. Theory of mind impairment in schizophrenia: meta-analysis. Schizophr Res. 2009;109:1–9. doi: 10.1016/j.schres.2008.12.020.
- 8. Bottiroli S, Cavallini E, Ceccato I, et al. Theory of Mind in aging: comparing cognitive and affective components in the faux pas test. Arch Gerontology Geriatrics. 2016;62:152–62. doi: 10.1016/j.archger.2015.09.009.
- 9. Brüne M, Brüne-Cohrs U. Theory of mind-evolution, ontogeny, brain mechanisms and psychopathology. Neurosci Biobehav Rev. 2006;30:437–55. doi: 10.1016/j.neubiorev.2005.08.001.
- 10. Corcoran R, Frith CD. Autobiographical memory and theory of mind: evidence of a relationship in schizophrenia. Psychol Med. 2003;33:897–905. doi: 10.1017/s0033291703007529.
- 11. Corcoran R, Mercer G, Frith CD. Schizophrenia symptomatology and social inference: investigating “theory of mind” in people with schizophrenia. Schizophr Res. 1995;17:5–13. doi: 10.1016/0920-9964(95)00024-g.
- 12. Çapık C. Use of confirmatory factor analysis in validity and reliability studies. J Anatolia Nurs Health Sci. 2014;17:196–205.
- 13. Çokluk Ö, Şekercioğlu G, Büyüköztürk Ş. Ankara: Pegem Akademi; 2014. Sosyal Bilimler için Çok Değişkenli İstatistik: SPSS ve LISREL Uygulamaları.
- 14. Değirmencioğlu B. Reliability and validity study of Dokuz Eylül Theory of Mind Index (Unpublished doctoral dissertation). Dokuz Eylül University, İzmir. 2008.
- 15. Değirmencioğlu B, Alptekin K, Akdede BB, et al. The validity and reliability study of the Dokuz Eylül Theory of Mind Index (DEZİKÖ) in patients with schizophrenia. Turkish Journal of Psychiatry. 2018;29:193–201.
- 16. DiStefano C, Hess B. Using confirmatory factor analysis for construct validation: an empirical review. J Psychoeduc Assess. 2005;23:225–41.
- 17. Dziobek I, Fleck S, Kalbe E, et al. Introducing MASC: a movie for the assessment of social cognition. J Autism Dev Disord. 2006;36:623–36. doi: 10.1007/s10803-006-0107-0.
- 18. Ekinci Demirelli A. Investigation of construct validity of the work-life balance scale in female physicians with confirmatory factor analysis. IJMEB. 2024;20:508–31.
- 19. Eroğlu E. Analysis of total quality management practices by structural equation modeling (Unpublished doctoral dissertation). Istanbul University, İstanbul. 2003.
- 20. Ertuğrul Yaşar Z. A review of theory of mind: different mental states and lifespan development. JODAP. 2022;3:75–92.
- 21. Falk RF, Miller NB. A Primer for Soft Modeling. Ohio: University of Akron; 1992. p. 80.
- 22. Field AP. 3rd Edition. London: Sage; 2009. Discovering Statistics Using SPSS.
- 23. Frith CD. UK: Psychology; 1992. The Cognitive Neuropsychology of Schizophrenia.
- 24. Gadermann AM, Guhn M, Zumbo BD. Estimating ordinal reliability for Likert-type and ordinal item response data: a conceptual, empirical, and practical guide. Pract Assess Res Evaluation. 2012;17:1–13.
- 25. German TP, Hehman JA. Representational and executive selection resources in 'theory of mind': evidence from compromised belief-desire reasoning in old age. Cognition. 2006;101:129–52. doi: 10.1016/j.cognition.2005.05.007.
- 26. Gregory C, Lough S, Stone V, et al. Theory of mind in patients with frontal variant frontotemporal dementia and Alzheimer's disease: theoretical and practical implications. Brain. 2002;125:752–64. doi: 10.1093/brain/awf079.
- 27. Hair JF, Black WC, Babin BJ, et al. UK: Pearson; 2013. Multivariate Data Analysis: Pearson New International Edition, 7th Edition; pp. 686–90.
- 28. Hair JF, Hult GTM, Ringle CM, et al. New York: SAGE Publications; 2021. A Primer on Partial Least Squares Structural Equation Modeling (PLS-SEM), 3rd Edition.
- 29. Happé FG. An advanced test of theory of mind: understanding of story characters' thoughts and feelings by able autistic, mentally handicapped, and normal children and adults. J Autism Dev Disord. 1994;24:129–54. doi: 10.1007/BF02172093.
- 30. Happé FG, Winner E, Brownell H. The getting of wisdom: theory of mind in old age. Dev Psychol. 1998;34:358–62. doi: 10.1037//0012-1649.34.2.358.
- 31. Harrington L, Siegert R, McClure J. Theory of mind in schizophrenia: a critical review. Cogn Neuropsychiatry. 2005;10:249–86. doi: 10.1080/13546800444000056.
- 32. Henry JD, Phillips LH, Ruffman T, et al. A meta-analytic review of age differences in theory of mind. Psychol Aging. 2013;28:826–39. doi: 10.1037/a0030677.
- 33. Henseler J, Ringle CM, Sarstedt M. A new criterion for assessing discriminant validity in variance-based structural equation modeling. Journal of the Academy of Marketing Sci. 2015;43:115–35.
- 34. Herold R, Tényi T, Lénárd K, et al. Theory of mind deficit in people with schizophrenia during remission. Psychol Med. 2002;32:1125–9. doi: 10.1017/s0033291702005433.
- 35. Karagöz Y, Ağbektaş A. Development of syndicate satisfaction scale by structural equation modeling: a sample of Sivas province. Bartın University Faculty Economics Administrative Sci. 2016;13:274–90.
- 36. Karakelle S, Ertuğrul Z. Do developmental relationships between theory of mind, language, working memory, and executive functions show differences across early (36-48 months) and late (53-72 months) age groups? Turkish Journal of Psychology. 2012;27:1–21.
- 37. Kaysılı B. Theory of Mind: a comparison of children with autism spectrum disorders and typically developing children. A. U. Faculty of Educational Sciences Journal of Special Education. 2013;14:83–103.
- 38. Kenny DA. Terminology and basics of SEM. 2011. Retrieved from http://davidakenny.net/cm/basics.htm.
- 39. Khine MS. Application of Structural Equation Modeling in Educational Research and Practice. NL: Sense; 2013.
- 40. Kline RB. Software review: software programs for structural equation modeling: Amos, EQS, and LISREL. J Psychoeduc Assess. 1998;16:343–64.
- 41. Kline RB. New York: The Guilford; 2011. Principles and Practice of Structural Equation Modeling.
- 42. Küçük Z. Theory of Mind and developmental processes. UUFASJSS. 2018;19:475–503.
- 43. Lorenzo-Seva U, Ferrando PJ. POLYMAT-C: a comprehensive SPSS program for computing the polychoric correlation matrix. Behav Res Methods. 2015;47:884–9. doi: 10.3758/s13428-014-0511-x.
- 44. Rowe AD, Bullock PR, Polkey CE, et al. 'Theory of mind' impairments and their relationship to executive functioning following frontal lobe excisions. Brain. 2001;124:600–16. doi: 10.1093/brain/124.3.600.
- 45. Saltzman J, Strauss E, Hunter M, et al. Theory of mind and executive functions in normal human aging and Parkinson's disease. J Int Neuropsychol Soc. 2000;6:781–8. doi: 10.1017/s1355617700677056.
- 46. Sarı OT. İstanbul: Marmara University; 2011. Theory of Mind Stories Test adaptation of Turkish children and comparison of theory of mind development of normal development preschool mentally retarded and children with autism (Unpublished doctoral dissertation).
- 47. Shamay-Tsoory SG, Aharon-Peretz J. Dissociable prefrontal networks for cognitive and affective theory of mind: a lesion study. Neuropsychologia. 2007;45:3054–67. doi: 10.1016/j.neuropsychologia.2007.05.021.
- 48. Shamay-Tsoory SG, Shur S, Barcai-Goodman L, et al. Dissociation of cognitive from affective components of theory of mind in schizophrenia. Psychiatry Res. 2007;149:11–23. doi: 10.1016/j.psychres.2005.10.018.
- 49. Shamay-Tsoory SG, Tibi-Elhanany Y, Aharon-Peretz J. The ventromedial prefrontal cortex is involved in understanding affective but not cognitive theory of mind stories. Soc Neurosci. 2006;1:149–66. doi: 10.1080/17470910600985589.
- 50. Shamay-Tsoory SG, Tomer R, Aharon-Peretz J. The neuroanatomical basis of understanding sarcasm and its relationship to social cognition. Neuropsychology. 2005;19:288–300. doi: 10.1037/0894-4105.19.3.288.
- 51. Sprong M, Schothorst P, Vos E, et al. Theory of mind in schizophrenia. Br J Psychiatry. 2007;191:5–13. doi: 10.1192/bjp.bp.107.035899.
- 52. Stone VE, Baron-Cohen S, Knight RT. Frontal lobe contributions to theory of mind. J Cogn Neurosci. 1998;10:640–56. doi: 10.1162/089892998562942.
- 53. Şahin B, Önal BS, Hoşoğlu E. Adaptation of Faux Pas Recognition Test Child Form to Turkish and investigation of psychometric properties. Anadolu Psikiyatri Derg. 2020;21:54–63.
- 54. Tabachnick BG, Fidell LS. 6th Edition. UK: Pearson; 2014. Using Multivariate Statistics: Pearson New International Edition; p. 699.
- 55. Tanrıverdi O. İstanbul: İstanbul Medipol University; 2022. Turkish adaptation reliability and validity of ESCoT (Edinburgh Social Cognition Test) (Unpublished master's thesis).
- 56. Törenli Kaya Z, Alpay EH, Türkkal Yenigüç Ş, et al. Validity and reliability of the Turkish version of the Mentalization Scale (MentS). Turk Psikiyatri Derg. 2023;34:118–24. doi: 10.5080/u25692.
- 57. Tülü B, Ergül C. The test of Theory of Mind for children aged 3-5 years: a validity and reliability study. Marmara Üniversitesi Atatürk Eğitim Fakültesi Eğitim Bilimleri Dergisi. 2022;55:31–61.
- 58. Wellman HM. (1990) The Child's Theory of Mind. Cambridge: The MIT Press.
- 59. Wiffen BD, O'Connor JA, Gayer-Anderson C, et al. “I am sane but he is mad”: insight and illness attributions to self and others in psychosis. Psychiatry Res. 2013;207:173–8. doi: 10.1016/j.psychres.2013.01.020.
- 60. Wimmer H, Perner J. Beliefs about beliefs: representation and constraining function of wrong beliefs in young children's understanding of deception. Cognition. 1983;13:103–28. doi: 10.1016/0010-0277(83)90004-5.
- 61. Yaşlıoğlu MM. Factor analysis and validity in social sciences: application of exploratory and confirmatory factor analyses. Istanbul Bus Res. 2017;46:74–85.
- 62. Yıldırım EA, Kaşar M, Güdük M, et al. Investigation of the reliability of the “Reading the Mind in the Eyes Test” in a Turkish population. Turk Psikiyatri Derg. 2011;22:177–86.
- 63. Zumbo BD, Gadermann AM, Zeisser C. Ordinal versions of coefficients alpha and theta for Likert rating scales. J Mod Appl Stat Methods. 2007;6:21–9.
