Table 3.
Measure | Strengths | Limitations |
---|---|---|
Sally Anne Task | • Can be used with children • Tests understanding of both first- and second-order belief • False belief tasks in general are established tests of ToM available in a variety of forms • Relatively pure measure of cognitive ToM | • Not originally designed for adults • Executive functions affect performance (93, 94, 139) • Format of presentation will also influence performance (139, 470) |
Strange Stories | • Validity, e.g., correlated with measures of relational perspective taking (156, 471) and the Faux Pas Task (157) • Associated with social competence in epilepsy (157) • Includes control-type “physical” stories • Insight offered by multiple scoring techniques including number of mental states attributed, appropriateness and quality (149) • Naturalistic style task (149) | • Performance is affected by reading comprehension (155), IQ (153, 163, 471–473), and executive function (145, 161, 163) • General inferential ability, social norms, and autobiographical memory may influence performance (474) • Typical children don’t reach ceiling (474) • Different studies use different length versions • Lack of vocal cues limits ecological validity (146) • Physical (control) stories are not well matched (152) • Age effects (145, 475) |
The Yoni Task | • Tests both cognitive and affective mental states, and first- and second-order belief • Visual task which could reduce working memory demand • Easy to present and works well with children (175) • Validity supported by correlations with, e.g., false belief tasks (67) • Affective trials can be related to a quality of life measure in Parkinson’s disease (172) • The authors also developed a related task to assess understanding of socially competitive emotions | • Executive functions (175, 176, 454) and IQ can affect performance (176, 454) • It is not clear whether these factors differentially influence the cognitive and affective aspects, i.e., whether the demands of all trials are comparable • Simply relying on eye gaze direction may help answer some trials, although there are some control trials with eye gaze straight ahead |
Animations Task | • Can be used to reveal both hypo- and hyper-mentalizing • Can assess spontaneous mental state reasoning, therefore has good ecological validity, and may be more challenging and sensitive than some other tasks • Non-facial as well as non-verbal stimuli • Multiple scores mean that complex patterns of performance and selective deficits can be identified • Can be related to social, school, and occupational functioning in schizophrenia (476) • Responses can be scored for length as a control | • Complex scoring and transcription required, with a need for multiple raters • The clips are short, so standardized instructions are needed on the number of viewings to permit • Experimenter must avoid providing cues as to the nature of the task • Verbal abilities from speech to vocabulary will influence response quality (e.g., 186) and visual attention may affect performance • Possible gender effect (186) • The video clips are not matched across conditions in terms of length or complexity |
Intentions Comic Strip Task | • Avoids verbal demands, which makes it accessible across cultures and enhances the purity of the measure • Useful for fMRI experiments (e.g., 191) • Contains useful control conditions • Factor analysis supports the validity of the three conditions (477) • Taps implicit reasoning • Fairly pure measure of cognitive ToM | • Possible ceiling effect in controls (478) • Used in few clinical groups overall • Studies have yet to explore the contribution of, e.g., executive functions to task performance |
Pictures of Facial Affect | • Can be used to reveal emotion-specific deficits • Suitable for use with children (479) • May be a sensitive measure in terms of tracking disorder state (e.g., 210) • Performance can indicate carer burden (480) • Inclusion of neutral trials can offer particular insight (481) • Validity supported by associations with other social cognitive tasks (223) | • Only assesses recognition of basic emotions and mainly negative emotions • Motor contribution unknown • Associated with global cognition or education (238, 482) and IQ (211, 216) • Interpretation is complex as performance could be impaired by self-awareness (483), problems with motor simulation, or memory • Possible gender (484) and age effects (479, 485, 486) • Time-limited format may lead to guessing (222) • Little ethnic variation in stimuli, which are grayscale and old-fashioned (479) • Ecological validity is limited by the use of static images • Possible effects of field of presentation (487) |
The Awareness of Social Inference Test | • No ceiling effect (488) • Linked to functional outcome/social skills in schizophrenia (238, 274) and in traumatic brain injury (489), as well as caregiver burden (231, 232) • Comprehensive and naturalistic, as it taps the ability to use a range of skills in combination, including facial expression and other non-verbal cues (490) • Good construct and convergent validity as related to other perspective taking measures (230) and IRI (242) • More challenging and less contrived than facial expressions • Extensive norms available for scoring • Dynamic, not static, so better predictive value (491) • Indexes frontal lobe volume loss in fronto-temporal dementia (234) • Good psychometrics (223) | • Age effect (228, 238, 492–494) • Performance is influenced by vocabulary (494, 495), IQ (249, 489), education (238), and executive functions (228–230, 245, 496) including processing speed and working memory (223) • Motor component is unclear (497) • Lengthy task for impaired patients, although a short version is now available (496) • Surprise items are poor (230) • Forced-choice response format limits ecological validity (242) • Impairments could simply reflect poor face emotion recognition as this is correlated (209, 249, 489) |
Movie for the Assessment of Social Cognition | • Can detect both hypo- and hyper-mentalizing • Tests both cognitive and affective mental state reasoning, with fine-grained assessment that can reveal selective deficits (69, 259) • Reliable in adolescents (260, 498) • Good psychometrics (250) including internal consistency and reliability (263, 273) • Ecologically valid (267) • Not related to verbal IQ (69) • Validity supported by correlations with other social cognitive tasks (150, 151, 260, 499), although such correlations are not always found (273) • Not affected by culture or social desirability (150, 151) | • Depression, IQ, and executive functions can affect performance (255, 265, 501) • Age effects (265, 270, 499) • Uses only second-person perspective and participant is observer (499), should add self-referent aspect (271) • Long time to administer and score: 45–70 min (150, 151) • Use of contextual cues could mask a deficit (468) • Stress can affect performance (502) • Need trained raters (69, 259) • Doesn’t tap implicit social cognition (250) • Further psychometric analysis would be helpful |
Hinting Task | • Takes less than 10 min to administer (278) • Strong test–retest reliability and good internal consistency (500) • Not associated with IQ (294, 503) • Validity supported by correlation with spoken prosody (504) and correlates with other social cognitive tasks, e.g., emotion recognition (505) • Related to social functioning in schizophrenia (274, 506) • Not associated with referential thinking in general (507, 508) | • Potential ceiling effect (274, 275, 300) • Only assesses cognitive ToM • Poor test–retest reliability and practice effect (274) • Highly dependent on verbal comprehension (293) and associated with IQ (509) • Executive function may affect performance (504, 510–514), especially processing speed and memory (297) • Age effect (301) |
Reading the Mind in the Eyes Test | • Validity supported by strong association with other social cognitive measures, e.g., Hinting task (506), IRI-PT (515) but perhaps only a weak correlation with autism spectrum quotient (516) • No ceiling in controls, can examine positive, negative, and neutral trials separately (e.g., 382–384) and use RT to offer insight (382–384, 517) • Scores remain stable over time (518) • Short administration time (typically 10–15 min) • Can be used across cultures (349) and many translations exist • Not just basic emotion recognition (519) • Associated with social factors such as maternal functioning (520), social isolation (506), and clinical change in psychosis (521) • Test–retest reliability is fairly good for the child version of RMET and one study demonstrated no learning effects (522) | • Gender effects are debated (361, 515, 518, 523–525) • Performance is associated with visuospatial skills (512), reading (526), autobiographical memory (527), IQ (528–532), and executive function (533; my papers; 298, 534) • Debate as to whether stress affects performance (502, 535) • Age effects (160, 523, 536) • Cronbach’s alpha can be low (312, 537) • The stimuli were restricted to only Caucasians in the original task, and there is a gender confound as the males are older, less attractive, and more negative (538) • Ecological validity is also weakened by static images, specificity of cues and forced-choice response format • Better control tasks are needed (539) • Debate over whether the task measures cognitive or affective ToM, or empathy, or emotion recognition (261) • Some items have floor or ceiling effects |
Faux Pas Task | • Used to test cognitive and affective ToM, with multiple layers of difficulty, and fine-grained analysis possible • Control stories are included and can indicate hyper-mentalizing as well as hypo-mentalizing • Mimics real life • Associated with other social cognitive tasks and quality of life in epilepsy (373) • Can be adapted to other cultures (137) • Associated with prosody deficit/indirect speech understanding (540, 541) and RMET performance in some studies (542) but not others (543, 544) • Associated with carer behavior ratings (545) and mixed findings for social functioning in schizophrenia (366, 546) | • A verbal task that makes cognitive demands beyond mental state reasoning (474) • Accuracy may reflect use of social norms and scripts, not just online reasoning about mental states, making this a “top-down” task (547) • Associated with education (548) and IQ (549), and executive function can affect performance (339, 378, 382–385, 546, 550, 551) • Scoring differences across studies (160) and some responses are difficult to score • The cognitive and affective questions may not be of comparable difficulty • Controls don’t always perform at ceiling • Antipsychotic medications may affect performance (552) • Little psychometric data |
Interpersonal Reactivity Index | • A multidimensional measure that can be used to assess cognitive and affective empathy • Fast to administer: 15 min (447) • High convergent and discriminant validity (553) • Often associated with other social cognitive tasks (e.g., 341) • Psychophysiological data support the difference between cognitive and affective aspects (430) • Stable over time in schizophrenia (554) • Predicts functional capacity/psychosocial functioning in schizophrenia (555, 556) and psychosocial function in bipolar disorder (557) as well as being associated with carer burden (231, 232, 461) • Proxy version available and scores can be correlated, e.g., between parents and their adolescent children (558) | • Not associated with other empathy measures (559) • Self-report means potential for bias and difficulties due to insight or anosognosia (541) • Social desirability can be a problem, e.g., in forensic populations (560), so more objective measures are needed (561) • Cognitive and affective subscales and combinations have questionable validity (562) and the factor structure can be challenged (563): the scale may be less valid for affective empathy (564) • The PD subscale has the weakest internal consistency (565), plus this subscale is self-oriented and neither it nor the F subscale measures true empathy (566) • Gender effect (567–569) • Scores can be associated with executive function (450) • Age effect (570) |
Limitations are raised by the author where no reference is given. Factors such as ceiling effects and the specificity of the measure could be considered both strengths and limitations. A ceiling effect in controls could mean a task can highlight a profound deficit in patients, but no ceiling effect may mean greater sensitivity, whereas task specificity can help to reveal a precise deficit to target with intervention, although a more global perspective on social cognitive performance may also be needed.