Author manuscript; available in PMC: 2024 Feb 1.
Published in final edited form as: Res Dev Disabil. 2023 Jan 3;133:104416. doi: 10.1016/j.ridd.2022.104416

Cognitive flexibility assessment in youth with Down syndrome: Reliability, practice effects, and validity

Emily K Schworer 1,2, Amanallah Soltani 3, Mekibib Altaye 4,5, Deborah J Fidler 6, Anna J Esbensen 1,4
PMCID: PMC9852016  NIHMSID: NIHMS1862107  PMID: 36603310

Abstract

Background:

Cognitive flexibility refers to the ability to switch between different mental sets, tasks, or strategies and is challenging for some individuals with Down syndrome (DS). The lack of reliable and valid cognitive flexibility measures for individuals with DS is a major barrier to clinical trials and intervention studies designed to address cognitive challenges specific to DS. To avoid measurement limitations that could confound interpretations of performance in clinical trials in children with DS, it is critical to use phenotype-sensitive and psychometrically sound measures of cognitive flexibility.

Aim:

This study aims to evaluate the psychometric properties of three cognitive flexibility measures, the Rule-Shift, Weigl Sorting, and KiTAP Flexibility tasks, in a sample of 97 youth with DS aged 6 to 17 years.

Method:

Data were collected at two time points with a two-week interval. Parents also completed adaptive behavior and cognitive flexibility questionnaires. Child cognitive and language abilities were also assessed.

Results:

The Weigl Sorting met the most psychometric criteria, with adequate feasibility (≥ 80%) and significant correlations with most of the broader developmental domains; however, its test-retest reliability, practice effects, and convergent validity did not meet a priori criteria. The Rule-Shift and KiTAP Flexibility measures did not have acceptable feasibility, although sensitivity and specificity analyses revealed that Rule-Shift may be appropriate for a subgroup of participants.

Conclusion:

None of the evaluated measures met all study psychometric criteria; therefore, additional evaluation of cognitive flexibility measures for use among individuals with DS is needed.

Keywords: Down syndrome, cognitive flexibility, clinical trials, measurement

1. Introduction

Down syndrome (DS) or trisomy 21 is the most common neurogenetic syndrome associated with intellectual disability (Llewellyn et al., 2019; Mai et al., 2019). Besides global developmental delays, individuals with DS are more likely to experience difficulties with executive functioning (EF) (Daunhauer et al., 2014; Daunhauer et al., 2017; Daunhauer et al., 2020; de Weger et al., 2021; Iralde et al., 2020; Lee et al., 2011; Loveall et al., 2017; Tungate & Conners, 2021). EF is a term for the cognitive processes used to complete goal-directed actions or behaviors, which include cognitive flexibility, inhibitory control, and working memory (Miyake et al., 2000). EF difficulties in persons with DS have substantial effects on daily functioning throughout the lifespan in a variety of contexts, from social and adaptive skills (Amadó et al., 2016; Daunhauer et al., 2017; Esbensen et al., 2021; Tomaszewski et al., 2018; Will et al., 2021) to school achievement (Sabat et al., 2020; Will et al., 2017).

1.1. Executive function measurement evaluation for clinical trials in Down syndrome

Because individuals with DS are predisposed to greater risk for EF challenges, EF has been a target for recent pharmaceutical, educational, and behavioral intervention studies that include adolescents and adults with DS (Goeldner et al., 2022; McGlinchey et al., 2019; Ringenbach et al., 2021; Ringenbach et al., 2016). However, the lack of reliable and valid EF measures for individuals with DS is a major barrier to clinical trials and intervention studies. To detect EF changes and avoid measurement limitations that could confound interpretations of performance in clinical trials, it is critical to use phenotype-sensitive and psychometrically sound measures of EF (Esbensen et al., 2017). Measures used to evaluate EF in DS must be feasible and must demonstrate meaningful variation in scores, strong test-retest reliability, no or minimal practice effects, and construct validity (Esbensen et al., 2017; withheld for review). Because individuals with DS are not commonly included in standard norming procedures for assessments, it is necessary to evaluate measures in individuals with DS (Esbensen et al., 2017). Specifically, performance-based assessments commonly used to measure component processes of EF have yet to undergo rigorous psychometric evaluation (Esbensen et al., 2017; Fidler et al., 2020).

To provide clear empirical guidance on measures that are good candidates for use in clinical trials, several studies have evaluated the psychometric properties of neurocognitive standardized assessments including EF measures for use in the DS population (Basten et al., 2018; d’Ardhuy et al., 2015; de Sola et al., 2015; Edgin et al., 2017; Edgin et al., 2010; Hutchinson & Oakes, 2011; Sinai et al., 2016; Startin et al., 2016; Startin et al., 2019). The Arizona Cognitive Test Battery (ACTB) for DS initially developed by Edgin and colleagues (2010) includes direct measures of working memory, cognitive flexibility, and inhibition. Although the general psychometric properties of measures included in the battery were adequate, significant floor effects were reported for measures of working memory and cognitive flexibility (Edgin et al., 2010).

The feasibility, distributional qualities, test-retest reliability, and convergent validity of several standardized measures designed to assess specific EF processes, such as working memory, have been recently evaluated in 6 to 17 year old children with DS (Schworer et al., 2022). Several working memory measures were psychometrically sound and appropriate for use in clinical trials, whereas others had problematic floor effects or poor test-retest reliability (for details, see Schworer et al., 2022). Additional psychometric evaluations completed on EF measures in DS have focused on adults (de Sola et al., 2015; Sinai et al., 2016; Startin et al., 2016; Startin et al., 2019). Although these studies focusing on adults have made progress in identifying measures appropriate for tracking cognitive decline in DS, it is also necessary to evaluate the psychometric properties of EF measures, including measures of cognitive flexibility, in children and adolescents with DS.

1.2. Cognitive flexibility outcome measures

Cognitive flexibility (also referred to as cognitive set-shifting or shifting) refers to the ability to switch between different mental sets, tasks, or strategies (Diamond, 2013). It allows appropriate adjustment of thoughts and behaviors in response to changing environmental demands (García et al., 2017). Cognitive flexibility is therefore essential for problem-solving, releasing from behavioral patterns, shifting attention, and making transitions (Buttelmann & Karbach, 2017; Dajani & Uddin, 2015; Deák & Wiseheart, 2015). Cognitive flexibility is typically investigated using both behavioral rating scales and laboratory tasks. The Behavior Rating Inventory of Executive Function (BRIEF) is a commonly used scale that includes a shifting domain (Gioia, 2000; Gioia et al., 2015). In children and adolescents with DS, the BRIEF Shift scale had acceptable internal consistency in a previous psychometric evaluation of the parent and teacher versions of the measure (Esbensen et al., 2019).

In the laboratory, cognitive flexibility is typically investigated using both task-switching and sorting paradigms (Buttelmann & Karbach, 2017; Dajani & Uddin, 2015; Deák & Wiseheart, 2015). In the task-switching paradigm (also referred to as Rule-Shift), individuals need to adapt behavior based on changing task rules. Participants are required to respond to stimuli according to one rule and then another presented successively (Wilson et al., 2004). During these tasks, certain stimulus-response mappings are generated and, after a switch, participants shift to another stimulus-response mapping (Buttelmann & Karbach, 2017; Dajani & Uddin, 2015; Deák & Wiseheart, 2015). An example of the task-switching paradigm is the Rule-Shift Cards Test (Wilson et al., 1998), which requires participants to verbally respond to stimuli (red or black cards) according to one rule (respond “yes” to the red cards and “no” to the black cards), and then respond to the stimuli according to another rule (respond “yes” when the presented card is the same as the previous one and “no” when it is different). The task has been used in previous studies, specifically those conducted among adolescents with DS (Lanfranchi et al., 2010). Additional task-switching cognitive flexibility measures have also been generated. One example is a computerized task, the Test of Attentional Performance for children (KiTAP) Flexibility subtest (Psychologische Testsysteme: KiTAP Test of Attentional Performance for Children, 2011).

The sorting paradigm has also been used to measure cognitive flexibility in both children and adults. The most well-known measure using this paradigm is the Wisconsin Card Sorting Test (WCST; Grant & Berg, 1948). In this test, the participant sorts the cards first according to one dimension (color, shape, or number) for a certain number of trials and then switches to sort by a different dimension by inferring the new rule from feedback provided by the examiner. The Modified Card Sorting Test (Nelson, 1976), a simplified version of the WCST in which the participant is explicitly instructed to change the sorting criterion, has been previously used to assess cognitive flexibility in adolescents with DS (Lanfranchi et al., 2010). The Dimensional Change Card Sort (Zelazo, 2006) and Weigl Sorting task (Grant & Berg, 1948) are two other sorting measures adapted for children. Previous studies have used these sorting tasks among children and adults with DS (Ball et al., 2008; Lanfranchi et al., 2010; Will et al., 2017).

Although both task-switching and sorting paradigms have undergone measurement evaluation among typically developing children and adults (e.g., Basso et al., 2001; Hommel et al., 2022; Willoughby et al., 2010, 2012), additional efforts are required to assess the measures for use among individuals with DS. This syndrome-specific evaluation of cognitive flexibility measures is warranted given that verbal mental age has been connected to cognitive flexibility performance in DS (Campbell et al., 2013), and therefore the verbal demands of measures must be carefully considered in measurement evaluations. The current study aimed to evaluate the psychometric properties of three cognitive flexibility tasks, the Rule-Shift task (Wilson et al., 1998), Weigl Sorting task (Grant & Berg, 1948), and KiTAP Flexibility subtest (Psychologische Testsysteme: KiTAP Test of Attentional Performance for Children, 2011; Zimmermann & Fimm, 2002), among individuals with DS. Measures were selected from recommendations of the NIH Outcome Measures in DS working groups and from previous results on potentially promising measures in other neurogenetic conditions, such as fragile X syndrome (Knox et al., 2012). Specifically, we aimed to determine the feasibility (the number of participants who can obtain scores on the measures), floor effects, test-retest reliability, practice effects, convergent validity, and associations with broader developmental domains (chronological age, cognition, adaptive behavior, and vocabulary) of the three cognitive flexibility tasks in a sample of 6–17-year-old youth with DS. A secondary goal of the study was to identify what ages and IQ levels were appropriate for the administration of the subtests with low feasibility using post hoc sensitivity and specificity probabilities.

2. Method

2.1. Procedures

The Streamlined, Multisite, Accelerated Resources for Trials (SMART) IRB platform approved all study procedures and informed consent was obtained for each participant at the two participating study sites. Assent was also obtained from participants with DS older than 11 years of age. Study eligibility criteria included a diagnosis of DS (reported from caregiver or medical chart review), English as the primary language, nonverbal mental age of approximately three years or above, and caregiver report that their child would be able to comply with visit procedures. After enrolling in the study, participants and their caregivers completed two visits, two weeks apart. Assessment visits lasted approximately 1.5 – 2.5 hours and the examiner provided breaks to limit participant fatigue. During the visits, participants’ IQ, vocabulary, and cognitive flexibility were assessed. Cognitive flexibility measures were attempted at both visits regardless of performance at time 1. Information on the child’s adaptive behaviors and demographics was reported by the caregivers. The order of administration of the neuropsychological battery was randomized to avoid systematic differences in performance due to child fatigue or attention throughout the study visit.

2.2. Participants

The sample included 97 children and adolescents with DS between 6 and 17 years old (M = 12.6, SD = 3.3). Sex was approximately equally distributed, with 50 (51.5%) males and 47 (48.5%) females. Eighty-six percent of the participants were White, 5% were Black, 5% were Asian, and 4% were of other or mixed races. Most participants were not Hispanic or Latino (95%), and 5% were Hispanic or Latino. Four participants with DS had mosaicism and two had translocation. The average SB-5 ABIQ was 49.07 (SD = 5.36; SB-5 deviation ABIQ M = 38.55, SD = 16.94), and the average adaptive behavior composite score was 68.99 (SD = 11.09).

2.3. Measures

2.3.1. Cognitive abilities and adaptive behaviors

Stanford-Binet Intelligence Scales, Fifth Edition Abbreviated Battery IQ (SB-5 ABIQ).

The SB-5 is designed and normed to measure cognitive abilities in individuals 2 to 85+ years old (Roid, 2003). The SB-5 Abbreviated Battery IQ (SB-5 ABIQ) is based on the two routing subtests of the SB-5, one nonverbal (Object Series/Matrices) measuring fluid reasoning and one verbal (Vocabulary) measuring crystallized knowledge (Roid, 2003). To reduce issues with the significant floor effects observed when using the SB-5 with individuals with intellectual disabilities, SB-5 deviation scores were used. Deviation scores are based on the general population norms and provide a greater score distribution using raw z score transformation (Sansone et al., 2014). In the current study, SB-5 ABIQ deviation scores were used to estimate participants’ IQ.

Vineland Adaptive Behavior Scale, Third Edition (VABS-3).

The comprehensive VABS-3 (Sparrow et al., 2016) is an informant-based rating scale measuring the adaptive behaviors of persons from birth to 90+ years old. The VABS-3 measures social interaction and communication skills, personal living skills, community living skills, and motor skills, and also yields an overall adaptive behavior composite. The VABS-3 is a leading instrument widely used to assess the adaptive behavior of individuals with intellectual and developmental disabilities (Braconnier & Siper, 2021; Hamburg et al., 2019). In addition, the VABS-3 has been identified as a viable option to assess adaptive functioning domains in individuals with DS (Esbensen et al., 2017). In the current study, caregivers of the participants completed the VABS-3, and the composite score was used to assess the association between adaptive behaviors and cognitive flexibility measures.

2.3.2. Receptive and Expressive Vocabulary

Peabody Picture Vocabulary Test, Fourth and Fifth Edition (PPVT-4 and PPVT-5).

The PPVT-4 and PPVT-5 (Dunn, 2019; Dunn & Dunn, 2007) are designed to measure the receptive vocabulary of English-speaking children and adults. In this test, four colored pictures are shown to the participant, who is asked to choose the one that best matches a word presented orally. Higher scores indicate better performance. Overall reliability of the normative sample and test-retest reliability for all ages are excellent (Dunn, 2019; Dunn & Dunn, 2007). The PPVT has been recommended for use in studies involving individuals with DS and is frequently administered with this population (Esbensen et al., 2017; Hartley et al., 2017; Kristensen et al., 2022; Lao et al., 2017; Schworer, Hoffman, et al., 2021). In the current study, the first 16 participants were assessed with the PPVT-4, and once published, the PPVT-5 was used for all other participants. Correlations between the PPVT-4 and PPVT-5 in the norming sample were high (corrected r = .84; Dunn, 2019).

Expressive Vocabulary Test, Second and Third Edition (EVT-2 and EVT-3).

The EVT-2 and EVT-3 (Williams, 2007, 2019) are designed to measure the expressive vocabulary of English-speaking children and adults. Pictures are shown to the participant, who is asked to label them or provide a one-word synonym. Higher scores reflect better performance, and recent studies have used the EVT-3 as a valid and reliable tool to assess expressive vocabulary in children and adults with DS (Kristensen et al., 2022; Schworer, Hoffman, et al., 2021). In the present study, the first 16 participants were assessed with the EVT-2, and once published, the EVT-3 was used for all other participants. Correlations between the EVT-2 and EVT-3 in the norming sample were high (corrected r = .88; Williams, 2019).

2.3.3. Cognitive Flexibility

Rule-Shift Task.

This task is a subtest of a battery originally designed to assess patients with frontal lobe damage (Wilson et al., 1998). It requires the ability to change responses according to two different rules. In the first part of the test, participants respond to stimuli (colored cards) according to a specific rule (naming the color of the card). In the second part of the test, the rule changes, and participants are asked to respond to stimuli according to a new rule (say “yes” when seeing a card that has the same color as the previous one, and “no” when the presented card has a different color) (Lanfranchi et al., 2010). The second part of the task comprises 30 items and is used to compute the total incorrect score. Administration time is approximately 5 minutes. In the current study, the Rule-Shift total incorrect scores (range 0–30) were used to examine the psychometric properties of the test in the sample of youth with DS, with higher scores reflecting more incorrect responses and greater challenges with cognitive flexibility.

Weigl Sorting Task.

This test was originally designed by Weigl (1941) for use among patients with cerebral lesions. It is a brief test (administration time approximately 3–5 minutes) assessing the ability to categorize stimuli by one perceptual feature and then to shift to sorting the stimuli by another perceptual feature (Laiacona et al., 2000; Mole et al., 2021). The participant is given 9 cards (yellow, red, and blue simple shapes) and told to sort the pieces into separate groups so that like pieces go together. Scoring is based on sorting behavior on both the first and second trials and ranges from 0 to 5. Higher scores reflect better performance (fewer challenges with cognitive flexibility), and the raw scores were analyzed to evaluate the psychometric properties of the test.

Test of Attentional Performance for Children (KiTAP) Flexibility.

The KiTAP (Psychologische Testsysteme: KiTAP Test of Attentional Performance for Children, 2011; Zimmermann & Fimm, 2002) is a computer-based attention and EF battery adapted for use in children from the Test of Attentional Performance. It is composed of eight subtests varying in length and difficulty level from simple to more difficult, and has been evaluated in other groups of individuals with intellectual disabilities (Knox et al., 2012). In the Flexibility subtest of the KiTAP, the participant is required to alternate between identifying colored stimuli that appear on random sides of the screen by tapping one of two response buttons (separate from the computer keyboard and approximately 2 × 2 inches in size). Two dragons of different colors (green and blue) simultaneously appear on the left and right of the screen. At the first presentation, the participant is asked to press the response button on the side on which the green dragon appears. At the next presentation, the participant is asked to press the response button on the side on which the blue dragon appears. The shifting between green and blue dragons continues throughout the task, while the side of presentation for the green and blue dragons varies. Administration time is approximately 5 minutes and there are 50 items in the task. A higher number of incorrect responses and slower reaction times indicate more challenges with cognitive flexibility. In this study, the psychometric properties of KiTAP Flexibility total incorrect (errors) and KiTAP Flexibility median reaction time were evaluated.

Behavior Rating Inventory of Executive Function, Second Edition (BRIEF2), Parent and Teacher Forms, Shift subscale.

The BRIEF2 (Gioia et al., 2015) is a rating scale commonly used to measure EF in children and youth in their home and school environments. The Shift scale of the BRIEF2 includes 8 items (e.g., “Has trouble moving from one activity to another”). Parents and teachers rate their child’s abilities on each item using a three-point Likert scale (never, sometimes, often). Both Parent and Teacher form T-scores (M = 50, SD = 10) were calculated and higher scores indicate more cognitive flexibility challenges. The BRIEF Shift scale had acceptable internal consistency for use among children and adolescents with DS (Esbensen et al., 2019).

2.4. Analysis plan

First, the feasibility and performance of the participants on the three cognitive flexibility tasks (Rule-Shift, Weigl Sorting, and KiTAP Flexibility) were evaluated. Feasibility is defined as the percentage of participants who completed/generated correct or incorrect responses at time 1 and time 2. The feasibility threshold of 80% was selected as it has been used by previous studies evaluating outcome measures in youth with intellectual disabilities and DS (Hessl et al., 2016; Schworer, Esbensen, et al., 2021; Schworer et al., 2022). In addition, minimum, maximum, mean, median, skewness, kurtosis, and distributional issues (floor effects) for the scores at the first study visit were examined. Participants who attained the test’s lowest score and participants who were unable to complete the test were included in the investigation of floor effects. The threshold for an acceptable floor effect was <20%, corresponding with previous studies conducted among individuals with DS (Schworer, Esbensen, et al., 2021; Schworer et al., 2022).
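As an illustrative sketch of these descriptive computations (the function name, arguments, and data below are hypothetical, not the study's actual scoring code), feasibility and floor-effect percentages under the definition above could be computed as:

```python
def feasibility_and_floor(completed_scores, n_enrolled, floor_score):
    """Feasibility = proportion of enrolled participants with a usable score.
    The floor count includes both non-completers and completers who attained
    the test's lowest (worst) score, matching the study's definition."""
    n_completed = len(completed_scores)
    feasibility = n_completed / n_enrolled
    n_floor = (n_enrolled - n_completed) + sum(
        s == floor_score for s in completed_scores)
    return feasibility, n_floor / n_enrolled

# Hypothetical example: 4 of 5 enrolled completed; one completer at floor (0)
feas, floor = feasibility_and_floor([0, 2, 3, 5], 5, 0)  # 0.8, 0.4
```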

Next, the 2-week test-retest reliability, practice effects, convergent validity, and associations with broader developmental domains (chronological age, cognition, adaptive behaviors, and vocabulary) of the cognitive flexibility measures were calculated. Intraclass correlation coefficients (ICCs) were used to examine test-retest reliability between time 1 and time 2. ICCs in the good (.75–.90) or excellent (> .90) range (Koo & Li, 2016) were considered acceptable for the current study. Paired samples t-tests compared the mean scores between test administrations (time 1 and time 2) and were used to identify practice effects. Practice effects were deemed problematic with a significant p-value and Cohen’s d > 0.20 (Cohen, 1988). Convergent validity was evaluated using Spearman correlations among the cognitive flexibility measures and parent/teacher behavioral ratings of cognitive flexibility. Correlations in the good (.50–.70) or excellent (> .70) range were selected as criteria for acceptable convergent validity. To examine the association between cognitive flexibility measures and broader developmental domains, Spearman correlations were calculated between the cognitive flexibility tasks and chronological age, cognition, adaptive behavior, and vocabulary.
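The reliability and practice-effect computations can be sketched as follows. The text does not state which ICC variant was used, so ICC(3,1) (two-way mixed effects, consistency, single measurement) is shown here as one common choice; function names and toy data are hypothetical:

```python
import numpy as np
from scipy import stats

def icc_3_1(scores):
    """ICC(3,1) from a two-way ANOVA decomposition.
    scores: (n_subjects, k_timepoints) array-like."""
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    grand = scores.mean()
    ss_rows = k * ((scores.mean(axis=1) - grand) ** 2).sum()  # subjects
    ss_cols = n * ((scores.mean(axis=0) - grand) ** 2).sum()  # timepoints
    ss_err = ((scores - grand) ** 2).sum() - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)

def practice_effect(time1, time2):
    """Paired-samples t-test plus Cohen's d for paired data
    (mean of differences divided by SD of differences)."""
    t_stat, p = stats.ttest_rel(time1, time2)
    diff = np.asarray(time1, dtype=float) - np.asarray(time2, dtype=float)
    d = diff.mean() / diff.std(ddof=1)
    return t_stat, p, d
```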

Finally, for measures that did not meet the feasibility criteria, sensitivity and specificity probabilities were analyzed to determine the chronological ages and IQs of participants likely to be able to complete each low-feasibility measure. The goal of these analyses was to determine if restricting age or IQ would better predict who was able and unable to complete the tasks. Sensitivity probabilities indicate the likelihood of correctly identifying participants (within a specified chronological age range and/or IQ range) who would be able to complete the measure, and specificity probabilities refer to the likelihood of correctly identifying participants (outside of a specified chronological age range and/or IQ range) who would be unable to complete the test. Chronological ages of above 8 years old and above 10 years old were evaluated to align with previous inclusion criteria for clinical trials in DS (Kishnani et al., 2010), and each chronological age benchmark was compared with an additional ABIQ deviation score cut-off of no restriction, ≥ 30, ≥ 40, or ≥ 50 (Schworer, Esbensen, et al., 2021; Schworer et al., 2022).
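The logic of these post hoc probabilities can be sketched by treating "meets the age/IQ cut-offs" as the screening prediction and observed task completion as the outcome (a hypothetical helper with toy data, not the study's code):

```python
def screen_metrics(ages, iqs, completed, age_cut, iq_cut=None):
    """Sensitivity: proportion of completers who meet the cut-offs.
    Specificity: proportion of non-completers who fall outside them."""
    meets = [a > age_cut and (iq_cut is None or q >= iq_cut)
             for a, q in zip(ages, iqs)]
    tp = sum(m and c for m, c in zip(meets, completed))          # kept, completed
    fn = sum((not m) and c for m, c in zip(meets, completed))    # excluded, completed
    tn = sum((not m) and (not c) for m, c in zip(meets, completed))
    fp = sum(m and (not c) for m, c in zip(meets, completed))
    return tp / (tp + fn), tn / (tn + fp)

# Toy data: two completers (ages 12, 9) and two non-completers (ages 7, 11)
sens, spec = screen_metrics([12, 9, 7, 11], [45, 35, 25, 30],
                            [True, True, False, False], age_cut=8)
```

Raising `age_cut` or adding an `iq_cut` excludes more participants, which trades sensitivity for specificity, the pattern reported in Table 4.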

3. Results

3.1. Feasibility and score distributions

Of the three cognitive flexibility measures, Weigl Sorting met the feasibility criterion (≥ 80%), Rule-Shift had unacceptable feasibility (59.8%), and KiTAP Flexibility had very poor feasibility (3.1%; Table 1). Non-completion reasons for Weigl Sorting were as follows: not understanding the task (n = 2), behavioral noncompliance (n = 1), and verbal refusal (n = 1). Further, 11 participants completed the Weigl Sorting at only one time point (time 1 or time 2). Reasons for non-completion of the Rule-Shift task were as follows: not understanding the task (n = 25), behavioral noncompliance (n = 3), limited verbal abilities (n = 5), and acquiescence (n = 2). Additionally, a small number of participants completed Rule-Shift at only one time point (n = 4). Finally, non-completion reasons for KiTAP Flexibility included: not understanding the task (n = 57), behavioral noncompliance (n = 6), verbal refusal (n = 6), fatigue (n = 16), and acquiescence (n = 1), and three participants completed KiTAP Flexibility at only one time point.

Table 1.

Cognitive flexibility performance and feasibility at time 1

| Measure | Min | Max | Mean (SD) | Median | Skew | Kurtosis | Feasibility n (%) | n at floor (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Rule-Shift Total Incorrect | 0 | 21 | 5.43 (5.47) | 4 | 1.06 | 0.45 | 58 (59.8%) | 39/97 (40.2%) |
| Weigl Sorting | 0 | 5 | 2.16 (1.55) | 1 | 0.52 | −.099 | 82 (84.5%) | 21/97 (21.6%) |
| KiTAP Flexibility Total Incorrect | 7 | 14 | 10.67 (2.73) | 10.5 | -a | -a | 3 (3.1%)b | 89/92b (96.7%) |
| KiTAP Flexibility Median Reaction Time (ms) | 496 | 2509 | 1148.60 (816.54) | 1048 | -a | -a | 3 (3.1%)b | 89/92b (96.7%) |

a KiTAP Flexibility was not interpreted for skewness or kurtosis because of the low feasibility of the measure.

b Technology error removed from the denominator of feasibility and floor calculations.

KiTAP = Test of Attentional Performance for Children.

Minimum, maximum, mean, median, skewness, kurtosis, levels of feasibility, and floor effects for the cognitive flexibility measures at the first time point are reported in Table 1. The Weigl Sorting had a normal distribution (skewness and kurtosis were between −1 and +1). Rule-Shift total incorrect demonstrated acceptable distributional characteristics (skewness and kurtosis were between −2 and +2). KiTAP Flexibility was not interpreted for skewness or kurtosis because of low measurement feasibility. All cognitive flexibility measures had unacceptable levels of participants scoring at the floor (> 20%).

3.2. Test-retest reliability and practice effects

Test-retest reliability and practice effects of the Rule-Shift task and Weigl Sorting are reported in Table 2. KiTAP Flexibility was not included given the poor feasibility of the measure. The observed ICCs for Rule-Shift total incorrect and Weigl Sorting were 0.69 and 0.71, respectively, indicating moderate but unacceptable test-retest reliability according to the study criterion (≥ 0.75). Practice effects were observed for Weigl Sorting (p < 0.01, Cohen’s d > 0.20). For Rule-Shift total incorrect, a near-significant mean difference between time 1 and time 2 (p = 0.06) and a Cohen’s d just over 0.20 suggested potential practice effects, but the a priori threshold was not met.

Table 2.

Evaluation of practice effects and test–retest reliability

| Measure | Time 1, Mean (SD) | Time 2, Mean (SD) | t a | p | Cohen’s d | ICC |
| --- | --- | --- | --- | --- | --- | --- |
| Rule-Shift Total Incorrect (n = 58) | 5.40 (5.54) | 4.38 (4.86) | 1.93 | 0.06 | 0.25 | 0.69 |
| Weigl Sorting (n = 82) | 2.22 (1.56) | 2.66 (1.74) | −3.32 | <0.01 | −0.37 | 0.71 |

Note: Means and standard deviations reported for participants who completed measures at both time 1 and time 2.

a Differences based on paired-samples t-tests.

3.3. Convergent validity and correlations with broader developmental domains

Moderate, but unacceptable, convergent validity was observed between Rule-Shift total incorrect and Weigl Sorting (Table 3). Unacceptable convergent validity (r < .50) was also found between both cognitive flexibility measures and parent and teacher ratings of cognitive flexibility (BRIEF2). Better performance on Weigl Sorting was significantly correlated with higher cognitive ability (SB-5 ABIQ deviation), adaptive behavior (VABS-3), and receptive vocabulary (PPVT); however, Weigl Sorting was not correlated with age or expressive vocabulary (EVT). None of the developmental domains were significantly correlated with the Rule-Shift task.

Table 3.

Convergent validity and correlations with broader developmental domains

| Measure | Weigl Sorting | BRIEF2 Parent Shift | BRIEF2 Teacher Shift | Age | SB-5 ABIQ Deviation | VABS-3 ABC | EVT | PPVT |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Rule-Shift Total Incorrect | −.40** | −.03 | .46* | −.22 | −.13 | −.24 | −.02 | −.003 |
| Weigl Sorting | | −.13 | −.40** | .19 | .42** | .39** | .19 | .34** |

* p < 0.05; ** p < 0.01.

BRIEF2: Behavioral Rating Inventory of Executive Function, Second Edition; SB-5 ABIQ: Stanford-Binet Intelligence Scales, Fifth Edition Abbreviated Battery IQ; VABS-3 ABC: Vineland Adaptive Behavior Scale, Third Edition, Adaptive Behavior Composite; EVT: Expressive Vocabulary Test; PPVT: Peabody Picture Vocabulary Test

3.4. Low feasibility measure

Despite the low overall feasibility of the Rule-Shift task, more than half of the participants (59.8%) were able to complete it, indicating the suitability of the measure for a subset of participants. Table 4 shows Rule-Shift sensitivity and specificity statistics for a variety of chronological age and IQ cut-offs. With no restriction or lower ABIQ deviation score benchmarks, sensitivity was high and specificity was low; as the IQ and age benchmarks became more restrictive, sensitivity decreased and specificity increased. As an example of interpreting these results, if future clinical trials restrict their inclusion criteria to children over the age of 8 with no IQ restriction, this age restriction is likely to capture 95.0% of children who would be able to do the Rule-Shift task, yet the probability of correctly identifying children who could not do the task is only 18.9%; the resulting sample could therefore include numerous children who would still struggle with the task. If a clinical trial wants to ensure complete data, restricting the inclusion criteria to age 10 and older and an ABIQ deviation score of 50 or greater would yield excellent specificity (100%) but omit children who would do the task (21.7% sensitivity).

Table 4.

Sensitivity and specificity for Rule-Shift measure

|                               | Sensitivity, Age 8 | Sensitivity, Age 10 | Specificity, Age 8 | Specificity, Age 10 |
|-------------------------------|--------------------|---------------------|--------------------|---------------------|
| No ABIQ Deviation Restriction | 95.0%              | 85.0%               | 18.9%              | 32.4%               |
| ABIQ Deviation ≥ 30 (n = 68)  | 71.7%              | 61.7%               | 56.8%              | 70.3%               |
| ABIQ Deviation ≥ 40 (n = 45)  | 50.0%              | 40.0%               | 83.8%              | 91.9%               |
| ABIQ Deviation ≥ 50 (n = 24)  | 26.7%              | 21.7%               | 97.3%              | 100.0%              |
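The cut-off analysis in Table 4 is a standard application of sensitivity (the proportion of task-completers correctly captured by the inclusion rule) and specificity (the proportion of non-completers correctly excluded). As a generic illustration only, not the authors' analysis code, the computation for any age/ABIQ benchmark might be sketched as follows, using hypothetical participant records:

```python
def sensitivity_specificity(records, age_cut, abiq_cut):
    """Compute sensitivity and specificity for one inclusion benchmark.

    records: iterable of (age, abiq_deviation, completed) tuples, where
    'completed' is True if the child was able to do the task.
    A 'positive' prediction means the child meets both cutoffs
    (i.e., is predicted able to complete the task).
    """
    tp = fp = tn = fn = 0
    for age, abiq, completed in records:
        predicted_able = age >= age_cut and abiq >= abiq_cut
        if completed and predicted_able:
            tp += 1          # completer correctly included
        elif completed and not predicted_able:
            fn += 1          # completer wrongly excluded
        elif not completed and predicted_able:
            fp += 1          # non-completer wrongly included
        else:
            tn += 1          # non-completer correctly excluded
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Hypothetical records: (age, ABIQ deviation score, completed Rule-Shift?)
demo = [(7, 25, False), (9, 35, True), (11, 55, True),
        (12, 28, False), (10, 45, True)]
sens, spec = sensitivity_specificity(demo, age_cut=10, abiq_cut=30)
```

Sweeping `age_cut` and `abiq_cut` over a grid of candidate benchmarks reproduces the trade-off pattern in Table 4: stricter cutoffs raise specificity at the cost of sensitivity.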

4. Discussion

The lack of reliable and valid cognitive flexibility measures for individuals with DS is a major barrier to clinical trials and intervention studies. Without psychometrically sound outcome measures, studies risk measurement limitations (i.e., floor effects, practice effects, unreliable measurement) that can confound interpretations of performance when examining the effects of pharmaceutical or behavioral interventions. This study evaluated the psychometric properties of three measures of cognitive flexibility (Rule-Shift, Weigl Sorting, and KiTAP Flexibility) in a sample of youth with DS. Of the evaluated measures, the Weigl Sorting task was the most feasible for use among youth with DS in clinical trials, but it did not meet any of the other psychometric criteria of the study. Although the feasibility of the Rule-Shift task was below the acceptable criterion for use across a wide range of youth with DS, the results revealed that this measure can be used with specific subgroups of the DS population with ABIQ deviation scores of 30 or higher. Finally, one computer task designed for typically developing children (KiTAP Flexibility) may not be appropriate for youth with DS in clinical trials or clinical settings.

4.1. Feasibility and score distributions

The Weigl Sorting met the feasibility criterion but showed moderate floor effects. There was also variability in task completion across timepoints: 11% of participants completed the Weigl Sorting at only one timepoint, which may indicate that the task was not consistently engaging. Rule-Shift total incorrect and KiTAP Flexibility had unacceptable feasibility and floor effects. That participants in the current study could adequately complete the Weigl Sorting may indicate that sorting paradigms (Grant & Berg, 1948), rather than the Rule-Shift paradigm (Wilson et al., 1998), lead to task completion by a greater number of individuals with DS. Sorting pieces by one feature and then shifting to sort them by another feature may be an accessible measure for youth, whereas responding to stimuli according to rules presented successively (Rule-Shift) may be difficult for some individuals with DS. Differences in the verbal demands of the tasks may also influence feasibility (Campbell et al., 2013), as verbal responses are not required for the Weigl Sorting. The very low feasibility observed for the KiTAP Flexibility measure may also underscore the difficulty of rule-shift paradigms for individuals with DS. Further, the higher rate of participants who did not understand the KiTAP Flexibility (62%) compared with those who did not understand the Rule-Shift task (25%) may indicate that individuals with DS find it easier to understand and respond to tasks using manipulatives than to computerized tasks designed for typically developing children, such as the KiTAP Flexibility. Slower processing speed observed in DS may also have contributed to the challenges with the KiTAP Flexibility subtest (Inui et al., 1995; Lalo et al., 2005). Modifications to task instructions and to administration or response modalities may be necessary to support the utility of computerized assessments in children and adolescents with DS and other intellectual disabilities.

4.2. Test-retest reliability and practice effects

ICC coefficients observed for the Rule-Shift and Weigl Sorting tasks fell in the moderate category of test-retest reliability, indicating modest, but not acceptable, reliability for both tasks. Additionally, significant mean differences between time 1 and time 2 and an effect size greater than the study criterion provided evidence of practice effects for the Weigl Sorting. These findings indicate that, owing to task repetition, participants' performance on the Weigl Sorting may have improved slightly at time 2; however, the average change was less than one unit on a five-point measure. Rule-Shift met the mean-difference criterion for negligible practice effects, yet its effect sizes were concerning and signal a potential for practice effects. These potential or concerning practice effects may explain the lower test-retest reliability observed for both tests. EF measures may have disappointingly low test-retest reliability because they demand greater attentional control when they are used for the first time (Friedman & Miyake, 2004; Rabbitt, 2004).
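Practice effects of the kind described above are commonly quantified as a standardized mean difference between the two sessions. As a minimal illustration (not the study's actual analysis, and the paper's exact effect-size formula is not specified in this section), a paired Cohen's d can be computed as:

```python
from statistics import mean, stdev

def paired_cohens_d(time1, time2):
    """Standardized mean difference for repeated testing:
    mean of (time2 - time1) divided by the SD of the differences.
    A positive value indicates improvement at retest (a practice effect).
    One common convention among several for paired designs.
    """
    diffs = [t2 - t1 for t1, t2 in zip(time1, time2)]
    return mean(diffs) / stdev(diffs)

# Hypothetical scores for four participants at two timepoints
d = paired_cohens_d([1, 2, 3, 4], [2, 2, 4, 6])
```

Other conventions divide by the time-1 SD or a pooled SD instead of the SD of the differences; the choice affects the magnitude and should be reported alongside the result.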

4.3. Convergent validity and correlations with broader developmental domains

Moderate convergent validity was observed for the Rule-Shift and Weigl Sorting tasks. The correlation between the two tasks was significant but did not meet the acceptable criterion (> 0.50). Modest convergent validity coefficients may reflect that the two tasks represent two different paradigms for measuring cognitive flexibility (Buttelmann & Karbach, 2017; Dajani & Uddin, 2015; Deák & Wiseheart, 2015). Although both rule-shift and sorting tasks are designed to measure cognitive flexibility, they likely rely on different cognitive processes or demands. Along with the domain-general brain regions involved in cognitive flexibility, there are differences in brain activity across cognitive flexibility paradigms (Kim et al., 2012).

The correlations between the parent and teacher ratings (BRIEF2 Shifting) and both direct measures of cognitive flexibility (Rule-Shift and Weigl Sorting) also did not meet the acceptable criterion for convergent validity, although a medium correlation was observed between BRIEF2 Shifting (teacher form) and both Rule-Shift and Weigl Sorting. Direct and rating measures of cognitive flexibility differ in how they are administered and scored. Direct measures involve standardized procedures administered by an examiner and usually assess accuracy and/or response time, while rating measures involve an informant reporting on difficulties with carrying out everyday tasks (Toplak et al., 2013). Thus, these two types of measures likely assess different aspects of cognitive flexibility (Toplak et al., 2013), specifically in children with DS (Daunhauer et al., 2017), which may account for the modest correlations between direct measures and parent/teacher reports.

Additionally, cognitive abilities, adaptive behaviors, and receptive vocabulary had significant positive correlations with the Weigl Sorting. However, no significant associations were found between these broader developmental domains and the Rule-Shift task. Age and expressive vocabulary had no significant association with either the Weigl Sorting or the Rule-Shift task. These findings provide evidence for the validity of the Weigl Sorting test as a measure of cognitive flexibility in youth with DS. Measures of psychological constructs are validated by testing whether they relate to measures of other constructs as specified by theory, and significant relations between measures reflect the validity of both tests (Strauss & Smith, 2009). The association of cognitive flexibility (and other EF domains) with general cognitive abilities, adaptive behaviors, and language has been assumed theoretically and reported empirically, especially among individuals with developmental disorders (Blom et al., 2021; Duggan & Garcia-Barrera, 2015; Fidler & Lanfranchi, 2022; Gravråkmo et al., 2022; Sabat et al., 2020). Thus, the significant relations between the Weigl Sorting and valid measures of cognition (SB-5), adaptive behavior (VABS-3), and receptive vocabulary (PPVT) observed in the current study provide evidence of the Weigl Sorting's validity for youth with DS. The lack of a significant relation between Weigl Sorting scores and age also provides some evidence of validity, as it is in line with the developmental stability hypothesis (Lee et al., 2015), which suggests that the magnitude of difficulty in EF skills is stable across age in DS.
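The convergent validity coefficients discussed above (e.g., against the > 0.50 criterion) are ordinary Pearson correlations between scores on two measures. A minimal, self-contained sketch of that computation, using hypothetical score vectors rather than the study's data:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length score vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores on two cognitive flexibility tasks for five participants
r = pearson_r([1, 2, 2, 4, 5], [2, 1, 3, 4, 4])
```

In practice, a correlation of this kind would also be accompanied by a significance test and, for ordinal task scores with floor effects, a rank-based coefficient (e.g., Spearman's rho) might be preferred.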

4.4. Low feasibility measure

Understanding who can complete a task, particularly for measures that tend to be more challenging for individuals with DS (i.e., low feasibility measures), can inform measurement decisions for research studies and clinical trials. The sensitivity and specificity analyses for the Rule-Shift task indicate that higher, more restrictive ABIQ benchmarks increased the probability of correctly identifying participants who were unable to complete the test (high specificity), whereas less restrictive ABIQ benchmarks decreased that probability (low specificity). Conversely, higher and more restrictive ABIQ benchmarks decreased the probability of correctly identifying participants who could complete the test (low sensitivity), while less restrictive ABIQ benchmarks increased that probability (high sensitivity). Relatively balanced sensitivity and specificity were observed for the benchmark of ABIQ ≥ 30, indicating the appropriateness of this benchmark as an inclusion/exclusion criterion if the Rule-Shift test is used in future clinical trials and research. However, this benchmark should be applied with caution, as some participants below an ABIQ deviation score of 30 were also able to complete the measure.

4.5. Limitations and future directions

This study had numerous strengths, including a large sample size and the contribution of essential information related to measure selection for upcoming clinical trials and interventions; however, some limitations should be considered when interpreting the results. First, there were no standard scores for the measures, and the psychometric evaluations were limited to raw scores. Developing norms for specific EF measures with good feasibility will be important for interpreting individual performance in relation to others with intellectual and developmental disabilities, in both research and practice. Second, the sample included 6–17-year-old youth with DS, and thus the findings cannot be generalized to adults or young children with DS. Participants were also volunteers who may not be representative of the full population of youth with DS, and the study was limited in its representation of Black and Hispanic participants. Additionally, the eligibility criterion of a DS diagnosis was confirmed via parent report and medical chart review, but karyotyping was not performed in the study.

Future studies should increase the time interval between the two visits to determine whether test-retest reliability and practice effects remain problematic for intervals that mirror those used in clinical trials. Further, the measures need to be evaluated in the age ranges not included in the current study. Finally, although the selected tasks are among the most common measures of cognitive flexibility used or recommended for use among individuals with DS (Ball et al., 2008; Esbensen et al., 2017; Lanfranchi et al., 2010), additional cognitive flexibility tasks, such as the Modified Card Sorting Test and the Dimensional Change Card Sort, should also be evaluated in DS.

4.6. Conclusion

The evaluation of outcome measures is a critical step toward ensuring the success and interpretability of future clinical trials and interventions. The results of the current study support the use of the Weigl Sorting task as a feasible measure of cognitive flexibility in youth with DS, although its test-retest reliability, practice effects, and convergent validity did not meet a priori criteria and the measure has a limited range of possible scores. The psychometric properties of the Rule-Shift task did not meet a priori criteria, although practice effects were negligible and sensitivity and specificity were balanced for participants with ABIQ deviation scores of 30 or higher. Finally, the KiTAP Flexibility had poor feasibility, suggesting this task is not appropriate for use among youth with DS. Because no measure met all of the study's psychometric criteria, evaluating additional cognitive flexibility measures among youth with DS may be necessary to provide evidence-based outcome measures for clinical trials and research.

Highlights.

  • Three measures of cognitive flexibility were assessed among youth with Down syndrome.

  • Weigl Sorting task had adequate feasibility and close to acceptable floor effects.

  • Rule-Shift and KiTAP Flexibility had significant floor effects and inadequate feasibility.

  • No measures had acceptable test-retest reliability or convergent validity.

  • Practice effects were observed for the Weigl Sorting task, but not for Rule-Shift.

What this paper adds

Valid and reliable outcome measures are essential to capturing change in performance in interventions and clinical trials. Because individuals with Down syndrome are not commonly included in standard norming procedures for performance-based assessments, it is necessary to evaluate measures for their valid use with individuals with Down syndrome. Previous studies have evaluated different sets of cognitive and executive function measures, including cognitive flexibility measures, for use among individuals with Down syndrome. Although some measures of executive function were reported as having adequate psychometric properties in these previous measurement evaluations, there were significant floor effects for measures of cognitive flexibility. Because no promising measures of cognitive flexibility have been identified, additional evaluation of different performance-based measures is required. This study adds to this emerging literature by focusing on a set of cognitive flexibility measures commonly used to assess children and adolescents with Down syndrome. The selected measures originated from two cognitive measurement paradigms: task-switching and sorting. In addition, the study adds to current knowledge by assessing a wide range of psychometric properties, including feasibility, test-retest reliability, practice effects, convergent validity, and correlations with broader developmental domains. Further, the proportions of participants who were able or unable to complete the tasks were identified in subgroups of different ages and IQ levels.

Acknowledgments

The authors have no conflicts of interest to disclose. This manuscript was prepared with support from the Eunice Kennedy Shriver National Institute of Child Health and Human Development of the National Institutes of Health under award numbers R01 HD093754 (Esbensen PI), T32HD007489, and P50HD105353 and the University of Wisconsin-Madison. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. This research would not have been possible without the contributions of the participating families and the community support.

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Credit Author Statement

Emily K. Schworer: Methodology, Visualization, Data curation, Validation, Investigation, Writing- original draft, Writing- review and editing; Amanallah Soltani: Writing- original draft, Writing- review and editing; Mekibib Altaye: Formal analysis, Writing- review and editing; Deborah J. Fidler: Conceptualization, Supervision, Funding acquisition, Project administration, Resources, Methodology, Writing- review and editing; Anna J. Esbensen: Conceptualization, Supervision, Funding acquisition, Project administration, Resources, Methodology, Investigation, Writing- review and editing

References

  1. Amadó A, Serrat E, & Vallès-Majoral E (2016). The role of executive functions in social cognition among children with down syndrome: relationship patterns. Frontiers in Psychology, 7, 1363. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Ball SL, Holland AJ, Treppner P, Watson PC, & Huppert FA (2008). Executive dysfunction and its association with personality and behaviour changes in the development of Alzheimer’s disease in adults with Down syndrome and mild to moderate learning disabilities. British Journal of Clinical Psychology, 41, 1–29. [DOI] [PubMed] [Google Scholar]
  3. Basso MR, Lowery N, Ghormley C, & Bornstein RA (2001). Practice effects on the Wisconsin Card Sorting Test–64 card version across 12 months. The Clinical Neuropsychologist, 15(4), 471–478. [DOI] [PubMed] [Google Scholar]
  4. Basten IA, Boada R, Taylor HG, Koenig K, Barrionuevo VL, Brandão AC, & Costa AC (2018). On the design of broad-based neuropsychological test batteries to assess the cognitive abilities of individuals with down syndrome in the context of clinical trials. Brain Sciences, 8(12), 205. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Blom E, Berke R, Shaya N, & Adi-Japha E (2021). Cognitive flexibility in children with Developmental Language Disorder: Drawing of nonexistent objects. Journal of Communication Disorders, 93, 106137. [DOI] [PubMed] [Google Scholar]
  6. Braconnier ML, & Siper PM (2021). Neuropsychological assessment in autism spectrum disorder. Current Psychiatry Reports, 23(10), 1–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Buttelmann F, & Karbach J (2017). Development and plasticity of cognitive flexibility in early and middle childhood. Frontiers in Psychology, 8, 1040. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Campbell C, Landry O, Russo N, Flores H, Jacques S, & Burack JA (2013). Cognitive flexibility among individuals with Down syndrome: assessing the influence of verbal and nonverbal abilities. American Journal on Intellectual and Developmental Disabilities, 118(3), 193–200. 10.1352/1944-7558-118.3.193 [DOI] [PubMed] [Google Scholar]
  9. Cohen J (1988). The effect size index: d. Statistical Power Analysis for the Behavioral Sciences, 2, 284–288. [Google Scholar]
  10. d’Ardhuy XL, Edgin JO, Bouis C, de Sola S, Goeldner C, Kishnani P, Nöeldeke J, Rice S, Sacco S, Squassante L, Spiridigliozzi GA, Visootsak J, Heller JH, & Khwaja O (2015). Assessment of cognitive scales to examine memory, executive function and language in individuals with Down syndrome: implications of a 6-month observational study. Frontiers in Behavioral Neuroscience, 9, 300. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Dajani DR, & Uddin LQ (2015). Demystifying cognitive flexibility: Implications for clinical and developmental neuroscience. Trends in Neurosciences, 38(9), 571–578. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Daunhauer LA, Fidler DJ, Hahn L, Will E, Lee NR, & Hepburn S (2014). Profiles of everyday executive functioning in young children with Down syndrome. American Journal on Intellectual and Developmental Disabilities, 119(4), 303–318. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Daunhauer LA, Gerlach-McDonald B, Will E, & Fidler D (2017). Performance and ratings based measures of executive function in school-aged children with Down syndrome. Developmental Neuropsychology, 42(6), 351–368. [DOI] [PubMed] [Google Scholar]
  14. Daunhauer LA, Will E, Schworer EK, & Fidler DJ (2020). Young students with Down syndrome: Early longitudinal academic achievement and neuropsychological predictors. Journal of Intellectual & Developmental Disability, 45(3), 211–221. [Google Scholar]
  15. de Sola S, de la Torre R, Sanchez-Benavides G, Benejam B, Cuenca-Royo A, del Hoyo L, Rodriguez J, Catuara-Solarz S, Sanchez-Gutiérrez J, Dueñas-Espin I, Hernandez G, Pena-Casanova J, Langohr K, Videla S, Blehaut H, Farre M, Dierssen M, & Group TS (2015). A new cognitive evaluation battery for Down syndrome and its relevance for clinical trials. Frontiers in Psychology, 6, 708. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. de Weger C, Boonstra FN, & Goossens J (2021). Differences between children with Down syndrome and typically developing children in adaptive behaviour, executive functions and visual acuity. Scientific Reports, 11(1), 1–15. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Deák GO, & Wiseheart M (2015). Cognitive flexibility in young children: General or task-specific capacity? Journal of experimental child psychology, 138, 31–53. [DOI] [PubMed] [Google Scholar]
  18. Diamond A (2013). Executive functions. Annual Review of Psychology, 64, 135–168. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Duggan EC, & Garcia-Barrera MA (2015). Executive functioning and intelligence. In Handbook of intelligence (pp. 435–458). Springer. [Google Scholar]
  20. Dunn DM (2019). Peabody Picture Vocabulary Test (PPVT5) (5th ed. ed.). NCS Pearson. [Google Scholar]
  21. Dunn LM, & Dunn DM (2007). PPVT-4: Peabody Picture Vocabulary Test. Pearson. [Google Scholar]
  22. Edgin JO, Anand P, Rosser T, Pierpont EI, Figuero C, Hamilton D, Huddleston L, Mason GM, Spanò G, Toole L, Nguyen-Driver M, Capone G, Abbeduto L, Maslen C, Reeves RH, & Sherman SL (2017). The Arizona Cognitive Test Battery for Down Syndrome: Test-retest reliability and practice effects. American Journal of Intellectual and Developmental Disabilities, 122(3), 215–234. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Edgin JO, Mason GM, Allman MJ, Capone GT, DeLeon I, Maslen C, Reeves RH, Sherman SL, & Nadel L (2010). Development and validation of the Arizona Cognitive Test Battery for Down syndrome. Journal of Neurodevelopmental Disorders, 2, 149–164. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Esbensen AJ, Hoffman E, Shaffer R, Chen E, Patel L, & Jacola LM (2019). Reliability of informant report measure of executive functioning in children with Down syndrome. American Journal of Intellectual and Developmental Disabilities, 124, 220–233. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Esbensen AJ, Hoffman EK, Shaffer RC, Patel LR, & Jacola LM (2021). Relationship Between Parent and Teacher Reported Executive Functioning and Maladaptive Behaviors in Children With Down Syndrome. American Journal on Intellectual and Developmental Disabilities, 126(4), 307–323. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Esbensen AJ, Hooper SR, Fidler D, Hartley SL, Edgin JO, d’Ardhuy XL, Capone G, Conners FA, Mervis CB, Abbeduto L, Rafii M, Krinsky-McHale SJ, & Urv T (2017). Outcome measures for clinical trials in Down syndrome. American Journal on Intellectual and Developmental Disabilities, 122(3), 247–281. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Fidler D, & Lanfranchi S (2022). Executive function and intellectual disability: innovations, methods and treatment. Journal of Intellectual Disability Research, 66(1–2), 1–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Fidler DJ, Daunhauer LA, Schworer E, & Patel L (2020). Executive function in Down syndrome: Links to adaptation and treatment implications. In The Oxford Handbook of Down Syndrome and Development. [Google Scholar]
  29. Friedman NP, & Miyake A (2004). The relations among inhibition and interference control functions: a latent-variable analysis. Journal of Experimental Psychology: General, 133(1), 101. [DOI] [PubMed] [Google Scholar]
  30. García O, Castillo-Ignacio B, & Arias-Trejo N (2017). Vocabulary and Cognitive Flexibility in People with Down Syndrome. In Language Development and Disorders in Spanish-speaking Children (pp. 343–355). Springer. [Google Scholar]
  31. Gioia GA (2000). Behavior Rating Inventory of Executive Function: Professional Manual. Psychological Assessment Resources, Incorporated. [Google Scholar]
  32. Gioia GA, Isquith PK, Guy SC, & Kenworthy L (2015). Behavior Rating Inventory of Executive Function 2nd Edition (BRIEF2): Professional Manual. Psychological Assessment Resources, Incorporated. [Google Scholar]
  33. Goeldner C, Kishnani PS, Skotko BG, Casero JL, Hipp JF, Derks M, Hernandez M-C, Khwaja O, Lennon-Chrimes S, & Noeldeke J (2022). A randomized, double-blind, placebo-controlled phase II trial to explore the effects of a GABAA-α5 NAM (basmisanil) on intellectual disability associated with Down syndrome. Journal of Neurodevelopmental Disorders, 14(1), 1–14. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Grant DA, & Berg E (1948). A behavioral analysis of degree of reinforcement and ease of shifting to new responses in a Weigl-type card-sorting problem. Journal of Experimental Psychology, 38(4), 404. [DOI] [PubMed] [Google Scholar]
  35. Gravråkmo S, Olsen A, Lydersen S, Ingul JM, Henry L, & Øie MG (2022). Associations between executive functions, intelligence and adaptive behaviour in children and adolescents with mild intellectual disability. Journal of Intellectual Disabilities, 17446295221095951. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Hamburg S, Lowe B, Startin CM, Padilla C, Coppus A, Silverman W, Fortea J, Zaman S, Head E, & Handen BL (2019). Assessing general cognitive and adaptive abilities in adults with Down syndrome: a systematic review. Journal of Neurodevelopmental Disorders, 11(1), 1–16. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Hartley SL, Handen BL, Devenny D, Mihaila I, Hardison R, Lao PJ, Klunk WE, Bulova P, Johnson SC, & Christian BT (2017). Cognitive decline and brain amyloid-β accumulation across 3 years in adults with Down syndrome. Neurobiology of Aging, 58, 68–76. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Hessl D, Sansone SM, Berry-Kravis E, Riley K, Widaman KF, Abbeduto L, Schneider A, Coleman J, Oaklander D, & Rhodes KC (2016). The NIH Toolbox Cognitive Battery for intellectual disabilities: Three preliminary studies and future directions. Journal of Neurodevelopmental Disorders, 8(1), 35. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Hommel BE, Ruppel R, & Zacher H (2022). Assessment of cognitive flexibility in personnel selection: Validity and acceptance of a gamified version of the Wisconsin Card Sorting Test. International Journal of Selection and Assessment, 30(1), 126–144. [Google Scholar]
  40. Hutchinson N, & Oakes P (2011). Further evaluation of the criterion validity of the severe impairment battery for the assessment of cognitive functioning in adults with Down syndrome. Journal of Applied Research in Intellectual Disabilities, 24(2), 172–180. [Google Scholar]
  41. Inui N, Yamanishi M, & Tada S (1995). Simple reaction times and timing of serial reactions of adolescents with mental retardation, autism, and Down syndrome. Perceptual and Motor Skills, 81(3), 739–745. 10.2466/pms.1995.81.3.739 [DOI] [PubMed] [Google Scholar]
  42. Iralde L, Roy A, Detroy J, & Allain P (2020). A Representational Approach to Executive Function Impairments in Young Adults with Down Syndrome. Developmental Neuropsychology, 45(5), 263–278. [DOI] [PubMed] [Google Scholar]
  43. Kim C, Cilles SE, Johnson NF, & Gold BT (2012). Domain general and domain preferential brain regions associated with different types of task switching: A meta-analysis. Human Brain Mapping, 33(1), 130–142. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Kishnani PS, Heller JH, Spiridigliozzi GA, Lott I, Escobar L, Richardson S, McRae T (2010). Donepezil for treatment of cognitive dysfunction in children with Down syndrome aged 10–17. American Journal of Medical Genetics Part A, 152(12), 3028–3035. 10.1002/ajmg.a.33730 [DOI] [PubMed] [Google Scholar]
  45. Knox A, Schneider A, Abucayan F, Hervey C, Tran C, Hessl D, & Berry-Kravis E (2012). Feasibility, reliability, and clinical validity of the Test of Attentional Performance for Children (KiTAP) in Fragile X syndrome (FXS). Journal of Neurodevelopmental Disorders, 4(1), 1–12. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Koo TK, & Li MY (2016). A guideline of selecting and reporting intraclass correlation coefficients for reliability research. Journal of Chiropractic Medicine, 15(2), 155–163. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Kristensen K, Lorenz K, Zhou X, Piro-Gambetti B, Hartley S, Godar S, Diel S, Neubauer E, & Litovsky R (2022). Language and executive functioning in young adults with Down syndrome. Journal of Intellectual Disability Research, 66(1–2), 151–161. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Laiacona M, Inzaghi M, De Tanti A, & Capitani E (2000). Wisconsin card sorting test: a new global score, with Italian norms, and its relationship with the Weigl sorting test. Neurological Sciences, 21(5), 279–291. [DOI] [PubMed] [Google Scholar]
  49. Lalo E, Vercueil L, Bougerol T, Jouk P-S, & Debû B (2005). Late event-related potentials and movement complexity in young adults with Down syndrome. Neurophysiologie Clinique/Clinical Neurophysiology, 35(2), 81–91. 10.1016/j.neucli.2005.03.002 [DOI] [PubMed] [Google Scholar]
  50. Lanfranchi S, Jerman O, Dal Pont E, Alberti A, & Vianello R (2010). Executive function in adolescents with Down Syndrome. Journal of Intellectual Disability Research, 54, 308–319. [DOI] [PubMed] [Google Scholar]
  51. Lao PJ, Handen BL, Betthauser TJ, Mihaila I, Hartley SL, Cohen AD, Tudorascu DL, Bulova PD, Lopresti BJ, & Tumuluru RV (2017). Longitudinal changes in amyloid positron emission tomography and volumetric magnetic resonance imaging in the nondemented Down syndrome population. Alzheimer’s & Dementia: Diagnosis, Assessment & Disease Monitoring, 9(1), 1–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Lee NR, Anand P, Will E, Adeyemi EI, Clasen LS, Blumenthal JD, Giedd JN, Daunhauer LA, Fidler DJ, & Edgin JO (2015). Everyday executive functions in Down syndrome from early childhood to young adulthood: evidence for both unique and shared characteristics compared to youth with sex chromosome trisomy (XXX and XXY). Frontiers in Behavioral Neuroscience, 9, 264. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Lee NR, Fidler DJ, Blakeley-Smith A, Daunhauer L, Robinson C, & Hepburn SL (2011). Caregiver report of executive functioning in a population-based sample of young children with Down syndrome. American Journal on Intellectual and Developmental Disabilities, 116(4), 290–304. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Llewellyn C, Ayers S, McManus I, Newman SP, Petrie K, Revenson T, & Weinman J (2019). Cambridge handbook of psychology, health and medicine. Cambridge University Press. [Google Scholar]
  55. Loveall S, Conners F, Tungate A, Hahn L, & Osso T (2017). A cross-sectional analysis of executive function in Down syndrome from 2 to 35 years. Journal of Intellectual Disability Research, 61(9), 877–887. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Mai CT, Isenburg JL, Canfield MA, Meyer RE, Correa A, Alverson CJ, Lupo PJ, Riehle-Colarusso T, Cho SJ, & Aggarwal D (2019). National population-based estimates for major birth defects, 2010–2014. Birth Defects Research, 111(18), 1420–1435. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. McGlinchey E, McCarron M, Holland A, & McCallion P (2019). Examining the effects of computerised cognitive training on levels of executive function in adults with Down syndrome. Journal of Intellectual Disability Research, 63(9), 1137–1150. [DOI] [PubMed] [Google Scholar]
  58. Miyake A, Friedman NP, Emerson MJ, Witzki AH, Howerter A, & Wager TD (2000). The unity and diversity of executive functions and their contributions to complex “frontal lobe” tasks: A latent variable analysis. Cognitive Psychology, 41(1), 49–100. [DOI] [PubMed] [Google Scholar]
  59. Mole J, Dore C, Xu T, Shallice T, Chan E, & Cipolotti L (2021). Is the Weigl Colour-Form Sorting Test Specific to Frontal Lobe Damage? Journal of the International Neuropsychological Society, 27(2), 204–210. [DOI] [PubMed] [Google Scholar]
  60. Nelson HE (1976). A modified card sorting test sensitive to frontal lobe defects. Cortex, 12(4), 313–324. [DOI] [PubMed] [Google Scholar]
  61. Psychologische Testsysteme. (2011). KiTAP Test of Attentional Performance for Children. [Google Scholar]
  62. Rabbitt P (2004). Introduction: Methodologies and models in the study of executive function. In Methodology of Frontal and Executive Function (pp. 9–45). Routledge. [Google Scholar]
  63. Ringenbach S, Arnold N, Myer B, Hayes C, Nam K, & Chen C-C (2021). Executive Function Improves Following Acute Exercise in Adults with Down Syndrome. Brain Sciences, 11(5), 620. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Ringenbach S, Holzapfel S, Mulvey G, Jimenez A, Benson A, & Richter M (2016). The effects of assisted cycling therapy (ACT) and voluntary cycling on reaction time and measures of executive function in adolescents with Down syndrome. Journal of Intellectual Disability Research, 60(11), 1073–1085. [DOI] [PubMed] [Google Scholar]
  65. Roid GH (2003). Stanford-Binet Intelligence Scales (SB5). Riverside. [Google Scholar]
  66. Sabat C, Arango P, Tassé MJ, & Tenorio M (2020). Different abilities needed at home and school: The relation between executive function and adaptive behaviour in adolescents with Down syndrome. Scientific Reports, 10(1), 1–10. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Sansone SM, Schneider A, Bickel E, Berry-Kravis E, Prescott C, & Hessl D (2014). Improving IQ measurement in intellectual disabilities using true deviation from population norms. Journal of Neurodevelopmental Disorders, 6(1), 1–15. [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Schworer EK, Esbensen A, Fidler D, Beebe D, Carle A, & Wiley S (2021). Evaluating working memory outcome measures for children with Down syndrome. Journal of Intellectual Disability Research. [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Schworer EK, Hoffman EK, & Esbensen AJ (2021). Psychometric Evaluation of Social Cognition and Behavior Measures in Children and Adolescents with Down Syndrome. Brain Sciences, 11(7), 836. [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Schworer EK, Voth K, Hoffman EK, & Esbensen AJ (2022). Short-term memory outcome measures: Psychometric evaluation and performance in youth with Down syndrome. Research in Developmental Disabilities, 120, 104147. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Sinai A, Hassiotis A, Rantell K, & Strydom A (2016). Assessing specific cognitive deficits associated with dementia in older adults with Down syndrome: Use and validity of the Arizona Cognitive Test Battery (ACTB). PLoS One, 11(5), e0153917. [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Sparrow S, Cicchetti D, & Saulnier C (2016). Vineland Adaptive Behavior Scales, Third Edition (Vineland-3). Pearson. [Google Scholar]
  73. Startin CM, Hamburg S, Hithersay R, Davies A, Rodger E, Aggarwal N, Al-Janabi T, & Strydom A (2016). The LonDownS adult cognitive assessment to study cognitive abilities and decline in Down syndrome. Wellcome Open Research, 1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Startin CM, Lowe B, Hamburg S, Hithersay R, Strydom A, & the LonDownS Consortium (2019). Validating the Cognitive Scale for Down Syndrome (CS-DS) to detect longitudinal cognitive decline in adults with Down syndrome. Frontiers in Psychiatry, 10, 158. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Strauss ME, & Smith GT (2009). Construct validity: Advances in theory and methodology. Annual Review of Clinical Psychology, 5, 1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Tomaszewski B, Fidler D, Talapatra D, & Riley K (2018). Adaptive behaviour, executive function and employment in adults with Down syndrome. Journal of Intellectual Disability Research, 62(1), 41–52. [DOI] [PubMed] [Google Scholar]
  77. Toplak ME, West RF, & Stanovich KE (2013). Practitioner review: Do performance-based measures and ratings of executive function assess the same construct? Journal of Child Psychology and Psychiatry, 54(2), 131–143. [DOI] [PubMed] [Google Scholar]
  78. Tungate AS, & Conners FA (2021). Executive function in Down syndrome: A meta-analysis. Research in Developmental Disabilities, 108, 103802. [DOI] [PubMed] [Google Scholar]
  79. Weigl E (1941). On the psychology of so-called processes of abstraction. The Journal of Abnormal and Social Psychology, 36(1), 3. [Google Scholar]
  80. Will E, Fidler D, Daunhauer L, & Gerlach-McDonald B (2017). Executive function and academic achievement in primary–grade students with Down syndrome. Journal of Intellectual Disability Research, 61(2), 181–195. [DOI] [PubMed] [Google Scholar]
  81. Will EA, Schworer EK, & Esbensen AJ (2021). The role of distinct executive functions on adaptive behavior in children and adolescents with Down syndrome. Child Neuropsychology, 1–19. [DOI] [PMC free article] [PubMed] [Google Scholar]
  82. Williams KT (2007). EVT-2: Expressive Vocabulary Test. Pearson Assessments. [Google Scholar]
  83. Williams KT (2019). Expressive Vocabulary Test (EVT-3) (3rd ed.). NCS Pearson. [Google Scholar]
  84. Willoughby MT, Blair CB, Wirth R, & Greenberg M (2010). The measurement of executive function at age 3 years: psychometric properties and criterion validity of a new battery of tasks. Psychological Assessment, 22(2), 306. [DOI] [PubMed] [Google Scholar]
  85. Willoughby MT, Blair CB, Wirth R, & Greenberg M (2012). The measurement of executive function at age 5: psychometric properties and relationship to academic achievement. Psychological Assessment, 24(1), 226. [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Wilson BA, Evans JJ, Alderman N, Burgess PW, & Emslie H (2004). Behavioural assessment of the dysexecutive syndrome. In Methodology of Frontal and Executive Function (pp. 240–251). Routledge. [Google Scholar]
  87. Wilson BA, Evans JJ, Emslie H, Alderman N, & Burgess P (1998). The development of an ecologically valid test for assessing patients with a dysexecutive syndrome. Neuropsychological Rehabilitation, 8(3), 213–228. [Google Scholar]
  88. Zelazo PD (2006). The Dimensional Change Card Sort (DCCS): A method of assessing executive function in children. Nature Protocols, 1(1), 297–301. [DOI] [PubMed] [Google Scholar]
  89. Zimmermann P, & Fimm B (2002). A test battery for attentional performance. Applied neuropsychology of attention: Theory, diagnosis and rehabilitation, 110–151. [Google Scholar]