Brief Report: Differences Between Stanford-Binet Abbreviated and Full-Scale Estimates of IQ in Fragile X Syndrome Vary Across Development

Walker S McKinney; Meredith Nelson; Rebecca C Shaffer; Kelli C Dominick; Craig A Erickson; Lauren M Schmitt

doi:10.1007/s10803-025-07062-w

Author manuscript; available in PMC: 2026 Mar 5.

Published before final editing as: J Autism Dev Disord. 2025 Oct 23:10.1007/s10803-025-07062-w. doi: 10.1007/s10803-025-07062-w

PMCID: PMC12958467 NIHMSID: NIHMS2143666 PMID: 41128961

Abstract

Purpose

Fragile X syndrome (FXS) is the most common inherited cause of intellectual disability and single-gene cause of autism. The Stanford-Binet, Fifth Edition (SB-5) is commonly used to assess IQ in FXS. It is not known if the SB-5 routing form’s abbreviated IQ (ABIQ) score accurately estimates full-scale IQ (FSIQ), limiting data-informed decision-making when choosing between an abbreviated or full SB-5 battery.

Methods

A total of 198 participants with FXS (143 males) aged 4 to 47 years completed the full SB-5. We calculated differences between abbreviated and full-scale estimates of IQ and assessed the extent to which the agreement between ABIQ and FSIQ varied as a function of age, routing subtest scatter, and FSIQ.

Results

The abbreviated SB-5 battery over-estimated FSIQ in most school-age children (< 11 years), and under-estimated FSIQ in adolescents and adults. This under-estimate of FSIQ was larger when there was a greater discrepancy (scatter) between the two routing subtests that comprise ABIQ and in individuals with FSIQ < 68.

Conclusion

Clinicians and researchers should consider administering the full SB-5 battery to individuals with FXS when possible. If only an abbreviated estimate of IQ is available, ABIQ should be interpreted with caution based on our findings of over- or under-estimation occurring across development. Large discrepancies between verbal and nonverbal skills as well as greater severity of ID should both serve as cues to administer the full battery to avoid under-estimating cognitive skills that are otherwise only captured by FSIQ.

Keywords: Fragile X syndrome, Stanford–Binet, Intelligence, IQ, Cognition, Intellectual disability

Fragile X syndrome (FXS) is the most common heritable form of intellectual disability and single-gene cause of autism spectrum disorder (ASD), occurring in approximately 1 in 7,000 males and 1 in 11,000 females (Hunter et al., 2014). FXS is caused by a trinucleotide repeat expansion (> 200 CGG repeats) in the Fragile X messenger ribonucleoprotein 1 (FMR1) gene on the X chromosome. Nearly all males with FXS have an intellectual disability, although the severity is highly variable (Schmitt et al., 2024). When present, intellectual disability tends to be milder in females with FXS, and many females with FXS have IQ scores in the borderline or average range due to the protective effect of a second unaffected X chromosome and random X-inactivation (Bartholomay et al., 2019; Kirchgessner et al., 1995; Schmitt et al., 2024).

Given the hallmark intellectual disability phenotype, the accurate assessment of IQ in FXS is integral to its clinical characterization. IQ scores, although just one criterion for having an intellectual disability, continue to guide educational and vocational supports and placements, inform clinical prognosis, and have implications for accessing services and other benefits (e.g., Supplemental Security Income) (Greenspan et al., 2015; Silverman et al., 2010). IQ scores also are used in research studies in FXS beyond clinical characterization, and often are correlated with primary measures and biomarkers like electrophysiological output (Pedapati et al., 2022) and fragile X messenger ribonucleoprotein (FMRP) and FMR1 RNA expression (Boggs et al., 2022; Schmitt et al., 2024; Straub et al., 2023) to determine the extent to which biomarkers co-vary with the hallmark phenotype.

One known challenge with assessing IQ in FXS is inadequate sensitivity to individual differences in IQ among individuals with moderate-to-severe intellectual disabilities (Hessl et al., 2009). This results in a floor effect, especially among adults with FXS due to age-associated “declines” in IQ driven by limited growth in skills relative to same-age peers. For example, as measured by the Stanford-Binet, Fifth Edition (SB-5) (Roid, 2003), which already has the benefit of an extended floor relative to other IQ instruments, 50% of the males and 7% of the females with FXS in the present study have a full-scale IQ of 40 (the floor of the SB-5). One well-established solution to this floor effect is the transformation of raw scores to z-scores based on deviation from the original normative sample (“deviation IQ”), which extends the floor of observed IQ scores. This method has been described for both the SB-5 and Wechsler Scales (Hessl et al., 2009; Sansone et al., 2014), and is publicly available for use via the PRO-ED SB-5 scoring software. This extended range of IQ scores reduces floor effects and produces a near-normal distribution of deviation IQ scores in individuals with FXS (Schmitt et al., 2024). The SB-5 excels at meaningfully capturing cognitive differences in individuals with intellectual disability, and it is the only instrument with a publicly available method for calculating deviation IQ (the use agreement for the Wechsler Scales was not renewed, and thus deviation IQ can only be calculated for the SB-5) (Sansone et al., 2014). This makes the SB-5 a valuable approach to measuring IQ in FXS.
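The idea behind deviation scoring can be illustrated with a minimal sketch. It assumes a single subtest, a hypothetical normative mean and standard deviation for one age band (the values below are invented for illustration and are not taken from the SB-5 manual), and placement of the resulting z-score on the scaled-score metric (M = 10, SD = 3). The full procedure, including the normative tables and the aggregation into index scores, follows Sansone et al. (2014) and is not reproduced here.

```python
def deviation_scaled_score(raw, norm_mean, norm_sd):
    """Express a raw subtest score as a deviation from the normative sample,
    placed on the scaled-score metric (M = 10, SD = 3) with no floor applied."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw - norm_mean) / norm_sd
    return 10 + 3 * z


# Hypothetical normative values for one age band (illustration only).
NORM_MEAN, NORM_SD = 24.0, 4.0

# Three low raw scores that a floored scale could not tell apart
# remain distinguishable once expressed as deviations.
for raw in (2, 6, 10):
    print(raw, deviation_scaled_score(raw, NORM_MEAN, NORM_SD))
```

Because no floor is applied, very low raw scores can map to scaled values at or below zero, which is why negative deviation scores are possible under this kind of scoring.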

The SB-5, like other IQ measures, allows for the calculation of an “abbreviated IQ” score, a combination of performance on the two Verbal Knowledge and Nonverbal Fluid Reasoning routing subtests (Roid, 2003). Administration of these two subtests provides a briefer estimate of IQ than comparable instruments like the Wechsler Abbreviated Scale of Intelligence, Second Edition (WASI-II) which is made up of four subtests (Wechsler, 2011). In our extensive experience, the routing subtests can be completed in 10–15 min in FXS, compared to the 60–90 min required for the full battery. Abbreviated IQ (ABIQ) has thus become a widespread method for brief cognitive screening in FXS across clinical and research settings (Abbeduto et al., 2021; Norris et al., 2022; Shaffer et al., 2020).

Despite its prevalence, the accuracy of SB-5 ABIQ in estimating FSIQ for individuals with FXS or intellectual disabilities more broadly is not known. The accuracy of ABIQ is better understood in autistic individuals, a population that significantly overlaps with FXS, but which has lower prevalence of co-occurring intellectual disability. For example, ABIQ over-estimates FSIQ by an average of 3.5 points in autistic preschoolers (FSIQ range: 40 to 117, M = 70, 42.5% with FSIQ ≥ 70) (Twomey et al., 2018). Stephenson and colleagues further demonstrated that ABIQ over-estimated FSIQ in autistic individuals and individuals with ADHD who have a high degree of scatter between the two routing subtests (FSIQ range: 40–133, M = 78) (Stephenson et al., 2023). These findings suggest ABIQ may over-estimate FSIQ in children with neurodevelopmental disabilities and that the magnitude of this over-estimation is greater when there is a large discrepancy between verbal and nonverbal skills. This is relevant to individuals with FXS, especially males, who often show greater verbal relative to nonverbal skills (Freund & Reiss, 1991; Huddleston et al., 2014). Knowledge of the extent of this error would allow for data-informed decision-making when choosing between the administration of an abbreviated or full SB-5 battery in FXS in clinical or research settings.

To support this data-informed decision-making, the present study aimed to determine the extent to which ABIQ under- or over-estimates FSIQ in individuals with FXS. We also aimed to assess whether this pattern varies by age, the degree of routing subtest scatter, and FSIQ (i.e., a proxy for ID severity). Based on previous findings in autistic youth (Twomey et al., 2018), we hypothesized that ABIQ would over-estimate FSIQ in children with FXS. Based on our clinical experience administering the SB-5 to adults with FXS, we also hypothesized that ABIQ would under-estimate FSIQ in adults with FXS and in those with more severe ID (i.e., lower FSIQ).

Methods

Participants

Participants included all male and female patients with FXS seen in the Cincinnati Fragile X Research and Treatment Center who were administered the Stanford-Binet, Fifth Edition (SB-5) full battery between 2015 and 2025. All participants had a confirmed diagnosis of FXS, defined as having the full FMR1 mutation (> 200 CGG repeats), confirmed via past testing results made available in a participant’s medical record or via Southern Blot and/or PCR conducted in collaboration with the Molecular Diagnostic Laboratory at Rush University. Only participants younger than 50 years of age were analyzed due to a small number of participants (N = 6) over the age of 50. A subset of participants (N = 61) provided clinical data across multiple timepoints: 198 unique participants provided data across 281 visits (see Results for additional details). Full demographic details are reported in Table 1. Parents or legal guardians of participants younger than 18 years of age and legal guardians of adult participants with limited decision-making capacity stemming from their intellectual disability provided written informed consent. Adult participants otherwise provided written informed consent. All participants provided assent when possible given each participant’s expressive and receptive language abilities. All study procedures were approved by the Cincinnati Children’s Hospital Medical Center Institutional Review Board.

Table 1.

Participant demographics

	Mean (standard deviation) or % (N)

	Overall (N = 198)	Males (N = 143)	Females (N = 55)

Age (years)	20.5 (11.3)	21.0 (11.6)	19.1 (10.5)
	Range: 4.7–47.2	Range: 4.7–47.2	Range: 5.2–43.8
% with repeated evaluations/multiple timepoints	30.8% (61)	30.1% (43)	32.7% (18)
Age groups
0–10 years	24.7% (49)	25.1% (36)	23.6% (13)
11–20 years	32.3% (64)	27.3% (39)	45.5% (25)
21–30 years	20.7% (41)	23.1% (33)	14.5% (8)
31–40 years	14.5% (29)	16.1% (23)	10.9% (6)
41–50 years	7.6% (15)	8.4% (12)	5.5% (3)
Race
Asian-American/Pacific Islander	1.0% (2)	-	3.6% (2)
Black	4.0% (8)	4.2% (6)	3.6% (2)
Native American/Alaskan Native	0.5% (1)	0.7% (1)	-
Multiracial	0.5% (1)	0.7% (1)	-
White	93.9% (186)	94.4% (135)	92.7% (51)
Ethnicity
Hispanic or Latino	5.1% (10)	4.2% (6)	7.3% (4)
Non-Hispanic or Latino	94.4% (187)	95.1% (136)	92.7% (51)
Not known	0.5% (1)	0.7% (1)	-
Stanford-Binet 5 Factor Index Scores (Deviation Standard Scores)
% at floor of SB-5 (FSIQ = 40)	37.9% (75)	49.7% (71)	7.3% (4)
Full-Scale Deviation IQ	49 (23)	41 (18)	71 (22)
	Range: 0–106	Range: 0–106	Range: 5–100
Abbreviated Deviation IQ	43 (28)	34 (22)	69 (25)
	Range: −18–106	Range: −18–100	Range: −11–106
FSIQ-ABIQ difference	5.8 (11.0)	7.3 (11.6)	1.8 (8.1)
	Range: −23.1–35.4	Range: −23.1–35.4	Range: −15.7–17.3
% ABIQ within 1 SD (15 points) of FSIQ	75.8% (150)	69.9% (100)	90.9% (50)
Verbal Deviation IQ	49 (24)	41 (19)	71 (23)
	Range: −5–102	Range: −5–98	Range: 0–102
Nonverbal Deviation IQ	49 (24)	41 (18)	71 (22)
	Range: −3–114	Range: 0–114	Range: −3–103
Fluid Reasoning	44 (30)	34 (24)	72 (26)
	Range: −10–116	Range: −10–116	Range: −7–109
Knowledge	52 (23)	45 (19)	72 (21)
	Range: −10–102	Range: −10–94	Range: 14–102
Quantitative Reasoning	53 (23)	46 (20)	71 (21)
	Range: −3–103	Range: −2–103	Range: −3–102
Visual Spatial	53 (22)	46 (17)	71 (23)
	Range: 4–122	Range: 8–122	Range: 4–108
Working Memory	44 (27)	34 (21)	68 (24)
	Range: −21–109	Range: −21–98	Range: −10–109
Stanford-Binet 5 Subscales (Deviation Scaled Scores)
Nonverbal Fluid Reasoning	−3 (7)	−5 (6)	3 (6)
	Range: −17–14	Range: −17–14	Range: −16–12
Nonverbal Knowledge	1 (5)	0 (4)	5 (4)
	Range: −12–11	Range: −12–10	Range: −7–11
Nonverbal Quantitative Reasoning	1 (5)	−1 (5)	5 (5)
	Range: −12–13	Range: −11–13	Range: −12–11
Nonverbal Visual Spatial	1 (5)	−1 (4)	5 (5)
	Range: −11–18	Range: −9–18	Range: −11–15
Nonverbal Working Memory	−1 (5)	−2 (4)	3 (5)
	Range: −12–12	Range: −12–9	Range: −12–12
Verbal Fluid Reasoning	0 (6)	−2 (5)	6 (5)
	Range: −10–13	Range: −10–13	Range: −10–13
Verbal Knowledge	0 (5)	−2 (4)	4 (5)
	Range: −12–11	Range: −12–8	Range: −11–11
Verbal Quantitative Reasoning	1 (4)	−1 (4)	4 (4)
	Range: −10–10	Range: −10–9	Range: −10–10
Verbal Visual Spatial	0 (5)	−1 (4)	3 (5)
	Range: −12–11	Range: −12–11	Range: −11–10
Verbal Working Memory	−2 (7)	−4 (5)	4 (6)
	Range: −21–12	Range: −21–11	Range: −14–12

All IQ index scores reflect deviation-normed standard scores (M = 100, SD = 15). All IQ subscale scores reflect deviation-normed scaled scores (M = 10, SD = 3). To avoid duplicated data for those with multiple visits/timepoints, only the most recent visit was used in calculating table values.

Procedures

Stanford–Binet, Fifth Edition

All participants completed the full version of the SB-5 (Roid, 2003), administered by either a licensed clinical psychologist, supervised post-doctoral clinical psychology fellow, or supervised clinical research coordinator. Deviation scores for the SB-5 were calculated using previously reported methods validated in FXS to minimize floor effects common in this population (Sansone et al., 2014). Subtest scatter was calculated as the absolute difference between scaled scores on the Verbal Knowledge and Nonverbal Fluid Reasoning subtests. In our Supplementary Results, to compare routing Verbal Knowledge performance with VIQ and Nonverbal Fluid Reasoning performance with NVIQ using the same scale, scaled scores for each routing subtest were transformed to standard scores. These are referred to as “abbreviated VIQ” and “abbreviated NVIQ”, respectively, for ease of reference.
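The two derived quantities in this paragraph can be sketched as follows. The scatter definition comes directly from the text. The conversion from scaled to standard scores is an assumption: the text does not state the transformation used, so the sketch applies a plain linear rescaling between the two metrics (scaled: M = 10, SD = 3; standard: M = 100, SD = 15). Function names are illustrative.

```python
def subtest_scatter(verbal_knowledge, nonverbal_fluid_reasoning):
    """Absolute difference between the two routing subtest scaled scores."""
    return abs(verbal_knowledge - nonverbal_fluid_reasoning)


def scaled_to_standard(scaled):
    """Assumed linear rescaling from the scaled-score metric (M = 10, SD = 3)
    to the standard-score metric (M = 100, SD = 15)."""
    z = (scaled - 10) / 3
    return 100 + 15 * z


# Illustrative deviation scaled scores for one participant (not study data).
vk, nvfr = 4, -3
print("scatter:", subtest_scatter(vk, nvfr))
print("abbreviated VIQ:", scaled_to_standard(vk))
print("abbreviated NVIQ:", scaled_to_standard(nvfr))
```

Under this assumption each scaled score point corresponds to 5 standard score points, so a scatter of 5.5 scaled score points is equivalent to a 27.5-point gap between the two rescaled routing scores.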

Statistical Analyses

Three separate mixed effects models were used to examine whether differences between full-scale and abbreviated IQ estimates (i.e., FSIQ minus ABIQ) varied by age, subtest scatter, or FSIQ. Linear and quadratic effects for age, subtest scatter, and FSIQ were examined and models with the strongest fit were implemented (see Results). When linear and quadratic effects provided models with similar fit, the more parsimonious linear effect was examined. Participant/subject was the random effect in all models to account for participants who were tested at multiple visits.

Given the unique cognitive profiles seen across males and females with FXS (Freund & Reiss, 1991; Huddleston et al., 2014), we also report whether the above associations were different between males and females (i.e., age × sex; subtest scatter × sex; FSIQ × sex) in our Supplementary Materials. Identical models in our Supplementary Materials also were used to examine the extent to which Verbal Knowledge and Nonverbal Fluid Reasoning routing subtests accurately estimated VIQ and NVIQ, respectively, and whether these estimates varied as a function of age and sex.

Changepoint detection was implemented to determine the age, degree of subtest scatter, and FSIQ score at which the mean difference between FSIQ and ABIQ estimates significantly changed. At Most One Change (AMOC; i.e., one changepoint) was allowed based on visual inspection of the linear/quadratic fit (see Figs. 1, 2 and 3) and to maximize interpretability and clinical utility.
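As a rough illustration of a single change in mean, the sketch below orders observations by a predictor and searches for the one split that minimizes the within-segment sum of squared deviations. This is a simplification under stated assumptions: the changepoint R package used in this study additionally applies a penalized test to decide whether any change is present, which is omitted here, and the data are synthetic rather than study values.

```python
def amoc_mean_changepoint(x, y, min_segment=2):
    """Find the single split (at most one change) in the mean of y, ordered by x.

    Returns (threshold, mean_before, mean_after), where threshold is the
    midpoint between the two x values on either side of the best split.
    """
    pairs = sorted(zip(x, y))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    n = len(ys)
    if n < 2 * min_segment:
        raise ValueError("not enough observations for two segments")

    # Prefix sums allow each segment's squared error to be computed in O(1).
    s, s2 = [0.0], [0.0]
    for v in ys:
        s.append(s[-1] + v)
        s2.append(s2[-1] + v * v)

    def sse(i, j):
        """Sum of squared deviations from the mean for ys[i:j]."""
        total = s[j] - s[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    best_cost, best_k = None, None
    for k in range(min_segment, n - min_segment + 1):
        cost = sse(0, k) + sse(k, n)
        if best_cost is None or cost < best_cost:
            best_cost, best_k = cost, k

    threshold = (xs[best_k - 1] + xs[best_k]) / 2
    return threshold, s[best_k] / best_k, (s[n] - s[best_k]) / (n - best_k)


# Synthetic example: FSIQ minus ABIQ differences by age (not study data).
ages = [5, 6, 7, 8, 9, 10, 11, 13, 15, 18, 22, 30]
diffs = [-4, -6, -3, -5, -2, -4, -3, 8, 10, 7, 11, 9]
print(amoc_mean_changepoint(ages, diffs))
```

In the synthetic example the mean difference is negative before the split (ABIQ above FSIQ) and positive after it, the same qualitative pattern the analysis is designed to detect.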

Fig. 1 — FSIQ vs. ABIQ across development. Accuracy of abbreviated IQ (ABIQ) in estimating full-scale IQ (FSIQ) across development. The horizontal dashed line reflects an ABIQ score that perfectly predicts FSIQ (difference of 0). The vertical dashed line reflects the age (11.5 years) at which a changepoint analysis suggests ABIQ shifts from over- to under-estimating FSIQ. The dark green line reflects the quadratic trend line fit independent of sex

Fig. 2 — FSIQ vs. ABIQ as a function of subtest scatter. Accuracy of abbreviated IQ (ABIQ) in estimating full-scale IQ (FSIQ) as a function of routing subtest scatter. The horizontal dashed line reflects an ABIQ score that perfectly predicts FSIQ (difference of 0). The vertical dashed line reflects the degree of subtest scatter (5.5 points) at which a changepoint analysis suggests ABIQ shifts from over- to under-estimating FSIQ. The dark green line reflects the linear trend line fit independent of sex

Fig. 3 — FSIQ vs. ABIQ as a function of FSIQ. Accuracy of abbreviated IQ (ABIQ) in estimating full-scale IQ (FSIQ) as a function of FSIQ (our proxy for ID severity). The horizontal dashed line reflects an ABIQ score that perfectly predicts FSIQ (difference of 0). The vertical dashed line reflects the FSIQ score (68) below which a changepoint analysis suggests ABIQ begins to underestimate FSIQ. The dark green line reflects the linear trend line fit independent of sex

All statistical analyses were conducted using R version 4.3.1 (2023). Mixed effects models used the lme4 R package (Bates et al., 2015). Changepoint detection used the changepoint R package (Killick et al., 2022). All data were visualized using the ggplot2 R package (Wickham, 2016).

Results

Stability of ABIQ and FSIQ

Sixty-one (61) participants provided data across multiple timepoints representing 144 total repeated visits. The number of repeated visits ranged from 2 to 4 (M = 2.5 visits, SD = 0.6 visits). The average time between repeated visits ranged from 265 to 2856 days (M = 698.6 days, SD = 435.0 days). Change in ABIQ between subsequent visits ranged from 0 to 48.9 (M = 9.5, SD = 8.8). Change in FSIQ between subsequent visits ranged from 0 to 33.0 (M = 7.3, SD = 6.7).

Does ABIQ Accurately Estimate FSIQ in Fragile X Syndrome?

ABIQ was within 1 SD (15 standard score points) of FSIQ for 77.2% of all timepoints (N = 217). We first examined whether the accuracy of ABIQ in estimating FSIQ varied as a function of age. A model including a quadratic age effect provided a better fit (AIC = 1992.4, BIC = 2010.6, log-likelihood = −991.18, marginal R² = 0.263, conditional R² = 0.693, adjusted ICC = 0.584) than a model including a linear age effect (AIC = 2026.8, BIC = 2041.4, log-likelihood = −1009.42, marginal R² = 0.378, conditional R² = 0.707, adjusted ICC = 0.529; χ²(1) = 36.474, p < .001). The degree to which ABIQ accurately estimated FSIQ changed non-linearly with age (F(1, 221.71) = 38.998, p < .001; Fig. 1). Changepoint detection indicated that ABIQ over-estimated FSIQ in participants younger than 11.5 years of age (vertical dashed line in Fig. 1), after which ABIQ under-estimated FSIQ by an average of 9.2 standard score points (SD = 9.1, range = −15.5 to 35.4).
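The model-comparison statistics for the age models can be related to one another with a short worked check. The sketch assumes the standard definitions AIC = 2k − 2·logLik and BIC = k·ln(n) − 2·logLik, with n = 281 observations (visits) and k = 4 parameters for the linear model and k = 5 for the quadratic model (fixed effects plus a random-intercept variance and a residual variance). The parameter counts are an assumption, not stated in the text, although they reproduce the reported AIC and BIC values to rounding.

```python
import math


def aic(loglik, k):
    """Akaike information criterion."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion with n observations."""
    return k * math.log(n) - 2 * loglik


def lrt_df1(loglik_reduced, loglik_full):
    """Likelihood ratio test for one added parameter (chi-square, df = 1)."""
    stat = 2 * (loglik_full - loglik_reduced)
    p = math.erfc(math.sqrt(stat / 2))  # chi-square survival function, df = 1
    return stat, p


N_OBS = 281                           # total visits reported in Participants
LL_LINEAR, K_LINEAR = -1009.42, 4     # k is an assumption (see lead-in)
LL_QUAD, K_QUAD = -991.18, 5          # k is an assumption (see lead-in)

print("linear   :", round(aic(LL_LINEAR, K_LINEAR), 1), round(bic(LL_LINEAR, K_LINEAR, N_OBS), 1))
print("quadratic:", round(aic(LL_QUAD, K_QUAD), 1), round(bic(LL_QUAD, K_QUAD, N_OBS), 1))
print("LRT      :", lrt_df1(LL_LINEAR, LL_QUAD))
```

The likelihood ratio statistic computed from the rounded log-likelihoods (36.48) differs slightly from the reported 36.474 because the published log-likelihoods are rounded to two decimals.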

We also examined whether the accuracy of ABIQ in estimating FSIQ varied as a function of routing subtest scatter. A model including a linear subtest scatter effect provided a similar fit (AIC = 2042.7, BIC = 2057.3, log-likelihood = −1017.4, marginal R² = 0.179, conditional R² = 0.663, adjusted ICC = 0.589) to a model including a quadratic subtest scatter effect (AIC = 2044.7, BIC = 2062.9, log-likelihood = −1017.4, marginal R² = 0.179, conditional R² = 0.663, adjusted ICC = 0.589; χ²(1) < 0.001, p = .983). The more parsimonious model including a linear fit was selected. The degree to which ABIQ accurately estimated FSIQ changed linearly with subtest scatter (F(1, 278.079) = 61.274, p < .001; Fig. 2). The greater the absolute scatter between the verbal and nonverbal routing subtests, the more ABIQ tended to under-estimate FSIQ. Changepoint detection indicated that ABIQ began to consistently under-estimate FSIQ by an average of 12.5 standard score points (SD = 9.7, range = −21.7 to 27.9) after the difference between the routing subtests was greater than 5.5 scaled score points. Subsequent visual inspection of the raw scatter between subtests (i.e., non-absolute value) indicated that this under-estimate occurred mostly in participants whose verbal routing performance was greater than their nonverbal routing performance.

We also examined whether the accuracy of ABIQ in estimating FSIQ varied as a function of FSIQ (i.e., differences varying as a function of ID severity). A model including a linear FSIQ effect provided a similar fit (AIC = 2090.4, BIC = 2105.0, log-likelihood = −1041.2, marginal R² = 0.027, conditional R² = 0.674, adjusted ICC = 0.665) to a model including a quadratic FSIQ effect (AIC = 2090.8, BIC = 2108.9, log-likelihood = −1040.4, marginal R² = 0.034, conditional R² = 0.661, adjusted ICC = 0.650; χ²(1) = 1.658, p = .198). The more parsimonious model including a linear fit was selected. The degree to which ABIQ accurately estimated FSIQ changed linearly with FSIQ (F(1, 241.082) = 6.521, p = .011; Fig. 3). ABIQ tended to under-estimate FSIQ more in participants with lower FSIQ. Changepoint detection indicated that this under-estimate was more consistent in those with FSIQ < 68. In participants with FSIQ < 68, this under-estimate was an average of 6.3 standard score points (SD = 11.6, range = −23.1 to 35.4).

Discussion

To inform clinical decision-making when administering the SB-5 to individuals with FXS (i.e., determining a full vs. abbreviated battery), we examined differences between FSIQ and ABIQ in nearly 200 participants with FXS. We also determined whether characteristics like age, scatter between routing subtests, and ID severity could inform an administrator’s decision to use ABIQ or administer the full SB-5 to calculate FSIQ. We have two primary findings from this analysis. First, ABIQ often over-estimates true abilities (FSIQ) before age 11 in individuals with FXS, after which ABIQ almost universally under-estimates FSIQ, suggesting over- or under-estimation varies across development, likely as skills emerge that are only captured by a full administration (Fig. 1). Second, the degree to which ABIQ under-estimates FSIQ worsens as the split between verbal and nonverbal skills grows and in individuals with more severe ID, providing simple indicators during routing subtest administration to determine if a full SB-5 battery is needed (Figs. 2 and 3). Based on these findings, we make three recommendations for clinicians and clinical researchers.

Recommendation 1: Administer the Full SB-5 in FXS When Feasible

As the developers of the SB-5 acknowledge, ABIQ is only an estimate of FSIQ and errors in its estimation are expected (Roid, 2003). FSIQ itself is only an estimate of true cognitive abilities, although we treat it as “ground truth” for the purposes of discussion here. However, we demonstrate that errors in the estimation of FSIQ by ABIQ likely vary across development in individuals with FXS. Our finding that the SB-5 ABIQ score over-estimates FSIQ in school-age children with FXS is consistent with findings in autistic youth (Twomey et al., 2018). When conducting evaluations, we therefore recommend a more conservative approach: a full administration may avoid over-estimating abilities in children and under-estimating abilities in adults.

The issue of age-related under-estimation and over-estimation is amplified in males relative to females (see Supplementary Materials). At the lower end of the performance distribution, small raw score differences translate into disproportionately large differences in standard scores. In males, who have more significantly affected cognitive skills and who measure at the lower end of the performance distribution (Schmitt et al., 2024), small differences in raw performance thus translate to large differences in estimated abilities. This effect is demonstrated in our finding that the under-estimation of skills by ABIQ is more drastic in individuals with lower FSIQ (Fig. 3).

Differences between the verbal and nonverbal routing subtests’ prediction of VIQ and NVIQ, described in our Supplementary Materials, clarify what is likely driving the under-estimation of skills by ABIQ in adolescents and adults with FXS. The accuracy of the Verbal Knowledge routing subtest in predicting VIQ is more consistent across development, whereas the Nonverbal Fluid Reasoning subtest begins to under-estimate NVIQ in adolescence. This suggests there are nonverbal skills not captured by this lower ABIQ score in adolescents and adults that are otherwise captured by FSIQ. As Fluid Reasoning is already reflected in the ABIQ score, this suggests performance on the other domains (Nonverbal Knowledge, Quantitative Reasoning, Visual-Spatial Processing, or Working Memory) may be stronger.

Recommendation 2: When a Full Administration is Not Possible, Take Precautions When Relying on the Abbreviated SB-5 in FXS

There may be times when clinicians or researchers only have ABIQ available for analysis (e.g., a full administration is terminated early; retrospective studies). If presented with these issues, we encourage users to recognize the limitations of the ABIQ estimate and take appropriate precautions when interpreting the data. Within the context of a research study, this limitation should be explicitly reported. Ultimately, this underscores the importance of evidence-based psychological assessment that bases recommendations on the results of a comprehensive battery without relying on an IQ score as the sole data point (Wright et al., 2022). When other data points (e.g., adaptive behaviors, measures of co-occurring symptomatology) are available for clinical characterization, this other data may be weighed more heavily when interpreting research findings or making clinical recommendations.

Recommendation 3: Researchers Can Use Scatter Between the Routing Subtests or Estimates of ID Severity During Testing to Determine if a Full SB-5 Battery is Necessary

Our finding that this ABIQ under-estimate is worse when routing subtest scatter is high and in individuals with more severe ID gives administrators a concrete decision point after completion of the routing subtests. If the difference between the nonverbal and verbal routing subtests is less than 5.5 scaled score points and if an individual appears to have an IQ > 68, ABIQ often adequately estimates FSIQ. However, under- or over-estimation of FSIQ by ABIQ may remain due to age-related concerns described above, and ABIQ should still be interpreted with caution.
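The decision point described above can be written as a minimal sketch. The three cutoffs are the changepoints reported in the Results. The function name, the returned fields, and the use of a single "estimated IQ" input (for example, the administrator's working impression or the ABIQ itself) are assumptions made for illustration. The sketch is not a validated clinical tool and does not replace clinical judgment.

```python
SCATTER_CUTOFF = 5.5  # scaled score points (changepoint reported in Results)
IQ_CUTOFF = 68        # standard score points (changepoint reported in Results)
AGE_CUTOFF = 11.5     # years (changepoint reported in Results)


def screen_abiq(age_years, verbal_knowledge, nonverbal_fluid_reasoning, estimated_iq):
    """Summarize cues for whether the full SB-5 battery should be considered.

    Inputs are the two routing subtest scaled scores and an estimated IQ
    (an assumption: the source says only that the individual 'appears to
    have an IQ > 68').
    """
    scatter = abs(verbal_knowledge - nonverbal_fluid_reasoning)
    # ABIQ is described as often adequate only when both conditions hold.
    often_adequate = scatter < SCATTER_CUTOFF and estimated_iq > IQ_CUTOFF
    # Age-related error may remain even when ABIQ is otherwise adequate.
    direction = "over-estimate" if age_years < AGE_CUTOFF else "under-estimate"
    return {
        "scatter": scatter,
        "consider_full_battery": not often_adequate,
        "likely_age_related_error": direction,
    }


# Illustrative inputs only (not study data).
print(screen_abiq(25, 8, 1, 55))
print(screen_abiq(8, 9, 7, 85))
```

Keeping the age-related direction as a separate field reflects the caveat in the text: low scatter and a higher IQ do not remove the developmental pattern of over- or under-estimation.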

Limitations and Future Directions

Several limitations of our study inform directions for future research. First, many of our participants performed at the floor of the SB-5 (50% of males, 7% of females). This precluded our ability to conduct our analysis using the original SB-5 scoring method because there is a 7-point difference between the lowest possible FSIQ (40) and ABIQ scores (47) that would create an artificial “over-estimate” of 7 points in these participants. This would minimize individual differences in raw performance that are otherwise captured by the deviation IQ scoring method (Hessl et al., 2009; Sansone et al., 2014). We are interested in similar studies using the original scoring method in a sample with a greater number of participants with FXS performing above the SB-5 floor. Second, this study was conducted on a predominantly white and non-Hispanic sample, consistent with studies finding race- and ethnic-based disparities in FXS service access (Crawford et al., 2002; Kidd et al., 2017). The degree to which these findings generalize to a more representative sample is unknown. Last, IQ scores are only one broadband metric of cognitive abilities, and our paper frames FSIQ as the truest available estimate of cognition for the sake of interpretability. It is imperative to recognize that IQ tests and scores are imprecise, reductionistic, and historically rooted in eugenics and other discriminatory practices (Au, 2013). Despite these issues, IQ testing remains a foundational, but imperfect, component of neurodevelopmental assessments and informs decision-making for families and providers.

Conclusion

Based on findings from a sample of nearly 200 participants with FXS, we demonstrate errors in the estimation of FSIQ by ABIQ when using the SB-5 in FXS. The SB-5 remains a high-quality instrument, but ABIQ estimates should be interpreted with appropriate precautions in FXS given these findings. We generally recommend that administrators aim to complete the full SB-5 when possible. When this is not possible, administrators should acknowledge potential sources of error in ABIQ estimates of intelligence. Our findings highlight the complexity of assessing uneven cognitive profiles in individuals with neurodevelopmental disabilities such as FXS and the care that must be taken when interpreting testing results.

Supplementary Material

Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/s10803-025-07062-w.

Acknowledgments

The authors wish to thank the participating patients and families seen in the Cincinnati Fragile X Research and Treatment Center, without whom this work would not be possible. The authors are grateful for the genetic testing services provided by the Molecular Diagnostic Laboratory at Rush University (PI: Dr. Elizabeth Berry-Kravis). This work was supported by the National Institutes of Health (CAE, grant numbers U54HD082008, U54HD104461), (LMS, grant number K23HD101416).

Footnotes

Declarations

Conflict of interest: The authors have no conflicts of interest to disclose.

References

Abbeduto L, Klusek J, Taylor JL, Abdelnur N, Sparapani N, & Thurman AJ (2021). Concurrent associations between expressive Language ability and independence in adolescents and adults with fragile X syndrome. Brain Science, 10.3390/brainsci11091179 [DOI] [PMC free article] [PubMed] [Google Scholar]
Au W (2013). Hiding behind high-stakes testing: Meritocracy, objectivity and inequality in U.S. Education. International Education Journal: Comparative Perspectives, 12(2), 7–19. [Google Scholar]
Bartholomay KL, Lee CH, Bruno JL, Lightbody AA, & Reiss AL (2019). Closing the gender gap in fragile X syndrome: Review on females with FXS and preliminary research findings. Brain Science, 9(1). 10.3390/brainsci9010011 [DOI] [PMC free article] [PubMed] [Google Scholar]
Bates D, Machler M, Bolker B, & Walker S (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. 10.18637/jss.v067.i01 [DOI] [Google Scholar]
Boggs AE, Schmitt LM, McLane RD, Adayev T, LaFauci G, Horn PS, Dominick KC, Gross C, & Erickson CA (2022). Optimization, validation and initial clinical implications of a Luminex-based immunoassay for the quantification of fragile X protein from dried blood spots. Scientific Reports, 12(1), 5617. 10.1038/s41598-022-09633-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
Crawford DC, Meadows KL, Newman JL, Taft LF, Scott E, Leslie M, Shubek L, Holmgreen P, Yeargin-Allsopp M, Boyle C, & Sherman SL (2002). Prevalence of the fragile X syndrome in African-Americans. American Journal of Medical Genetics, 110(3), 226–233. 10.1002/ajmg.10427 [DOI] [PubMed] [Google Scholar]
Freund LS, & Reiss AL (1991). Cognitive profiles associated with the fra(X) syndrome in males and females. American Journal of Medical Genetics, 38(4), 542–547. 10.1002/ajmg.1320380409 [DOI] [PubMed] [Google Scholar]
Greenspan S, Harris JC, & Woods GW (2015). Intellectual disability is a condition, not a number: Ethics of IQ cut-offs in psychiatry, human services and law. Ethics, Medicine and Public Health, 1(3), 312–324. doi:10.1016/j.jemep.2015.07.004
Hessl D, Nguyen DV, Green C, Chavez A, Tassone F, Hagerman RJ, Senturk D, Schneider A, Lightbody A, Reiss AL, & Hall S (2009). A solution to limitations of cognitive testing in children with intellectual disabilities: The case of fragile X syndrome. Journal of Neurodevelopmental Disorders, 1(1), 33–45. doi:10.1007/s11689-008-9001-8
Huddleston LB, Visootsak J, & Sherman SL (2014). Cognitive aspects of fragile X syndrome. Wiley Interdisciplinary Reviews: Cognitive Science, 5(4), 501–508. doi:10.1002/wcs.1296
Hunter J, Rivero-Arias O, Angelov A, Kim E, Fotheringham I, & Leal J (2014). Epidemiology of fragile X syndrome: A systematic review and meta-analysis. American Journal of Medical Genetics Part A, 164A(7), 1648–1658. doi:10.1002/ajmg.a.36511
Kidd SA, Raspa M, Clark R, Usrey-Roos H, Wheeler AC, Liu JA, Wylie A, & Sherman SL (2017). Attendance at fragile X specialty clinics: Facilitators and barriers. American Journal on Intellectual and Developmental Disabilities, 122(6), 457–475. doi:10.1352/1944-7558-122.6.457
Killick R, Haynes K, Eckley I, Fearnhead P, Long R, & Lee J (2022). Methods for changepoint detection. https://github.com/rkillick/changepoint/
Kirchgessner CU, Warren ST, & Willard HF (1995). X inactivation of the FMR1 fragile X mental retardation gene. Journal of Medical Genetics, 32(12), 925–929. doi:10.1136/jmg.32.12.925
Norris JE, DeStefano LA, Schmitt LM, Pedapati EV, Erickson CA, Sweeney JA, & Ethridge LE (2022). Hemispheric utilization of alpha oscillatory dynamics as a unique biomarker of neural compensation in females with fragile X syndrome. ACS Chemical Neuroscience, 13(23), 3389–3402. doi:10.1021/acschemneuro.2c00404
Pedapati EV, Schmitt LM, Ethridge LE, Miyakoshi M, Sweeney JA, Liu R, Smith E, Shaffer RC, Dominick KC, Gilbert DL, Wu SW, Horn PS, Binder DK, Lamy M, Axford M, & Erickson CA (2022). Neocortical localization and thalamocortical modulation of neuronal hyperexcitability contribute to fragile X syndrome. Communications Biology, 5(1), 442. doi:10.1038/s42003-022-03395-9
Roid GH (2003). Stanford-Binet Intelligence Scales: Examiner’s manual (5th ed.). Riverside Publishing.
Sansone SM, Schneider A, Bickel E, Berry-Kravis E, Prescott C, & Hessl D (2014). Improving IQ measurement in intellectual disabilities using true deviation from population norms. Journal of Neurodevelopmental Disorders, 6(1), 16. doi:10.1186/1866-1955-6-16
Schmitt LM, Nelson M, Shaffer RC, & Erickson CA (2024). A near normal distribution of IQ in fragile X syndrome. Scientific Reports. doi:10.1038/s41598-024-73626-y
Shaffer RC, Schmitt L, Thurman AJ, Abbeduto L, Hong M, Pedapati E, Dominick K, Sweeney J, & Erickson C (2020). The relationship between expressive language sampling and clinical measures in fragile X syndrome and typical development. Brain Sciences. doi:10.3390/brainsci10020066
Silverman W, Miezejeski C, Ryan R, Zigman W, Krinsky-McHale S, & Urv T (2010). Stanford-Binet & WAIS IQ differences and their implications for adults with intellectual disability (aka mental retardation). Intelligence, 38(2), 242–248. doi:10.1016/j.intell.2009.12.005
Stephenson KG, Levine A, Russell NCC, Horack J, & Butter EM (2023). Measuring intelligence in autism and ADHD: Measurement invariance of the Stanford-Binet 5th edition and impact of subtest scatter on abbreviated IQ accuracy. Autism Research, 16(12), 2350–2363. doi:10.1002/aur.3034
Straub D, Schmitt LM, Boggs AE, Horn PS, Dominick KC, Gross C, & Erickson CA (2023). A sensitive and reproducible qRT-PCR assay detects physiological relevant trace levels of FMR1 mRNA in individuals with fragile X syndrome. Scientific Reports, 13(1), 3808. doi:10.1038/s41598-023-29786-4
R Core Team (2023). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/
Twomey C, O’Connell H, Lillis M, Tarpey SL, & O’Reilly G (2018). Utility of an abbreviated version of the Stanford-Binet Intelligence Scales (5th ed.) in estimating ‘full scale’ IQ for young children with autism spectrum disorder. Autism Research, 11(3), 503–508. doi:10.1002/aur.1911
Wechsler D (2011). Wechsler Abbreviated Scale of Intelligence (WASI-II) (2nd ed.). NCS Pearson.
Wickham H (2016). ggplot2: Elegant graphics for data analysis. Springer-Verlag. https://ggplot2.tidyverse.org
Wright AJ, Pade H, Gottfried ED, Arbisi PA, McCord DM, & Wygant DB (2022). Evidence-based clinical psychological assessment (EBCPA): Review of current state of the literature and best practices. Professional Psychology: Research and Practice, 53(4), 372–386. doi:10.1037/pro0000447


