Journal of Speech, Language, and Hearing Research : JSLHR. 2023 Aug 4;66(9):3486–3499. doi: 10.1044/2023_JSLHR-22-00573

Bilingual Vocabulary Assessment: Examining Single-Language, Conceptual, and Total Scoring Approaches

Lisa Fitton a, J Marc Goodrich b, Lauren Thayer b, Amy Pratt c, Rose Luna a
PMCID: PMC10558146  PMID: 37541317

Abstract

Purpose:

This study explored approaches for measuring vocabulary among bilingual children with varying levels of proficiency in Spanish and English.

Method:

One hundred fifteen kindergarten and first-grade Spanish–English-speaking children completed measures of vocabulary and sentence repetition in Spanish and English. Scores were derived from their responses to the vocabulary measure: Spanish-only vocabulary, English-only vocabulary, conceptual vocabulary, and total vocabulary. Best language sentence repetition was also obtained. Using both visualization of data and statistical analysis, we tested for potential associations between children's relative language skills in Spanish and English and the scores they received on each of the vocabulary metrics.

Results:

Participants' single-language vocabulary scores were linearly associated with their relative language scores. Higher relative Spanish language skills corresponded with higher Spanish-only vocabulary scores, and higher English language skills corresponded with higher English-only vocabulary scores. A quadratic association between children's relative language and their conceptual vocabulary scores was observed. Children with more balanced skills in Spanish and English received lower scores for conceptual vocabulary. No association between total vocabulary and relative language was observed.

Conclusions:

Results revealed evidence of differential test bias for single-language vocabulary scores and conceptual vocabulary scores. Spanish-only vocabulary underestimated knowledge of participants with higher English proficiency, whereas English-only vocabulary underestimated knowledge of participants with higher Spanish proficiency. Conceptual scoring yielded lower values for participants with relatively balanced proficiency in Spanish and English. There is need for further consideration of score and test functioning across the full continuum of bilinguals with dynamic proficiencies in each of their languages.

Supplemental Material:

https://doi.org/10.23641/asha.23796330


To gain an accurate picture of a bilingual child's overall language ability, assessment and evaluation should account for all the languages that the child speaks (Bedore & Peña, 2008; Pieretti & Roseberry-McKibbin, 2016). Failure to measure children's skills across languages can lead to both under- and overestimation of bilinguals' language ability (Samson & Lesaux, 2009; Sullivan, 2011). These inaccuracies can, in turn, lead to practical consequences such as misunderstanding of child progress; flawed inferences in research; and, in the context of diagnostic assessment, disproportionate referral to special education and other support services (Morgan et al., 2015; Sullivan, 2011). There are few assessments that are standardized for bilingual children, and it is inappropriate to draw conclusions about a bilingual child's overall language proficiency based on scores from a monolingual assessment (Bedore et al., 2012; Guzman-Orth et al., 2017). To have an accurate measure of total language ability, assessment must simultaneously consider multiple aspects of oral language in all the child's languages.

Factors unique to bilingualism, such as the relative level of proficiency in each language, must be considered during assessment. Although relative language proficiency is typically conceptualized as a dichotomy (i.e., dominant vs. balanced; Bitetti et al., 2020), language balance can be more accurately conceptualized as a continuum, with each bilingual child falling somewhere along it rather than categorized into two groups. Indeed, evidence indicates that the nature of language balance shifts over time and may vary across different areas of language for many bilingual children (Côté et al., 2022; Oppenheim et al., 2020; Place & Hoff, 2011). Given that language assessments are critical to measuring students' knowledge and growth over time, with implications for instructional planning and intervention, research is needed to understand the degree to which scores derived from language assessments are impacted by factors such as relative proficiency in the two languages.

Common Bilingual Scoring Approaches

Bilinguals' language knowledge is distributed across their two languages. This is particularly evident in the domain of vocabulary, as a bilingual may know some words in their first language (L1; Spanish for this study), some words in their second language (L2; English for this study), and some words in both. Theoretical accounts of bilinguals' distributed language knowledge (e.g., Grosjean, 1998, 2008) explain word-learning differences as a natural consequence of bilinguals using each of their languages for different purposes and with different people. A central theme of bilingualism research is understanding how to quantify this knowledge (Arias & Friberg, 2017; Bedore & Peña, 2008; Kohnert, 2010). The extent to which common vocabulary assessments reflect the full repertoire of bilinguals' language knowledge varies with respect to how the assessment accounts for children's performance across languages.

Single-Language Scores

Single-language scoring is used when a bilingual is tested in only one of their two languages. A single-language score may underestimate bilinguals' knowledge if the assessor assumes that the child's language skills in one language represent their total language knowledge (Anaya et al., 2018; Bedore & Peña, 2008). It can also lead to overestimation of bilinguals' knowledge if the assessor assumes that the child's skills in the untested language are stronger than they actually are (Goodrich et al., 2022). For example, many bilingual children in the United States enter kindergarten with low proficiency in English, due to limited exposure. However, without assessment of the home language, it cannot be assumed that vocabulary knowledge in the home language is typical even if data indicate the child has been consistently exposed to that language from birth.

Total Scores

Another approach calls for summing children's vocabulary in each language. This strategy may yield accurate estimates of bilinguals' overall vocabulary knowledge prior to the age of 3 years (Core et al., 2013; Pearson et al., 1993). A total language scoring approach is intuitive in that it awards credit based on total skills across both languages, although there are no adjustments for the degree of overlap. To our knowledge, there are currently no normative assessments based on total scores. This makes total scoring somewhat impractical for diagnostic evaluation. However, the total language scoring approach may be valuable for considering bilingual learners holistically and may be useful in monitoring progress over time, although current evidence in this area is limited (Oh & Mancilla-Martinez, 2021).

Conceptual Scoring

Similar to total scores, conceptual scoring incorporates children's responses in both languages to vocabulary items. Conceptual scoring accounts for the number of named concepts a child has in any language. This approach is often used with bilingual assessments that have translated forms in each language, which allow for direct matching of concepts that might be shared across languages (e.g., the Expressive One-Word Picture Vocabulary Test: Spanish-Bilingual Edition [EOWPVT: SBE; Martin, 2013]). Tests that utilize conceptual scoring allow the child to respond in the other language after an incorrect response in the target language. For example, if an expressive vocabulary target word is “book” and the bilingual child produces an incorrect response (e.g., “paper”), the administrator would then ask the child to name the picture in Spanish. The child receives credit for each concept known rather than for a specific response in each language. Conceptual scoring has also been applied post hoc when assessments in English and Spanish are parallel in format but not direct translations of each other. The MacArthur Communicative Development Inventories (CDI; Fenson et al., 1994) and its Spanish counterpart, the MacArthur Inventario del Desarrollo de Habilidades Comunicativas (IDHC; Jackson-Maldonado et al., 1992), were originally designed to be administered with English or Spanish monolinguals. Marchman and Martínez-Sussmann (2002) matched items across the vocabulary checklists of the CDI and the IDHC that reflected the same general concept (e.g., dog = perro) to allow for conceptual scoring with bilingual children. Conceptual scoring has been shown to better capture the breadth of bilinguals' vocabulary knowledge and to reduce the differences in scores between monolingual and bilingual children in studies that used the EOWPVT: SBE (Anaya et al., 2018) as well as the CDI and the IDHC (Core et al., 2013).

However, conceptual scoring also has several key limitations that warrant further investigation. First, for assessments that allow for a response in the nontarget language following an incorrect response in the target language, there is evidence of switching costs associated with the suppression and activation of a response in the nontarget language (Gross et al., 2014). This could lead to reduced scores for children who have greater proficiency in the nontarget language compared to the target language. The Expressive One-Word Picture Vocabulary Test–Fourth Edition: Spanish-Bilingual Edition (EOWPVT-4: SBE) standardized protocol addresses this directly through the recommendation to initiate testing in the child's preferred language. However, a pragmatic but potentially limiting adaptation is allowed when the administrator has low proficiency in the child's most preferred language. In this case, administrators may test first in their preferred language, rather than the child's. Furthermore, identifying the most preferred or stronger language may be less effective for limiting the impact of suppression when the child has relatively balanced skills across languages. There is evidence that administering all items separately in each language may yield the most valid scores for Spanish–English bilingual learners in the United States (Anaya et al., 2018; Gross et al., 2014).

Relative Language Knowledge

Even when two languages are acquired simultaneously from birth, bilinguals are often found to demonstrate more advanced knowledge in one language (Silva-Corvalán & Treffers-Daller, 2015). Relative proficiency in one language compared to another is commonly referred to as dominance or differential knowledge (Anaya et al., 2016; Silva-Corvalán & Treffers-Daller, 2015). Language dominance may vary across domains, such that a Spanish–English bilingual learner may demonstrate greater morphosyntactic development in Spanish compared to English but greater semantic skills in English compared to Spanish, or vice versa (Bedore et al., 2012). Relative proficiency in each language can also shift over time. In the United States, children who speak languages other than English at home often enter school with more advanced proficiency in their home language compared to English. Depending on the linguistic environment of their school (i.e., English-only or a form of bilingual education), children's proficiencies tend to shift to align more closely with their linguistic environment (Castilla-Earls et al., 2019; Ronderos et al., 2022).

In the context of bilingual assessment, it is critical to consider whether measurement techniques function consistently for bilingual learners with a range of linguistic backgrounds, relative proficiency levels, and ability levels. If a test consistently over- or underrepresents the skills of individuals based on a characteristic or characteristics unrelated to the attribute the test aims to measure, this is referred to as differential test functioning (Nugent, 2017). Differential test functioning is problematic because it results in a misrepresentation of individuals' skills, which can, in turn, lead to invalid inferences and inappropriate educational decision making.

This Study

The purpose of this work is to evaluate sources of potential variation in bilingual vocabulary scores and to investigate whether scoring approaches function differently for bilingual children with varying levels of proficiency in Spanish and English. Specifically, we aim to assess whether children's vocabulary scores vary depending on their relative Spanish and English proficiencies (i.e., the degree to which children's proficiency is balanced across their two languages). This work will provide insight into the use and limitations of specific metrics to measure the dual-language vocabulary skills of bilingual learners. We addressed the following research questions:

  1. What is the nature of bilinguals' single-language, total, and conceptual vocabulary scores in English and Spanish?

  2. Are there patterns between students' relative language knowledge across Spanish and English and their single-language, total, or conceptual vocabulary scores?

For the first research question, we hypothesized that all scores would yield normal distributions. However, we anticipated that the mean for single-language standardized scores would be below the normative mean for the EOWPVT-4: SBE, given that single-language scores reflect only part of bilingual learners' vocabulary knowledge. We expected the mean for conceptual scores to be at the normative means for each scale, as this metric is based on bilingual learners' skills across both their languages and aligns with the normative scoring guidelines. Finally, although standardized scores were not obtained for total vocabulary, we expected the mean for total vocabulary raw scores to be higher than the raw scores obtained for single-language and conceptual vocabulary. This final expectation was assumed because of how the total vocabulary score is computed, as the sum of single-language vocabulary.

For the second research question, we hypothesized that learners with more balanced language skills may systematically receive lower scores on conceptual vocabulary compared to bilinguals with less balanced skills across their languages. Our rationale for this hypothesis is that conceptual scoring provides equivalent credit for knowing a concept in both languages compared to just one. Children with limited overlap in their vocabulary knowledge in Spanish and English would correspondingly receive higher scores compared to those who have substantial overlap in their vocabulary across languages.

Method

Participants

Children who participated in this study were included in a larger research project examining relations between bilingual children's oral language and reading skills (Goodrich et al., 2022; Wofford et al., 2022). As a multisite project, this study was approved by the institutional review boards of the University of South Carolina and the University of Nebraska–Lincoln. One hundred fifteen kindergarten and first-grade Spanish–English bilingual students participated in this study. To ensure a heterogeneous sample of bilingual students (rather than only recruiting students identified as English learners; Guzman-Orth et al., 2017), any child who spoke Spanish at home and was reported to be exposed to both Spanish and English regularly was included in this study. The majority (n = 76; 66.1%) were recruited from participating school sites in South Carolina, and the remaining participants were recruited from participating sites in Nebraska (n = 39; 33.9%). All children were enrolled in English-only education programs. Approximately 73% of participants were kindergartners (n = 84), and 27% were in first grade (n = 31). The average age of children in this sample was 6.29 years (SD = 0.68).

Measures

Vocabulary in Spanish and English

To evaluate expressive vocabulary skills, children completed the EOWPVT-4: SBE (Martin, 2013). For this assessment, children are shown pictures/line drawings of various objects and actions and asked to name them (e.g., “What is this? What are they doing?”). All items are scored as correct or incorrect. The standardized approach for administering this assessment requires the tester to determine the child's preferred language and use that as the primary language of test administration. If a child responds to an item incorrectly in the primary language or indicates they do not know that word, they are given an opportunity to respond in the other language. This standardized assessment procedure results in an overall score for children's conceptual vocabulary knowledge across their two languages (Martin, 2013). However, for this study, we were interested in obtaining an overall estimate of children's vocabulary knowledge as well as specific estimates of vocabulary knowledge in Spanish and English. Consequently, we administered the EOWPVT-4: SBE separately in Spanish and English on separate days, with the order of language of administration randomized by child. We then derived total and conceptual vocabulary scores based on children's responses. This approach is suggested to result in greater accuracy for measuring vocabulary knowledge among bilingual children (Anaya et al., 2018; Gross et al., 2014) and is commonly used in research to quantify bilingual children's vocabulary knowledge in each language (Ribot et al., 2018; Zucker et al., 2021). Internal consistency reliability for the EOWPVT-4: SBE is high for kindergarten and first-grade children (α = .95).

Sentence Repetition in Spanish and English

To obtain a more holistic measure of language ability, children also completed the English and Spanish sentence repetition tasks of the Bilingual English–Spanish Assessment (BESA; Peña et al., 2014). Sentence repetition is considered an assessment of morphosyntactic knowledge (Polišenská et al., 2015; Pratt et al., 2020; Seeff-Gabriel et al., 2010). For this test, children hear an examiner read nine to 10 sentences and are asked to repeat them back to the examiner verbatim. For each sentence, specific grammatical forms are scored, capturing children's verbal comprehension and ability to use different morphosyntactic components of language. Results of recent research indicate that performance on this task is a good indicator of developmental language disorder in bilingual children (Pratt et al., 2020). Internal consistency reliability for the BESA sentence repetition task is high in Spanish (Cronbach's α = .96) and English (Cronbach's α = .95).

Procedure

All assessments were conducted in a quiet area of children's elementary schools, minimizing disruption to normal school practices. Testing was conducted by trained graduate and undergraduate research assistants with native or near-native proficiency in the language of administration. The order of administration of Spanish and English assessments was randomized across children to avoid practice effects. Spanish and English assessment sessions lasted approximately 30–45 min each and were conducted on separate days for each child to avoid fatigue and/or confusion about the language in which the child should respond.

COVID-19 Context

Data collection for this project was abruptly stopped due to school closures at the onset of the COVID-19 pandemic. This resulted in some missing data within the sample. Some children completed assessments in one language prior to school closures but had not yet completed testing in the other language. This was most notable for the sentence repetition task, which had not been completed in both languages for 17 participants.

Scoring

EOWPVT-4: SBE scores. Several indices were computed to address the research questions. First, from children's responses to the EOWPVT-4: SBE, we obtained scores for Spanish-only, English-only, total, and conceptual vocabulary knowledge. Single-language raw and standardized scores were obtained based on participants' responses to vocabulary items in each language. For the total vocabulary score, children were given double credit for knowing the word in both languages (i.e., raw scores from the Spanish-only and English-only EOWPVT-4: SBE administrations were added together). No standardized score was obtained for total vocabulary. For conceptual vocabulary raw and standardized scores, children were awarded credit for identifying each target word correctly in either language but were not given double credit for identifying a word correctly in both languages.
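The relation among these raw scores can be illustrated with a minimal sketch. It assumes item-level correctness is available for the same item list in both administrations and ignores the test's basal and ceiling rules; the function name `vocabulary_scores` and the five-item response vectors are hypothetical and are not taken from the study data.

```python
from typing import Dict, Sequence


def vocabulary_scores(spanish: Sequence[bool], english: Sequence[bool]) -> Dict[str, int]:
    """Derive raw scores from item-level correctness on a shared item list."""
    if len(spanish) != len(english):
        raise ValueError("both administrations must cover the same items")
    spanish_only = sum(spanish)
    english_only = sum(english)
    # Total: sum of the two single-language scores, so a word known in
    # both languages is credited twice.
    total = spanish_only + english_only
    # Conceptual: each item is credited once if it was named correctly
    # in either language.
    conceptual = sum(1 for s, e in zip(spanish, english) if s or e)
    return {
        "spanish_only": spanish_only,
        "english_only": english_only,
        "total": total,
        "conceptual": conceptual,
    }


# Hypothetical responses to five items (True = named correctly).
spanish = [True, True, False, True, False]
english = [True, False, False, True, True]
scores = vocabulary_scores(spanish, english)
print(scores)
```

In this illustration, two items are known in both languages, so the total score (6) exceeds the conceptual score (4) by exactly that overlap.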

BESA sentence repetition scores. We followed the BESA manual guidelines for computing raw and standardized scores for Spanish-only, English-only, and best language sentence repetition (Peña et al., 2014). Participants were given credit for correct repetitions of target words included on the Spanish and English versions of the task. Responses were first used to obtain single-language scores for Spanish and English. The best language score was then identified as the highest score the child received for either language.

Relative language scores. Scores reflecting participants' relative language knowledge in Spanish and English were obtained based on their responses to both the EOWPVT-4: SBE and the BESA sentence repetition tasks. First, relative scores were computed separately for each task. For the EOWPVT-4: SBE, we subtracted participants' English vocabulary raw scores from their Spanish vocabulary raw scores to generate a relative vocabulary knowledge score. Children with equal vocabulary scores in both languages received a score of zero, whereas children with higher Spanish vocabulary received scores above zero, and children with higher English vocabulary received scores below zero. To obtain a relative score for the BESA sentence repetition tasks, we first computed percent accuracy based on children's Spanish and English raw scores to account for the differences in the number of items on the Spanish and English versions of the assessments (i.e., 37 for Spanish compared to 33 for English). We then subtracted participants' percent accuracy in English from their percent accuracy in Spanish to generate the relative sentence repetition score.

These EOWPVT-4: SBE and BESA sentence repetition relative scores were z-scored to ensure similar weighting. We averaged the z-scored relative EOWPVT-4: SBE and BESA sentence repetition scores to yield a single relative language score for each child. In the case where a child had missing data for the BESA sentence repetition tasks, the child's EOWPVT-4: SBE relative score served as the sole indicator for the relative language score. The resulting index served as a proxy for the relative balance of participants' Spanish and English language knowledge. Table 1 provides examples of each scoring approach.

Table 1.

Examples of scores obtained from participant data.

Scores obtained | EOWPVT-4: SBE, Example 1 | EOWPVT-4: SBE, Example 2 | BESA sentence repetition, Example 1 | BESA sentence repetition, Example 2
Single language
 Spanish raw | 45 | 10 | 31 (84%) a | 19 (51%) a
 English raw | 15 | 20 | 23 (70%) a | 28 (85%) a
Based on both languages
 Total | 60 | 30 | — | —
 Conceptual | (Based on item responses) | (Based on item responses) | — | —
 Best language | — | — | 84 | 85
 Relative | 30 | −10 | 14 | −34

Note. The included examples are all based on raw scores. Em dashes indicate data not obtained. EOWPVT-4: SBE = Expressive One-Word Picture Vocabulary Test–Fourth Edition: Spanish-Bilingual Edition; BESA = Bilingual English–Spanish Assessment.

a Percent accuracy values, which were used to compute the relative sentence repetition scores.
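The relative language score described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the z-scoring uses the sample standard deviation, and all function names and the third (hypothetical) child are introduced here for illustration only. The item counts (37 Spanish, 33 English) come from the text, and the first two children reuse the raw scores shown in Table 1.

```python
from statistics import mean, stdev
from typing import List, Optional

SPANISH_ITEMS = 37  # scored items on the Spanish sentence repetition task
ENGLISH_ITEMS = 33  # scored items on the English sentence repetition task


def relative_vocabulary(spanish_raw: int, english_raw: int) -> int:
    """Spanish minus English raw score; 0 means equal raw scores."""
    return spanish_raw - english_raw


def relative_sentence_repetition(
    spanish_raw: Optional[int], english_raw: Optional[int]
) -> Optional[float]:
    """Difference in percent accuracy; None if either task is missing."""
    if spanish_raw is None or english_raw is None:
        return None
    return 100 * spanish_raw / SPANISH_ITEMS - 100 * english_raw / ENGLISH_ITEMS


def z_scores(values: List[Optional[float]]) -> List[Optional[float]]:
    """z-score the observed values, leaving missing entries as None."""
    observed = [v for v in values if v is not None]
    m, sd = mean(observed), stdev(observed)  # assumption: sample SD
    return [None if v is None else (v - m) / sd for v in values]


def relative_language(
    vocab_rel: List[float], sentence_rel: List[Optional[float]]
) -> List[float]:
    """Average the two z-scored indices; use vocabulary alone if the other is missing."""
    z_vocab = z_scores(vocab_rel)
    z_sent = z_scores(sentence_rel)
    return [zv if zs is None else (zv + zs) / 2 for zv, zs in zip(z_vocab, z_sent)]


# Children 1 and 2 use the Table 1 raw scores; child 3 is hypothetical
# and has no sentence repetition data.
vocab_rel = [relative_vocabulary(45, 15), relative_vocabulary(10, 20), 5]
sentence_rel = [
    relative_sentence_repetition(31, 23),
    relative_sentence_repetition(19, 28),
    relative_sentence_repetition(None, 22),
]
overall = relative_language(vocab_rel, sentence_rel)
print(vocab_rel, sentence_rel, overall)
```

With unrounded percentages, the second child's relative sentence repetition score is about −33.5, whereas Table 1 shows −34; the table values appear to be computed from the rounded percentages (51 − 85).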

Analytic Plan

All analyses were conducted in R (R Core Team, 2021) using base R functions, ggplot2 (Wickham, 2016), and psych (Revelle, 2021). To address the first research question, we computed univariate descriptive statistics (means, standard deviations, and minimum and maximum values) and generated histograms for each vocabulary scoring approach. For the second research question, which focused on investigating potential patterns between children's relative language scores and their single-language, total, and conceptual vocabulary scores, we produced and inspected bivariate scatter plots and Pearson correlation coefficients (r). We then conducted linear regression analyses including children's average relative language score as a predictor with linear and quadratic terms, with separate models for single-language, conceptual, and total vocabulary scores as outcomes. We examined the residuals for evidence of model misfit. The null hypothesis was that there would be no evidence of patterns in the relations between the children's relative language scores and any of their vocabulary scores. Criteria to reject this hypothesis for any measure included (a) nonrandom patterns in the scatter plots (e.g., linear association); (b) statistically significant linear and/or quadratic terms in the linear regression models at p < .05; and/or (c) substantial R² values for variance in test scores explained by average relative language scores, which we defined as R² above .10.
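As a rough illustration of the regression step, the sketch below fits a model with intercept, linear, and quadratic terms by ordinary least squares and applies the R² > .10 criterion. It is written in dependency-free Python rather than the R used in the study, it does not compute p values, and the data are synthetic values constructed to follow an exact quadratic; the helper names `solve` and `fit_quadratic` are hypothetical.

```python
from typing import List, Sequence, Tuple


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_quadratic(x: Sequence[float], y: Sequence[float]) -> Tuple[List[float], float]:
    """Least-squares fit of y = b0 + b1*x + b2*x^2; returns (coefficients, R^2)."""
    rows = [[1.0, xi, xi * xi] for xi in x]  # design matrix
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coef, 1.0 - ss_res / ss_tot


# Synthetic, noise-free data: scores are lowest for balanced children (x near 0).
relative = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
vocabulary = [60 + 5 * xi ** 2 for xi in relative]
coef, r2 = fit_quadratic(relative, vocabulary)
substantial = r2 > 0.10  # criterion (c) from the analytic plan
print(coef, r2, substantial)
```

A positive quadratic coefficient with a negligible linear coefficient corresponds to the U-shaped pattern hypothesized for conceptual vocabulary.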

Results

Descriptive data, including missing-data rates, means, standard deviations, and minimum and maximum values, for each measure are provided in Table 2. All 115 participants had complete data for the EOWPVT-4: SBE, yielding raw scores for Spanish-only, English-only, total, conceptual, and relative vocabulary. For single-language vocabulary, a small number of participants received standardized scores outside the scorable range for their age (i.e., < 55). This occurred for nine participants' Spanish-only standardized vocabulary scores and nine participants' English-only standardized vocabulary scores. One participant scored below the normative range in both Spanish-only and English-only vocabulary but performed within the normative range for conceptual vocabulary. This was unsurprising given that the EOWPVT-4: SBE is designed to provide conceptual vocabulary normative scores, not single-language normative scores. To address this concern, we conducted statistical analyses twice: once treating the single-language values of < 55 as missing and once replacing < 55 with the numeric value 55. Findings did not differ substantially. Therefore, we report results based on the replaced values (i.e., < 55 = 55).
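The two treatments of below-range scores can be sketched as follows. The score list and the `"<55"` string coding are hypothetical and chosen only to show how the two approaches shift a sample mean; they are not the study data.

```python
from statistics import mean
from typing import List, Optional, Union

FLOOR = 55  # lowest standardized score in the scorable range

Score = Union[int, str]


def as_missing(scores: List[Score]) -> List[Optional[int]]:
    """Treatment 1: below-range scores are treated as missing."""
    return [None if s == "<55" else int(s) for s in scores]


def as_floor(scores: List[Score]) -> List[int]:
    """Treatment 2: below-range scores are replaced with the floor value."""
    return [FLOOR if s == "<55" else int(s) for s in scores]


# Hypothetical standardized scores for five children.
scores: List[Score] = [80, "<55", 102, 97, "<55"]
mean_missing = mean(v for v in as_missing(scores) if v is not None)
mean_floor = mean(as_floor(scores))
print(mean_missing, mean_floor)
```

Replacing below-range values with 55 pulls the mean down relative to dropping them, which matches the direction of the difference between the means in Table 2 and the means reported in the text.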

Table 2.

Descriptive data: participant scores.

Measure | Scoring approach | Metric | n | % missing | M | SD | Min | Max
Vocabulary (EOWPVT-4: SBE) | Spanish only | Standardized score | 106 | 8 | 80.94 | 17.91 | 55 | 129
 | | (Raw score) | 115 | — | 30.19 | 17.30 | 0 | 73
 | English only | Standardized score | 106 | 8 | 97.08 | 18.39 | 55 | 145
 | | (Raw score) | 115 | — | 44.56 | 21.39 | 0 | 89
 | Conceptual | Standardized score | 115 | — | 102.17 | 16.51 | 55 | 145
 | | (Raw score) | 115 | — | 52.74 | 16.86 | 12 | 94
 | Total | (Raw score) | 115 | — | 74.75 | 26.87 | 12 | 137
Sentence repetition (BESA) | Spanish only | Standardized score | 93 | 19 | 87.58 | 16.84 | 55 | 120
 | | (Raw score) | 103 | 10 | 19.46 | 9.54 | 0 | 36
 | | (% accuracy) | 103 | 10 | 53 | 26 | 0 | 97
 | English only | Standardized score | 93 | 19 | 91.02 | 17.81 | 55 | 115
 | | (Raw score) | 108 | 6 | 22.69 | 9.01 | 0 | 33
 | | (% accuracy) | 108 | 6 | 69 | 27 | 0 | 100
 | Best language | Standardized score | 91 | 21 | 97.91 | 14.87 | 60 | 120
 | | (% accuracy) | 98 | 15 | 76 | 22 | 3 | 100
Relative scores | Vocabulary | Spanish–English raw | 115 | — | −14.37 | 28.14 | −79 | 66
 | Sentence rep | Spanish–English accuracy | 98 | 15 | −16.02 | 33.37 | −91.24 | 82.8
 | Overall | Average of z-scored vocabulary + sentence rep | 115 | — | −0.01 | 0.96 | 0.87 | −2.28

Note. Reported standardized score values for the EOWPVT-4: SBE are based on data within the normative range, whereas values presented in text are based on the replacement of “< 55” values with “55,” as discussed at the beginning of the Results section. Em dashes indicate no missing data; Min = minimum; Max = maximum; EOWPVT-4: SBE = Expressive One-Word Picture Vocabulary Test–Fourth Edition: Spanish-Bilingual Edition (Martin, 2013); BESA = Bilingual English–Spanish Assessment (Peña et al., 2014); rep = repetition.

Descriptive Results for Bilingual Vocabulary Scores

Of the 115 participants with complete EOWPVT-4: SBE data, 98 also completed both Spanish and English sentence repetition tasks on the BESA. Five additional participants completed only the Spanish sentence repetition task, and 10 completed only the English sentence repetition task, allowing for the computation of single-language scores but no scores that were based on both languages (i.e., best language scores). Several first-grade participants (n = 10) were outside the normative age range when they completed the task. None of these children reached the ceiling for the measure; therefore, we elected to include their raw and accuracy scores, but standardized scores were not computed.

Inspection of histograms indicated that the single-language vocabulary metrics were not normally distributed. Spanish-only vocabulary was positively skewed, and English-only vocabulary was overall platykurtic (see Supplemental Material S1). The sample means for the single-language standardized scores were also below the normative mean for the EOWPVT-4: SBE (i.e., M = 100, SD = 15), although the difference was more substantial for Spanish-only vocabulary than for English-only vocabulary: for Spanish, M = 78.91 (SD = 18.56); for English, M = 93.79 (SD = 20.99). Conversely, examination of histograms for total and conceptual vocabulary scores revealed generally normal distributions. Raw scores for total vocabulary were higher than the raw scores observed for single-language and conceptual vocabulary (see Table 2). The sample mean for conceptual vocabulary standardized scores was slightly above the normative mean (M = 102.17, SD = 16.51) and significantly higher than both Spanish-only vocabulary, t(114) = −13.31, p < .001, d = 1.11, and English-only vocabulary, t(114) = −6.15, p < .001, d = 0.43. Participants scored significantly higher on English-only vocabulary as compared to Spanish-only vocabulary, t(114) = −5.40, p < .001, d = 0.70.

Relative Language and Bilingual Vocabulary Scores

Bivariate scatter plots depicting associations between children's relative language scores and their vocabulary scores are provided in Figure 1 (raw scores) and Figure 2 (standardized scores). For data visualization purposes, we color-coded the data based on children's relative language scores. These labels were used to facilitate visualization but were not used in any statistical analyses, given the challenges of accurately classifying bilingual learners (Abedi, 2008; Goodrich et al., 2022).

Figure 1.

Four scatterplots. In all four graphs, the x axis is labeled Relative Language Scores, Spanish, hyphen, English and it ranges from negative 2 to 2 in increments of 2.
A. The title of the first plot is Spanish Only Vocabulary and Relative Spanish English Language. The y axis is labeled Spanish Only Vocabulary Raw Score and it ranges from 0 to 60 in increments of 20. Green points are distributed between x values of negative 2 and negative 0.5, and between y values of 0 and 50. Red points are distributed between x values of negative 0.5 and 0.25, and between y values of 10 and 50. Blue points are distributed between x values of 0.25 and 2 and between y values of 10 and 60. The line of best fit runs between (negative 2, 5) and (3, 60).
B. The title of the second plot is English Only Vocabulary and Relative Spanish English Language. The y axis is labeled English Only Vocabulary Raw Score and it ranges from 0 to 75 in increments of 25. Green points are distributed between x values of negative 2 and negative 0.5 and between y values of 25 and 75. Red points are distributed between x values of negative 0.25 and 0.25 and between y values of 0 and 75. Blue points are distributed between x values of 0.25 and 2 and between y values of 0 and 60. The line of best fit has a negative slope and runs between (negative 2.5, 80) and (2, 12.5).
C. The title of the third plot is Conceptual Vocabulary and Relative Spanish English Language. The y axis is labeled Conceptual Vocabulary Raw Score and it ranges from 25 to 75 in increments of 25. Green points are distributed between x values of negative 2 and negative 0.2, and between y values of 25 and 80. Red points are distributed between x values of negative 0.2 and 0.2 and between y values of 25 and 75. Blue points are distributed between x values of 0.5 and 2 and between y values of 20 and 75. The curve of best fit to the data points starts at (negative 2, 82), reaches a minimum value near (0.5, 48), and ends at (3, 62).
D. The title of the fourth plot is Total Vocabulary and Relative Spanish English Language. The y axis is labeled Total Vocabulary Raw Score and it ranges from 0 to 100 in increments of 50. Green points are distributed between x values of negative 2 and negative 0.25 and between y values of 25 and 125. Red points are distributed between x values of negative 0.25 and 0.2 and between y values of 10 and 125. Blue points are distributed between x values of 0.25 and 2 and between y values of 10 and 110. The line of best fit has a negative slope and it runs between (negative 2, 75) and (3, 55).
The legend for Primary Language is as follows. Red points: Balanced. Green points: English. Blue points: Spanish. All values are estimated.

Bivariate scatter plots depicting associations between children's vocabulary raw scores and relative language scores.

Figure 2.

[Figure 2 image: four scatter plots sharing the x-axis "Relative Language Scores (Spanish-English)". Panel E: Spanish-only vocabulary standardized score, with a positively sloped line of best fit. Panel F: English-only vocabulary standardized score, with a negatively sloped line of best fit. Panel G: conceptual vocabulary standardized score, with a U-shaped fitted curve that reaches its minimum near x = 0.5. Panel H: best language sentence repetition standardized score, with a fitted curve that reaches its minimum near x = 0. Points are color-coded by primary language: green = English, red = balanced, blue = Spanish.]

Bivariate scatter plots depicting associations between children's relative language scores and their vocabulary standardized scores and sentence repetition scores.

As shown in Figures 1A and 1B, there was evidence of linear associations between children's single-language vocabulary scores and their relative language scores. Higher relative language scores, which reflected greater Spanish proficiency compared to English proficiency, corresponded with higher scores for Spanish-only vocabulary. This was further supported by a significant bivariate correlation (r = .65, 95% CI [.53, .75], p < .001) and regression results indicative of a positive linear relation between relative language and Spanish-only vocabulary scores (see Table 3, top row). Similarly, lower relative language scores, which reflected greater English proficiency compared to Spanish proficiency, corresponded with higher scores for English-only vocabulary. Results from correlational analyses (r = −.74, 95% CI [−.81, −.64], p < .001) and linear regression (see Table 3, top row) supported this interpretation of the scatter plot. Findings were consistent across both raw and standardized single-language scores (see Figures 2E and 2F).
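The confidence intervals reported alongside these correlations can be approximated with Fisher's z transformation. The sketch below is illustrative only: the article does not state how its intervals were computed, and the sample size of 115 is an assumption taken from the number of participants named in the abstract (the analytic sample for any given correlation may have been smaller because of missing data).

```python
import math


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for a Pearson correlation using Fisher's z transform."""
    z = math.atanh(r)               # map r onto an approximately normal scale
    se = 1.0 / math.sqrt(n - 3)     # standard error of z
    lower = math.tanh(z - z_crit * se)
    upper = math.tanh(z + z_crit * se)
    return lower, upper


# r = .65 is the reported Spanish-only correlation; n = 115 is an assumption.
lo, hi = fisher_ci(0.65, 115)
print(f"r = .65, approximate 95% CI [{lo:.2f}, {hi:.2f}]")
```

Because the interval is built around the transformed point estimate, it always contains the correlation itself, which makes it a quick consistency check on any reported r and CI pair.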

Table 3.

Results from linear regression analyses.

Each cell reports: Estimate, 95% CI, p.

Predictor | Spanish-only vocabulary | English-only vocabulary | Conceptual vocabulary
(Intercept) | 78.39, [75.36, 81.42], < .001 | 93.95, [90.79, 97.12], < .001 | 97.69, [94.39, 100.99], < .001
Relative language | 13.03, [10.30, 15.76], < .001 | −16.13, [−18.98, −13.28], < .001 | −5.80, [−8.77, −2.82], < .001
Quadratic: relative lang | 0.71, [−1.15, 2.57], .452 | −0.35, [−2.29, 1.60], .726 | 4.87, [2.84, 6.90], < .001
R²/R² adjusted | .472/.467 | .552/.544 | .211/.197

Predictor | Total vocabulary (raw score) | Best language sentence rep
(Intercept) | −1.22, [−45.93, 43.49], .957 | 94.57, [91.02, 98.12], < .001
Relative language | −3.09, [−8.32, 2.13], .244 | −2.17, [−5.35, 1.01], .178
Quadratic: relative lang | 0.39, [−3.12, 3.90], .828 | 3.53, [1.45, 5.61], .001
Age (covariate) | 1.00, [0.42, 1.58], .001 | (not included)
R²/R² adjusted | .120/.096 | .115/.095

Note. Results reported are based on standardized scores for each outcome, except for total vocabulary, which is reported as a raw score. To account for this, age was included as a covariate in the model with total vocabulary as outcome. Significant findings (p < .01) are bolded. CI = confidence interval; lang = language; rep = repetition.
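To read the quadratic rows of Table 3, it can help to locate the turning point of each fitted curve. The sketch below assumes the models have the form score = b0 + b1·x + b2·x², with x being the relative language score entered uncentered and unscaled; the article does not spell out the parameterization, so treat the result as an approximation. The function name `quadratic_vertex` is hypothetical.

```python
def quadratic_vertex(b0, b1, b2):
    """Return (x, y) at the turning point of y = b0 + b1*x + b2*x**2."""
    if b2 == 0:
        raise ValueError("a turning point requires a nonzero quadratic term")
    x = -b1 / (2 * b2)
    y = b0 + b1 * x + b2 * x ** 2
    return x, y


# Coefficients for standardized conceptual vocabulary, copied from Table 3.
x_min, y_min = quadratic_vertex(97.69, -5.80, 4.87)
print(f"fitted minimum near x = {x_min:.2f}, score = {y_min:.1f}")
```

Under these assumptions the conceptual vocabulary curve bottoms out a little to the right of zero on the relative language scale, in line with the fitted curve described for Figure 2G.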

There was no evidence of associations between children's relative language scores and their total vocabulary scores. The scatter plot shown in Figure 1D was generally random. Neither correlational nor regression analyses revealed evidence of associations (r = −.17, 95% CI [−.81, −.64], p = .070). Regression results are provided in the bottom row of Table 3. Age was included as a covariate in the model, given that no normative scores are available for total vocabulary.

Finally, bivariate scatter plots revealed evidence of a quadratic trend in the association between children's relative language scores and their conceptual vocabulary scores (see Figures 1C and 2G). Children with relative language scores near zero, indicative of generally balanced skills across Spanish and English, tended to receive lower conceptual vocabulary scores compared to children with higher or lower relative language scores. This trend was not evident in the linear correlation coefficients obtained (for raw conceptual vocabulary: r = −.31, 95% CI [−.46, −.12], p = .001; for standardized conceptual vocabulary: r = −.23, 95% CI [−.39, −.05], p = .015), but regression revealed significant linear and quadratic terms (see Table 3, top row, last column).

Post Hoc Analysis: Best Language Sentence Repetition

One possible explanation for the finding that children with relatively balanced language skills in Spanish and English tended to receive lower scores for both raw and standardized conceptual vocabulary is sampling error. It is possible that, within our participant sample drawn from schools in South Carolina and Nebraska, our recruited participants with more balanced Spanish–English skills tended to have lower language ability (and, correspondingly, vocabulary) compared to participants with less balanced skills in each language. We sought to test this potential explanation directly by evaluating children's BESA best language sentence repetition scores in relation to their relative language scores. We selected best language sentence repetition as a follow-up measure because of the high diagnostic accuracy it yields for Spanish–English-speaking children (Peña et al., 2014). Studies indicate that it is a highly reliable measure (Fitton et al., 2019) with strong evidence for its validity as a measure of language (Pratt et al., 2020).

We conducted these post hoc analyses using the same procedures as for the vocabulary metrics, but with standardized best language sentence repetition as the outcome of interest. The bivariate scatter plot is provided in Figure 2H. The correlation between relative language and best language sentence repetition was not significant (r = −.03, 95% CI [−.23, .18], p = .808). Regression analyses revealed a nonsignificant linear trend but a statistically significant quadratic term (3.53, 95% CI [1.45, 5.61], p = .001; see Table 3, bottom row). However, this quadratic effect appears to be driven by higher scores for children with relatively stronger proficiency in Spanish rather than by lower scores for children with relatively balanced bilingual proficiency. No evidence suggested that children with balanced bilingual proficiency had lower language ability. Overall, results of post hoc analyses indicated that the nonlinear relation between conceptual vocabulary scores and relative language proficiency was not due to sampling error but was instead attributable to systematic differences in how conceptual vocabulary scores function among bilingual learners.

Discussion

The primary purpose of this study was to evaluate different scoring approaches for assessing bilingual children's vocabulary skills, focusing on how scores derived using different approaches related to children's relative language proficiency across Spanish and English. Overall, consistent with hypotheses and prior research (Arias & Friberg, 2017; Bedore & Peña, 2008; Bedore et al., 2005), results indicated that scores derived from single-language assessments alone underestimated children's overall vocabulary knowledge when compared to conceptual and total scores. However, results of this study provide preliminary evidence that conceptual scoring may function differently for children with different relative levels of proficiency in English and Spanish. Specifically, conceptual scoring may underestimate the vocabulary knowledge of children who have relatively balanced English and Spanish proficiency and who have more overlap in their lexicons. Although preliminary, these results have potentially important implications for assessment of oral language skills among young Spanish–English bilingual children in the United States.

Single-Language Vocabulary Scoring

Descriptive findings, including means and standard deviations, for conceptual vocabulary were generally consistent with prior literature focusing on Spanish–English-speaking children in kindergarten or first grade (Language and Reading Research Consortium et al., 2021; Mancilla-Martinez et al., 2020; Simon-Cereijido & Méndez, 2018). To our knowledge, there are two prior studies that have examined single-language and conceptual scoring using versions of the EOWPVT: SBE (Anaya et al., 2018; Gross et al., 2014). Both studies obtained similar descriptive results for conceptual vocabulary, although Anaya et al. (2018) evaluated scores separately for children identified as having specific language impairment (M = 96.74, SD = 16.31) compared to children with typical development (M = 112.58, SD = 16.15).

Substantial differences were notable in the single-language scores, however. Participants in our sample and in the sample recruited by Gross et al. (2014) tended to score higher on English-only vocabulary than on Spanish-only vocabulary, whereas the opposite trend was observed by Anaya et al. (2018). This may be due to differences in bilingual education. Our participants were all enrolled in English-only education programs, whereas students in the study by Anaya et al. (2018) were all in bilingual education. Alternatively, some differences across studies may be due to differences in age of exposure to an L2. The sample in Gross et al. (2014) included a substantial percentage of students enrolled in bilingual education programs and was divided into simultaneous and sequential bilingual students. Although we did not have data on whether children in our sample were simultaneous or sequential bilinguals, our data from children in English-only instructional environments suggest a rather large difference between English and Spanish vocabulary (more than 1 SD). In the Gross et al. study, the gap favoring English vocabulary was larger for simultaneous bilingual children (more than 1 SD) than for sequential bilingual children (approximately half of 1 SD). Gross et al. reported that, in addition to having earlier exposure to and use of English, children in the simultaneous bilingual group had more current exposure to English, used English more at home, and more frequently preferred to use English than did children in the sequential bilingual group. This pattern of results is consistent with our hypothesis that differences across studies are due to language exposure, whether at home or at school. Overall, these results underscore the importance of conducting assessment in both languages for bilingual learners. Single-language vocabulary assessment is not an accurate indicator of overall knowledge across bilinguals' languages.

Inspection of histograms further supports the need for bilingual assessment. The distribution of conceptual and total vocabulary raw scores was normal, as would be expected in child language research. However, children's single-language vocabulary raw scores were not normally distributed. More participants scored below the sample mean than above for Spanish-only vocabulary, and a generally even number of participants received scores at each point of the distribution for English-only vocabulary. Given that population-level overall language skills are assumed to follow a normal distribution, these findings suggest that critical information is missing when children's vocabulary is measured in only one language. Single-language vocabulary is not simply a downward shift of the normal distribution but rather an incomplete picture of children's skills. As observed in the single-language score scatter plots, different children across the continuum of relative bilingual language are disadvantaged when single-language scores are used as a metric for overall knowledge. For children with stronger English language skills, Spanish-only vocabulary scores underestimate their total vocabulary knowledge, whereas for children with stronger Spanish language skills, English-only vocabulary scores underestimate their knowledge. Measurement in both languages is necessary to gain an accurate picture of bilingual children's overall skills.

Conceptual Vocabulary Scores and Relative Language

The relative balance of knowledge and skills across languages is a critical component of bilingual language development. As previously noted, bilingual learners' proficiencies across each of their languages are dynamic and may shift over time depending on the child's linguistic and social environments (Bedore et al., 2016; Ronderos et al., 2022). For a scoring metric to be valid and reliable across the population of bilingual learners, it is critical to consider whether the measure functions consistently for bilinguals with a range of relative proficiencies in their languages. In this work, we investigated potential differential test functioning by participants' relative proficiencies across Spanish and English. Results suggest that both single-language and conceptual scores depend in part on bilinguals' relative proficiencies in each language. Whereas single-language scores yield lower scores for children with more disparate proficiencies, conceptual scores appear to disadvantage children with more balanced proficiencies across Spanish and English. We observed a systematic trend where children with relatively balanced language skills in Spanish and English received lower conceptual vocabulary scores compared to children with more disparate skills in Spanish and English. This was evident both visually, as shown in scatter plots, and statistically, where a significant quadratic trend was identified.

We considered the possibility that this finding was driven by sampling error. Given the relatively small sample size, it is possible that the children with balanced levels of proficiency in Spanish and English in our sample simply had lower overall language proficiency than other children in the sample. If this were the case, the trend we observed could be wholly attributable to the characteristics of the sample rather than to differential test functioning. To address this potential limitation, we conducted post hoc analyses to examine the relation between best language sentence repetition scores, which are considered a strong indicator of language disorder among Spanish–English bilinguals (Pratt et al., 2020), and relative language scores. If the quadratic trend were attributable to sampling bias alone, we would have expected results mirroring those observed for conceptual vocabulary. Instead, we found limited evidence of a relation between best language sentence repetition and relative language scores. Despite the statistically significant curvilinear trend in the relation between relative language proficiency and best language sentence repetition scores, an examination of Figure 2H indicated that this trend was driven by higher scores for the children with substantially higher Spanish language scores compared to English language scores rather than by lower scores among children with more balanced Spanish–English skills. Additionally, the variance accounted for in conceptual vocabulary (19.7%) was approximately twice as large as the variance accounted for in best language sentence repetition (9.5%). We also explored the relationship between relative proficiency and total vocabulary. Total vocabulary scores reflect the total correct labels provided by a child, whereas conceptual vocabulary scores reflect only the total unique concepts the child can identify in either (or both) language(s). Similar to the findings from best language sentence repetition, no trend was observed between total vocabulary and relative language scores. Neither statistical nor visual analyses suggested any association. This provides further evidence indicating that sampling bias alone does not explain the findings.
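The variance-accounted-for percentages compared above correspond to the adjusted R² values in Table 3. As a rough check on how adjusted R² relates to R², the sketch below applies the standard adjustment formula. Both the sample size (n = 115, the participant count named in the abstract) and the number of predictors (two: the linear and quadratic relative language terms) are assumptions; the analytic sample for a given model may have been smaller.

```python
def adjusted_r2(r2, n, n_predictors):
    """Standard adjustment: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n - n_predictors - 1 <= 0:
        raise ValueError("sample size must exceed the number of predictors + 1")
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)


# R^2 = .211 is the conceptual vocabulary value from Table 3.
# n = 115 and two predictors are assumptions, not values stated for this model.
conceptual_adj = adjusted_r2(0.211, 115, 2)
print(f"adjusted R^2 = {conceptual_adj:.3f}")
```

With these assumed inputs the result rounds to the adjusted value reported for conceptual vocabulary.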

Instead, the disadvantage of conceptual scoring for more balanced bilingual learners is likely explained by a greater degree of overlap across the Spanish and English lexicons of these children. In a conceptual scoring paradigm, bilingual learners who can label a picture in both Spanish and English receive the same score for that item as bilinguals who can label the picture in Spanish only or in English only. Contrast this with a total scoring paradigm, in which a bilingual learner who can label a picture in both Spanish and English receives twice the score for that item compared with a peer who can label the picture in Spanish only or in English only. Thus, to achieve the same score on a vocabulary test that uses conceptual scoring, like the EOWPVT, children with more overlap in the words they know across languages must provide more correct labels than children with minimal overlap. Previous studies with bilingual toddlers have noted that this lexical overlap is highly variable, is not accounted for in conceptual scoring, and often results in underestimation of a bilingual child's lexical knowledge (Core et al., 2013).
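The contrast between the scoring paradigms can be made concrete with a small sketch. It treats each test item as a concept and each child's responses as the set of items labeled correctly in each language. This is a simplification for illustration: the function name and item lists are hypothetical, and the actual EOWPVT-4: SBE administration and scoring rules are not reproduced here.

```python
def vocabulary_scores(spanish_correct, english_correct):
    """Score one child's responses under the four approaches discussed above.

    Each argument is a collection of item identifiers the child labeled
    correctly in that language (hypothetical data, not test content).
    """
    spanish = set(spanish_correct)
    english = set(english_correct)
    return {
        "spanish_only": len(spanish),
        "english_only": len(english),
        "conceptual": len(spanish | english),   # one point per concept known in either language
        "total": len(spanish) + len(english),   # one point per correct label
    }


# Two hypothetical children who each produce eight correct labels.
balanced = vocabulary_scores(
    ["dog", "cat", "house", "tree"],
    ["dog", "cat", "house", "tree"],
)
spanish_dominant = vocabulary_scores(
    ["dog", "cat", "house", "tree", "spoon", "shoe"],
    ["dog", "car"],
)
print(balanced)
print(spanish_dominant)
```

Both children receive the same total score, but the child whose two lexicons overlap completely receives a lower conceptual score than the child with little overlap, which is the pattern described in the paragraph above.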

This may be important to consider in future measurement research focusing on bilingual learners. There is need to investigate whether unique credit for each label (i.e., total scores), credit for each concept (i.e., conceptual scores), or other unexplored alternatives (e.g., partial credit for words known in both languages) may yield valid, reliable scores accounting for bilingual learners' knowledge across both their languages. Additionally, assessing bilingual children using a larger item set may lead to additional differentiation in skills, although this merits future research. Although the focus of this study is on vocabulary specifically, it is worth considering these questions for assessing other areas of language as well.

The EOWPVT-4: SBE is not recommended as a diagnostic measure because it focuses solely on vocabulary; however, it is commonly used in research and educational practice and is one of the few standardized assessments designed and normed for Spanish–English bilingual learners across a large age range (Oh & Mancilla-Martinez, 2021). Our study suggests that underestimation of vocabulary skills is likely for specific subgroups of bilingual learners. This finding can help researchers avoid compounding the underestimation of bilingual learners' overall vocabulary and can help practitioners interpret vocabulary scores with appropriate consideration of each score's strengths and weaknesses.

Limitations

Data collection for this study was interrupted due to the onset of the COVID-19 pandemic. Consequently, there was a relatively large amount of missing data for a cross-sectional study (e.g., approximately 15% of participants have missing data for at least one of the sentence repetition measures). Additionally, the context of the COVID-19 pandemic may limit the extent to which the findings of this study generalize to current kindergarten and first-grade students. Patterns of exposure to English and Spanish may have been altered by school and child care center closures during the pandemic, which could impact the relative levels of proficiency in English and Spanish for many bilingual children (e.g., many bilingual children may have received less exposure to English during the pandemic, as exposure to English may occur primarily outside the home for many bilingual children in the United States).

Another consideration is the operationalization of relative language skills in Spanish and English. Although our approach was practical and based on children's scores across more than one domain of language (i.e., vocabulary and morphosyntax), it does not fully reflect the dynamic nature of language dominance (Bedore et al., 2012; Silva-Corvalán & Treffers-Daller, 2015). Spanish–English bilingual children may have stronger skills in Spanish for vocabulary but stronger skills in high-level grammar in English (or vice versa), or bilingual children may know some vocabulary terms relevant to the home environment only in Spanish but then learn more academic vocabulary in English. Language dominance also is likely to shift over time, with children's relative proficiencies in each language changing based on environmental language exposure and opportunities for use (Bedore et al., 2012, 2016; Castilla-Earls et al., 2019). There is need to continue investigating how to quantify bilingual learners' relative scores across their language systems.

Finally, it is possible that this pattern of results would not generalize to bilingual children of different ages or language contexts. Our study included only kindergarten and first-grade children, many of whom speak mostly Spanish at home and mostly English at school. It is possible that the observed pattern of scores would not emerge for older children who have received more English exposure at school or children who are growing up in contexts in which there is a greater degree of overlap in the use of each language within the environment (e.g., home contexts in which the two languages are used equally, bilingual education programs at school).

Conclusions

In conclusion, additional research is needed to identify scoring approaches that work similarly for bilingual children across the continuum of relative bilingual proficiency. Otherwise, well-intended scoring approaches specifically designed to account for the unique nature of bilingual language proficiency may disadvantage certain children and lead to underestimation of bilingual learners' overall knowledge and skills. Prior to validation of scoring procedures for use with all bilingual children, our findings suggest that practitioners should evaluate relative proficiency in L1 and L2, and low conceptual scores for balanced bilingual children should be interpreted with caution. Low scores may be indicative of lower underlying knowledge but may also be an artifact of differential test functioning. Importantly, no vocabulary measure, regardless of scoring approach, is solely sufficient for evaluating language ability. Rather, the best available evidence supports the use of diagnostic approaches based on converging evidence (Castilla-Earls et al., 2020).

Data Availability Statement

The data sets generated and/or analyzed during this study are available from the corresponding author on reasonable request.

Supplementary Material

Supplemental Material S1. Histograms and Q-Q plots for the Spanish-only, English-only, and conceptual vocabulary scores. Histograms depict the distributions of each of the scores, and Q-Q plots present scores as compared to a theoretical normal distribution.

Acknowledgments

The research reported in this article was supported by an American Speech-Language-Hearing Foundation (ASHFoundation) New Investigators Research Grant, awarded to Lisa Fitton and J. Marc Goodrich, in 2020 and by Eunice Kennedy Shriver National Institute of Child Health and Human Development Award R21HD106072. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or the ASHFoundation.

References

  1. Abedi, J. (2008). Classification system for English language learners: Issues and recommendations. Educational Measurement: Issues and Practice, 27(3), 17–31. 10.1111/j.1745-3992.2008.00125.x
  2. Anaya, J. B., Peña, E. D., & Bedore, L. M. (2016). Where Spanish and English come together: A two dimensional bilingual approach to clinical decision making. Perspectives of the ASHA Special Interest Groups, 1(14), 3–16. 10.1044/persp1.SIG14.3
  3. Anaya, J. B., Peña, E. D., & Bedore, L. M. (2018). Conceptual scoring and classification accuracy of vocabulary testing in bilingual children. Language, Speech, and Hearing Services in Schools, 49(1), 85–97. 10.1044/2017_LSHSS-16-0081
  4. Arias, G., & Friberg, J. (2017). Bilingual language assessment: Contemporary versus recommended practice in American schools. Language, Speech, and Hearing Services in Schools, 48(1), 1–15. 10.1044/2016_LSHSS-15-0090
  5. Bedore, L. M., & Peña, E. D. (2008). Assessment of bilingual children for identification of language impairment: Current findings and implications for practice. International Journal of Bilingual Education and Bilingualism, 11(1), 1–29. 10.2167/beb392.0
  6. Bedore, L. M., Peña, E. D., García, M., & Cortez, C. (2005). Conceptual versus monolingual scoring: When does it make a difference? Language, Speech, and Hearing Services in Schools, 36(3), 188–200. 10.1044/0161-1461(2005/020)
  7. Bedore, L. M., Peña, E. D., Griffin, Z. M., & Hixon, J. G. (2016). Effects of age of English exposure, current input/output, and grade on bilingual language performance. Journal of Child Language, 43(3), 687–706. 10.1017/S0305000915000811
  8. Bedore, L. M., Peña, E. D., Summers, C. L., Boerger, K. M., Resendiz, M. D., Greene, K., Bohman, T. M., & Gillam, R. B. (2012). The measure matters: Language dominance profiles across measures in Spanish–English bilingual children. Bilingualism: Language and Cognition, 15(3), 616–629. 10.1017/S1366728912000090
  9. Bitetti, D., Hammer, C. S., & López, L. M. (2020). The narrative macrostructure production of Spanish–English bilingual preschoolers: Within- and cross-language relations. Applied Psycholinguistics, 41(1), 79–106. 10.1017/S0142716419000419
  10. Castilla-Earls, A., Bedore, L., Rojas, R., Fabiano-Smith, L., Pruitt-Lord, S., Restrepo, M. A., & Peña, E. (2020). Beyond scores: Using converging evidence to determine speech and language services eligibility for dual language learners. American Journal of Speech-Language Pathology, 29(3), 1116–1132. 10.1044/2020_AJSLP-19-00179
  11. Castilla-Earls, A., Francis, D., Iglesias, A., & Davidson, K. (2019). The impact of the Spanish-to-English proficiency shift on the grammaticality of English learners. Journal of Speech, Language, and Hearing Research, 62(6), 1739–1754. 10.1044/2018_JSLHR-L-18-0324
  12. Core, C., Hoff, E., Rumiche, R., & Señor, M. (2013). Total and conceptual vocabulary in Spanish–English bilinguals from 22 to 30 months: Implications for assessment. Journal of Speech, Language, and Hearing Research, 56(5), 1637–1649. 10.1044/1092-4388(2013/11-0044)
  13. Côté, S. L., Gonzalez-Barrero, A. M., & Byers-Heinlein, K. (2022). Multilingual toddlers' vocabulary development in two languages: Comparing bilinguals and trilinguals. Journal of Child Language, 49(1), 114–130. 10.1017/S030500092000077X
  14. Fenson, L., Dale, P. S., Reznick, J. S., Bates, E., Thal, D. J., & Pethick, S. J. (1994). Variability in early communicative development. Monographs of the Society for Research in Child Development, 59(5), i–185. 10.2307/1166093
  15. Fitton, L., Hoge, R., Petscher, Y., & Wood, C. (2019). Psychometric evaluation of the Bilingual English–Spanish Assessment sentence repetition task for clinical decision making. Journal of Speech, Language, and Hearing Research, 62(6), 1906–1922. 10.1044/2019_JSLHR-L-18-0354
  16. Goodrich, J. M., Fitton, L., Chan, J., & Davis, C. J. (2022). Assessing oral language when screening multilingual children for learning disabilities in reading. Intervention in School and Clinic, 58(3), 164–172. 10.1177/10534512221081264
  17. Grosjean, F. (1998). Studying bilinguals: Methodological and conceptual issues. Bilingualism: Language and Cognition, 1(2), 131–149. 10.1017/S136672899800025X
  18. Grosjean, F. (2008). Studying bilinguals. Oxford University Press.
  19. Gross, M., Buac, M., & Kaushanskaya, M. (2014). Conceptual scoring of receptive and expressive vocabulary measures in simultaneous and sequential bilingual children. American Journal of Speech-Language Pathology, 23(4), 574–586. 10.1044/2014_AJSLP-13-0026
  20. Guzman-Orth, D., Lopez, A. A., & Tolentino, F. (2017). A framework for the dual language assessment of young dual language learners in the United States. ETS Research Report Series, 2017(1), 1–19. 10.1002/ets2.12165
  21. Jackson-Maldonado, D., Bates, E., & Thal, D. (1992). Fundación MacArthur: Inventario del Desarrollo de Habilidades Comunicativas. San Diego State University.
  22. Kohnert, K. (2010). Bilingual children with primary language impairment: Issues, evidence and implications for clinical actions. Journal of Communication Disorders, 43(6), 456–473. 10.1016/j.jcomdis.2010.02.002
  23. Language and Reading Research Consortium, Mesa, C., & Yeomans-Maldonado, G. (2021). English and Spanish predictors of Grade 3 reading comprehension in bilingual children. Journal of Speech, Language, and Hearing Research, 64(3), 889–908. 10.1044/2020_JSLHR-20-00379
  24. Mancilla-Martinez, J., Hwang, J. K., Oh, M. H., & Pokowitz, E. L. (2020). Patterns of development in Spanish–English conceptually scored vocabulary among elementary-age dual language learners. Journal of Speech, Language, and Hearing Research, 63(9), 3084–3099. 10.1044/2020_jslhr-20-00056
  25. Marchman, V. A., & Martínez-Sussmann, C. (2002). Concurrent validity of caregiver/parent report measures of language for children who are learning both English and Spanish. Journal of Speech, Language, and Hearing Research, 45(5), 983–997. 10.1044/1092-4388(2002/080)
  26. Martin, N. A. (2013). Expressive One-Word Picture Vocabulary Test-4: Spanish. Academic Therapy Publications.
  27. Morgan, P. L., Farkas, G., Hillemeier, M. M., Mattison, R., Maczuga, S., Li, H., & Cook, M. (2015). Minorities are disproportionately underrepresented in special education: Longitudinal evidence across five disability conditions. Educational Researcher, 44(5), 278–292. 10.3102/0013189X15591157
  28. Nugent, W. R. (2017). Understanding DIF and DTF: Description, methods, and implications for social work research. Journal of the Society for Social Work and Research, 8(2), 305–334. 10.1086/691525
  29. Oh, M. H. , & Mancilla-Martinez, J. (2021). Comparing vocabulary knowledge conceptualizations among Spanish–English dual language learners in a new destination state. Language, Speech, and Hearing Services in Schools, 52(1), 369–382. 10.1044/2020_LSHSS-20-00031 [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Oppenheim, G. M. , Griffin, Z. , Peña, E. D. , & Bedore, L. M. (2020). Longitudinal evidence for simultaneous bilingual language development with shifting language dominance, and how to explain it. Language Learning, 70(S2), 20–44. 10.1111/lang.12398 [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Pearson, B. Z. , Fernández, S. C. , & Oller, D. K. (1993). Lexical development in bilingual infants and toddlers: Comparison to monolingual norms. Language Learning, 43(1), 93–120. 10.1111/j.1467-1770.1993.tb00174.x [DOI] [Google Scholar]
  32. Peña, E. D. , Gutiérrez-Clellen, V. F. , Iglesias, A. , Goldstein, B. , & Bedore, L. M. (2014). Bilingual English–Spanish Assessment (BESA). AR-Clinical Publications. [Google Scholar]
  33. Pieretti, R. A. , & Roseberry-McKibbin, C. (2016). Assessment and intervention for English language learners with primary language impairment: Research-based best practices. Communication Disorders Quarterly, 37(2), 117–128. 10.1177/1525740114566652 [DOI] [Google Scholar]
  34. Place, S. , & Hoff, E. (2011). Properties of dual language exposure that influence 2-year-olds' bilingual proficiency. Child Development, 82(6), 1834–1849. 10.1111/j.1467-8624.2011.01660.x [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Polišenská, K. , Chiat, S. , & Roy, P. (2015). Sentence repetition: What does the task measure? International Journal of Language & Communication Disorders, 50(1), 106–118. 10.1111/1460-6984.12126 [DOI] [PubMed] [Google Scholar]
  36. Pratt, A. S. , Peña, E. D. , & Bedore, L. M. (2020). Sentence repetition with bilinguals with and without DLD: Differential effects of memory, vocabulary, and exposure. Bilingualism: Language and Cognition, 24(2), 305–318. 10.1017/s1366728920000498 [DOI] [Google Scholar]
  37. R Core Team. (2021). R: A language and environment for statistical computing (R Version 4.1.1) . R Foundation for Statistical Computing. https://www.R-project.org/ [Google Scholar]
  38. Revelle, W. R. (2021). psych: Procedures for personality and psychological research (2.1.6) . Northwestern University. https://CRAN.R-project.org/package=psych [Google Scholar]
  39. Ribot, K. M. , Hoff, E. , & Burridge, A. (2018). Language use contributes to expressive language growth: Evidence from bilingual children. Child Development, 89(3), 929–940. 10.1111/cdev.12770 [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Ronderos, J. , Castilla-Earls, A. , & Marissa Ramos, G. (2022). Parental beliefs, language practices and language outcomes in Spanish–English bilingual children. International Journal of Bilingual Education and Bilingualism, 25(7), 2586–2607. 10.1080/13670050.2021.1935439 [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Samson, J. F. , & Lesaux, N. K. (2009). Language-minority learners in special education: Rates and predictors of identification for services. Journal of Learning Disabilities, 42(2), 148–162. 10.1177/0022219408326221 [DOI] [PubMed] [Google Scholar]
  42. Seeff-Gabriel, B. , Chiat, S. , & Dodd, B. (2010). Sentence imitation as a tool in identifying expressive morphosyntactic difficulties in children with severe speech difficulties. International Journal of Language & Communication Disorders, 45(6), 691–702. 10.3109/13682820903509432 [DOI] [PubMed] [Google Scholar]
  43. Silva-Corvalán, C. , & Treffers-Daller, J. (2015). Digging into dominance: A closer look at language dominance in bilinguals. In Silva-Corvalán C. & Treffers-Daller J. (Eds.), Language dominance in bilinguals: Issues of measurement and operationalization (pp. 1–14). Cambridge University Press. 10.1017/CBO9781107375345.001 [DOI] [Google Scholar]
  44. Simon-Cereijido, G. , & Méndez, L. I. (2018). Using language-specific and bilingual measures to explore lexical–grammatical links in young Latino dual-language learners. Language, Speech, and Hearing Services in Schools, 49(3), 537–550. 10.1044/2018_LSHSS-17-0058 [DOI] [PubMed] [Google Scholar]
  45. Sullivan, A. L. (2011). Disproportionality in special education identification and placement of English language learners. Exceptional Children, 77(3), 317–334. 10.1177/001440291107700304 [DOI] [Google Scholar]
  46. Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer-Verlag. 10.1007/978-3-319-24277-4 [DOI] [Google Scholar]
  47. Wofford, M. C. , Cano, J. , Goodrich, J. M. , & Fitton, L. (2022). Tell or retell? The role of task and language in Spanish–English narrative microstructure performance. Language, Speech, and Hearing Services in Schools, 53(2), 511–531. 10.1044/2021_lshss-21-00055 [DOI] [PubMed] [Google Scholar]
  48. Zucker, T. A. , Carlo, M. S. , Montroy, J. J. , & Landry, S. H. (2021). Pilot test of the Hablemos Juntos Tier 2 academic language curriculum for Spanish-speaking preschoolers. Early Childhood Research Quarterly, 55, 179–192. 10.1016/J.ECRESQ.2020.11.009 [DOI] [Google Scholar]

Associated Data

Supplementary Materials

Supplemental Material S1. Histograms and Q-Q plots for the Spanish-only, English-only, and conceptual vocabulary scores. The histograms depict the distribution of each score, and the Q-Q plots compare each score's distribution with a theoretical normal distribution.

Data Availability Statement

The data sets generated and/or analyzed during this study are available from the corresponding author on reasonable request.


Articles from Journal of Speech, Language, and Hearing Research : JSLHR are provided here courtesy of American Speech-Language-Hearing Association
