Journal of Speech, Language, and Hearing Research. 2023 Jul 25;66(8):2671–2687. doi: 10.1044/2023_JSLHR-22-00495

Longitudinal Examination of Morphosyntactic Skills in Bilingual Children: Spanish and English Standardized Scores

Anny Castilla-Earls, Juliana Ronderos, David J Francis
PMCID: PMC10555469  PMID: 37490611

Abstract

Purpose:

This study aimed to examine changes in English and Spanish morphosyntactic standardized scores over time in bilingual children.

Method:

One hundred bilingual children participated in this longitudinal study. The average age of the children at the beginning of the study was 5;11 (years;months). A subset of the participants was identified as children with developmental language disorder (DLD, n = 43). Children completed behavioral testing in Spanish and English at three time points over a period of 2 years. Growth curve modeling was employed to analyze longitudinal data.

Results:

Distinct patterns of Spanish and English language growth were observed. While the average standard score in English increased, the average score in Spanish decreased over time for both groups. Children with DLD showed persistent language difficulties in both Spanish and English over time in comparison to their peers.

Conclusions:

The results of this study provide evidence of a shift in language proficiency from Spanish to English for bilingual children with and without language disorders. This study also shows that bilingual children with DLD show a protracted but parallel growth in morphosyntactic skills in comparison to children without DLD.

Supplemental Material:

https://doi.org/10.23641/asha.23671464


Spanish–English bilingual children form the largest group of bilingual children in the United States (U.S. Census Bureau, 2021). These children are often overidentified or underidentified as having language disorders and are disproportionally represented in special education (Artiles et al., 2002; Morgan et al., 2017; Samson & Lesaux, 2009). One of the potential reasons for the misidentification of bilingual children is that their morphosyntactic skills are highly variable, which poses significant challenges for the differentiation between typical and atypical language development (Bedore & Leonard, 2001; Castilla-Earls et al., 2019; Goebel-Mahrle & Shin, 2020; Gusewski & Rojas, 2017). In English, typically developing (TD) bilingual children make grammatical errors while they are in the process of learning the language. For instance, Gusewski and Rojas (2017) reported that TD bilingual children were about 60% accurate in tense marking productions in preschool but reached mastery after 2 years of schooling in English. In Spanish, TD bilingual children make grammatical errors because the sociolinguistic context does not always continue supporting the learning of Spanish (e.g., Montrul & Potowski, 2007; Shin, 2018). For example, in a cross-sectional study, Goebel-Mahrle and Shin (2020) found that older TD bilingual children made more grammatical clitic errors in Spanish than younger bilingual children, suggesting that the changes in input over time might have an effect on grammatical skills in Spanish. This pattern of increase in English morphosyntactic skills and decrease in Spanish morphosyntactic skills also seems evident when examining standard scores from morphosyntactic standardized assessments. 
While the English morphosyntax standard scores of the Bilingual English–Spanish Oral Screener (Peña et al., 2011) increased from 92 in first grade to 103 in third grade, the morphosyntax standard scores in Spanish decreased from 70 in first grade to 52 in third grade (Bedore et al., 2016), suggesting that older bilingual children performed worse on grammatical tasks in comparison to younger bilingual children in this cross-sectional study. It is important to note that most studies examining morphosyntactic developmental patterns in Spanish–English bilingual children tend to be cross-sectional in design and focus on TD bilingual children. The overall goal of this study is to examine longitudinal changes in morphosyntax skills in Spanish–English bilingual children with and without developmental language disorders (DLDs).

DLD in Bilingual Children

Children with DLDs show language learning difficulties, most detectable in the area of morphosyntax, that are not attributable to neurological, cognitive, or sensory concomitant disorders (Bishop et al., 2016, 2017; Leonard, 2014). Children with DLD produce sentences that are less complex and less grammatical than those of children with typical language skills of the same age. In bilingual children, these morphosyntactic difficulties are manifested in both languages, although the representations of these grammatical problems vary according to the language (Bedore & Leonard, 2001; Paradis, 2005). Therefore, current recommendations for the identification of language disorders in bilingual children are to evaluate them in both languages to obtain a complete picture of their skills in each language (Kohnert, 2010; Paradis, 2016).

The gold standard for the identification of bilingual children with language disorders is to use a combination of measures to find converging evidence of the disorder (Castilla-Earls et al., 2020). That is, no single measure or indicator should be used for identification purposes. Instead, evidence from various measures and/or indicators should be used to determine if a language disorder is present in bilingual children. Standardized assessment is one of the potential tools that could be used as part of the converging evidence framework for the identification of language disorders in bilingual children. Moreover, speech-language pathologists (SLPs) often use standardized tests and tend to rely on standard scores during the diagnostic decision-making process (e.g., Fulcher-Rood et al., 2018, 2019). Thus, understanding how the results of standardized tests change over time in bilingual children with and without DLD would improve our understanding of language development in bilingual children and provide useful information that can be used to minimize misidentification of DLD in bilingual children.

One of the standardized tests available for bilingual children in the United States is the Bilingual English–Spanish Assessment (BESA; Peña et al., 2018). The BESA stands out as a test uniquely designed for the identification of 4- to 6-year-old bilingual children with DLD because it was standardized on bilingual children in the United States. This test includes the assessment of both languages, provides a standard score by language, and uses the score in the best language to estimate the presence or absence of a language disorder using age-specific cutoffs. The authors of the BESA suggest that this test is useful for the (a) identification of language disorders in bilingual children, (b) documentation of language dominance by language domain (e.g., phonology, morphology), and (c) documentation of progress after treatment. The BESA has good diagnostic accuracy for the identification of language disorders, as reported in the testing manual. For example, the sensitivity of the morphosyntax subtest using the best language ranges between 90% and 96%, and the specificity between 83% and 90%, depending on the age group. Furthermore, Lazewnik et al. (2019) conducted an independent examination of the diagnostic accuracy of the BESA and reported that it had a sensitivity of 93% and specificity of 87%, adding further support for the use of this measure with bilingual children. The developers of the BESA also have the Bilingual English–Spanish Assessment–Middle Extension (BESA-ME; Peña et al., 2020), an extension of the BESA for children between the ages of 7 and 12 years. The BESA-ME is not currently available to the public, but there is an experimental version available to researchers. The BESA-ME also uses bilingual administration and the best language for the identification of language disorders and has relatively good diagnostic accuracy. For example, the sensitivity of the morphosyntactic subtest is 89%, and its specificity is 90% for bilingual children in second grade. 
However, the diagnostic accuracy decreases by fourth grade to 71% sensitivity and 77% specificity. Due to the relatively good diagnostic accuracy and the limited availability of bilingual standardized assessments, the BESA and BESA-ME (BESA/ME) are useful diagnostic measures for the identification of DLD in bilingual children between the ages of 4 and 9 years.
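The sensitivity and specificity figures cited above follow the standard definitions: sensitivity is the proportion of children with DLD whom the test correctly flags, and specificity is the proportion of children without DLD whom the test correctly clears. The following is a minimal sketch of that arithmetic. The function name and the counts are hypothetical and chosen only for illustration; they are not data from the BESA manual or from this study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: children with DLD flagged by the test (true positives)
    fn: children with DLD missed by the test (false negatives)
    tn: children without DLD cleared by the test (true negatives)
    fp: children without DLD flagged by the test (false positives)
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(tp=45, fn=5, tn=85, fp=15)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```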

Growth of Language Skills in Children With DLD

Most of the research on the growth of language skills in children with DLD has been conducted on English-speaking monolingual children (Conti-Ramsden et al., 2012; Goffman & Leonard, 2000; Norbury et al., 2017; Rice, 2013; Rice et al., 2006, 2010). One of the central findings from these studies is that children with and without DLD differ in their general level of attainment over time, with children with DLD performing consistently lower than TD children on various measures of language skills (e.g., Conti-Ramsden et al., 2012; Norbury et al., 2017; Rice, 2013; Rice et al., 1998). Therefore, the language difficulties observed in children with DLD are considered to be persistent. A second central finding is that the language skills of TD children and children with DLD grow at the same rate (i.e., there are no statistical differences in their rate of growth) and follow the same growth trajectory (Conti-Ramsden et al., 2012; Norbury et al., 2017; Rice, 2013). This similarity is considered to be a strength in terms of learning mechanisms for children with DLD (Rice, 2013). It is important to add that differences in growth trajectories are observed between measured skills for all children, including children with DLD. For example, receptive vocabulary raw scores show linear growth, while measures such as mean length of utterance show a curvilinear pattern with initial growth followed by a plateau effect in both groups of children (Rice, 2012; 2013). Crucial for this study is the finding that when language skills are measured using standard scores, children with and without DLD tend to maintain their level of attainment over time, particularly after age 6 years (Conti-Ramsden et al., 2012; Norbury et al., 2017). 
That is, although children's language skills are increasing over time, their placement in reference to other children (i.e., the rank order of individuals within a group) tends to be the same over time, indicating that their performance in language measures is stable, particularly after age 6 years (Norbury et al., 2017).

In comparison to the literature on monolingual children, longitudinal studies on morphosyntactic skills in bilingual children with DLD are scarce. Castilla-Earls et al. (2019) examined the growth of the percentage of grammatical utterances (PGU), a broad measure of grammatical skills, in a group of children considered to have typical skills and a group of children at risk of DLD (low grammaticality in both languages at the onset of the study). Among children receiving English instruction, the more typical instructional model in the United States, those at risk of DLD started with a Spanish PGU of about 50% at age 5 years, with steady growth until age 7 years reaching about 80% and plateauing thereafter. In contrast, TD children had, on average, a Spanish PGU of about 94% between the ages of 5 and 7 years, at which point a decline was evident. In English, both groups of children increased their PGU from about 30% at the onset of the study to about 75% at around 9 years of age. The results of this study suggested that bilingual children with and without DLD might have different rates of growth and trajectories in Spanish grammaticality but similar rates of growth and trajectories in English, the less developed language for this group of children at the onset of the study. However, the diagnosis status of these children could not be confirmed in this retrospective study, and therefore, further examination of the growth of morphosyntactic skills in bilingual children with and without DLD is needed to better understand their growth rates and trajectories and potential differences with the growth skills in monolingual children.

Using Standardized Testing to Examine Morphosyntax in Bilingual Children With and Without DLD

Standard scores are used to compare the performance of a child on a task against the performance of their peers of the same age on the same task. In other words, a standard score is a between-children comparison that offers a reference of the performance of a child on a particular task relative to a standardization sample. This between-children comparison is the foundation of most of our language assessment practices because it provides general information about typical versus atypical performance on language tasks. From between-children comparisons, we can determine whether a child is performing below, above, or at the same level as their peers.

Within-child comparisons of standard scores (i.e., how a child performs on a standardized test in comparison to themselves at a different point in time) are used in research studies to examine developmental trajectories of growth in children with and without DLD (e.g., Conti-Ramsden et al., 2012; Norbury et al., 2017) and to document gains after treatment (e.g., Ebert et al., 2014), among other potential uses. From within-child comparisons, we can determine whether a child is performing below, above, or at the same level as their previous performance in terms of their rank order in their group. When there are no within-child changes in standard scores (i.e., a child obtains the same standard score over time), there is growth in skills over time while maintaining the rank order in their group, also known as stability of language performance (Norbury et al., 2017). A decline in standard scores over time for a child indicates that the child's language is not growing at the expected rate in comparison to other children and that there is a change in the performance of the child that affects the ranking of the child in comparison to previous performance. Conversely, an increase in standard scores indicates that a child's language skills are growing faster than would be expected, resulting in a performance at a higher rank in comparison to other children.
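The distinction between growth in raw skills and change in standard scores can be illustrated with a short sketch. It assumes the common standard-score scale with a mean of 100 and a standard deviation of 15; that scale, the function names, and all raw scores and normative values below are assumptions made for illustration and are not taken from the BESA/ME manuals.

```python
def standard_score(raw, norm_mean, norm_sd, scale_mean=100.0, scale_sd=15.0):
    """Convert a raw score to a standard score relative to same-age norms.

    The 100/15 scale is an assumption for illustration.
    """
    z = (raw - norm_mean) / norm_sd
    return scale_mean + scale_sd * z


def interpret_change(score_t1, score_t2):
    """Describe a within-child change in standard scores between two time points."""
    if score_t2 > score_t1:
        return "increase: growing faster than expected relative to peers"
    if score_t2 < score_t1:
        return "decline: not growing at the expected rate relative to peers"
    return "stable: growing while keeping the same rank order"


# Hypothetical child: the raw score rises from 20 to 25, but the
# same-age normative mean rises faster (from 20 to 30).
t1 = standard_score(raw=20, norm_mean=20, norm_sd=5)
t2 = standard_score(raw=25, norm_mean=30, norm_sd=5)
print(t1, t2, interpret_change(t1, t2))
```

In this hypothetical case the child's raw score improves, yet the standard score drops, which is the pattern the paragraph above describes as a decline relative to other children.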

Monolingual children tend to show stability in language standard scores over time probably because their exposure to one language is constant. In other words, their language input is not partitioned into two languages and changing over time, as is the case for bilingual children. Therefore, within-child changes in standard scores for monolingual children are not an effect of changes in exposure to a language but an indication of their stability in language performance (Norbury et al., 2017). For bilingual children, the interpretation of within-child changes in standard scores needs to consider that the bilingual child is exposed to and learning two languages. Unlike monolingual children, bilingual children's input is partitioned into two languages with significant variability in the amount of exposure to each language and the context in which each language is used between children and within a child. For example, a Spanish-speaking child who is just starting to learn English might initially have a lower standard score in English in comparison to their score in Spanish, but that would likely change over time as their exposure to English and Spanish changes. Therefore, their language performance on standardized testing is expected to be more variable in comparison to monolingual children who have had enough experience and input in a language to achieve stability. Consequently, it is crucial to examine within-child changes in Spanish and English standard scores both independently (i.e., in each language to examine language-specific patterns of growth) and in relation to each other (i.e., together to examine a potential shift in language dominance).

This Study

The purpose of this study is to examine changes in performance in the English and Spanish morphosyntax subtests of the BESA and the BESA-ME over time in bilingual children with and without language disorders. We ask the following research question: How do Spanish and English morphosyntactic skills change over time in bilingual children with and without language disorders?

Method

The data for this study were collected under approval from the institutional review board of the University of Houston. This study was originally planned as a 2-year, 3–time-points (every 12 months ±3 months), in-person longitudinal study. Data for Time 1 were collected in person between 2018 and 2019. Planned in-person data collection for Time 2 started in 2019 and was ongoing when the COVID-19 global pandemic affected in-person research protocols in 2020. Therefore, a remote online assessment option was added to continue data collection for Time 2. For Time 3, all participants completed remote online assessments about 24 months (±3 months) after their Time 1 session between 2020 and 2021 (see Procedure section for more details on the transition to remote online assessment). Using this approach, we obtained longitudinal data with in-person assessment data from all participants at Time 1, mixed in-person and remote online assessment data for Time 2, and remote online-only assessment data for all participants at Time 3 (see Table 1 for details).

Table 1.

Demographic information for participants.

Demographic information      Timepoint 1 (N = 100)   Timepoint 2 (N = 90)   Timepoint 3 (N = 81)
                             n (%)                   n (%)                  n (%)
Age in months, M (SD)        70.07 (11.97)           85.04 (13.64)          93.94 (12.64)
Gender
 Female                      45 (45.0)               40 (44.4)              39 (48.1)
 Male                        55 (55.0)               50 (55.6)              42 (51.9)
Receiving SLP services
 No                          54 (54.0)               49 (54.4)              46 (56.8)
 Yes                         46 (46.0)               41 (45.6)              35 (43.2)
Maternal education
 Elementary school           27 (27)
 High school                 30 (30)
 Some college                9 (9)
 Associate's degree          7 (7)
 Bachelor's degree           10 (10)
 Graduate degree             14 (14)
 No response                 3 (3.0)                 3 (3.3)                2 (2.5)
Language(s) at home
 Spanish-only                53 (53.0)               47 (52.2)              43 (53.1)
 Both Spanish and English    44 (44.0)               40 (44.4)              36 (44.4)
 No response                 3 (3.0)                 3 (3.3)                2 (2.5)
Testing modality
 In-person                   100 (100.0)             38 (42.2)              0 (0.0)
 Remote                      0 (0.0)                 52 (57.8)              81 (100.0)

Note. SLP = speech-language pathologist.

Participants

Participants for this study were recruited from schools and the community in the greater Houston area. We recruited bilingual children with the assistance of SLPs and teachers in school districts in the greater Houston area. This strategy allowed us to recruit bilingual children receiving speech-language services in the school (i.e., likely to have DLDs) and bilingual peers from their classrooms who were not receiving services (i.e., likely to not have DLDs). Additional bilingual children receiving and not receiving SLP services were recruited through community outreach and advertising in local speech and language clinics in the greater Houston area.

To be eligible for the study, children needed to (a) be between the ages of 4–8 years at Time 1, (b) speak and understand both Spanish and English as per parent report, (c) pass a hearing screening test, and (d) have a nonverbal IQ score within normal limits. The hearing screening for all children was administered using an otoacoustic emission device at 1000–4000 Hz. Children who failed the hearing screening received a pure-tone screening test at 25 dB HL at 1000, 2000, and 4000 Hz. Nonverbal IQ was considered within normal limits using a score of 70 and above on the Matrices subtest of the Kaufman Brief Intelligence Test–Second Edition (KBIT-2; Kaufman & Kaufman, 2004). The KBIT-2 nonverbal IQ scores have acceptable to good internal consistency reliability (i.e., how well performance on this test generalizes to performance on similar tests), which ranges between .78 and .93 (M = .86 for ages 4–18 years), and good test–retest reliability (i.e., how consistently people would score on the same test taken at different times), which ranges between .76 and .89 (M = .83).

One hundred Spanish–English bilingual children met the eligibility criteria for this study (age at Time 1, in months: M = 70.07, SD = 11.97; male = 55, female = 45). At the onset of the study, 89% of the children were attending Spanish–English bilingual programs (Spanish–English bilingual or Spanish–English dual language), one child was attending a Spanish immersion preschool, and 10 children were attending English-only schools (five of these children were attending an all-day Saturday Spanish school program). Using a parental questionnaire at Time 1, we gathered information about demographics and language used at home (see Table 1). Fifty-seven of the participants' mothers reported having maternal education of high school or below. At Time 1, all parents reported that Spanish was spoken at home. English was also used in the homes of 47% of these children. This descriptive information was only collected at the onset of the study. As shown in Table 1, there was some expected attrition during Time 2 (10%) and Time 3 (10%). There were three main reasons for the attrition of participants during the study: Families decided they did not want to continue with the study (n = 2), families had moved outside of the United States (n = 1), and loss of contact with the participant by study staff (n = 16).
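The attrition percentages can be checked against the sample sizes in Table 1. The sketch below assumes that each percentage is computed relative to the number of children who took part at the previous time point; that reading is an assumption, since the text does not state the denominator. The variable and function names are hypothetical.

```python
# Sample sizes at Time 1, Time 2, and Time 3 (from Table 1).
sample_sizes = [100, 90, 81]


def attrition_rates(sizes):
    """Proportion of children lost at each wave, relative to the previous wave."""
    return [(prev - curr) / prev for prev, curr in zip(sizes, sizes[1:])]


rates = attrition_rates(sample_sizes)
for wave, rate in zip(("Time 2", "Time 3"), rates):
    print(f"{wave}: {rate:.0%} attrition")

# Reported reasons for attrition across the whole study.
reasons = {"declined to continue": 2, "moved abroad": 1, "lost contact": 16}
total_lost = sample_sizes[0] - sample_sizes[-1]
print(sum(reasons.values()) == total_lost)
```

Under this reading, the 19 children lost between Time 1 and Time 3 match the sum of the three reported reasons.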

Diagnosis Classification

Bilingual children with DLD were identified using a converging evidence approach (Castilla-Earls et al., 2020). For children to qualify as DLD, they had to meet two out of three of the following criteria:

  1. The child was receiving or being evaluated for eligibility for speech-language services in the school or in a clinic. Although we recruited children who were receiving services for language, it is possible that some of these children may have also had other types of communication disorders (e.g., speech sound disorder, or fluency). Forty-six of the children enrolled in this study were receiving or were being evaluated for speech-language services, and 54 children were not receiving services at Time 1.

  2. The child had a qualifying score from standardized testing at Time 1. We followed the principles of standardized testing interpretation in the work of Castilla-Earls et al. (2019). First, we used the BESA/ME morphosyntax “best language” standard score, the 95% confidence interval (CI), and the age-based cutoff for each test. We administered the BESA to 84 children and the BESA-ME to 16 children at Time 1. For the BESA, a child was considered to have a qualifying score when the full range of scores using the CI was below the cutoff. Forty children in this study had a CI above the age-based cutoff, and 29 had a CI under the cutoff. When the CI included the cutoff (15 children), we used the scaled scores of the sentence repetition subtest in the Clinical Evaluation of Language Fundamentals for both languages (M = 10, SD = 3; Clinical Evaluation of Language Fundamentals–Fourth Edition [CELF-4] for Spanish, Semel et al., 2003; CELF-5 for English, Wiig et al., 2013) to improve our confidence in the results of the standardized testing for these children. The Sentence Repetition subtest from the CELF-5 in English and CELF-4 in Spanish were chosen because sentence repetition is a well-established measure of language ability (Archibald & Joanisse, 2009; Armon-Lotem & Meir, 2016). Both CELFs were standardized in the United States, with the norming sample of the Spanish CELF-4 representative of the Spanish-speaking Hispanic population of the United States, including Spanish–English bilingual children. For 12 of these 15 children, the sentence repetition scaled score in both languages was equal to 7 or less; therefore, these 12 children were considered to have a qualifying score from standardized testing. The remaining three children with sentence repetition scaled scores above 7 in both languages were not considered to have a qualifying score. For the BESA-ME experimental version, there was no CI. We followed the same steps for classification as for the BESA, but we considered borderline scores to be in the 78–92 range. Twelve of the children who took the BESA-ME had a score over 92, and one had a score under 78. Of the three children with borderline scores, two had scores below 7 in the Sentence Repetition subtest and were therefore considered to have a qualifying score from standardized testing. Using this approach to obtain a qualifying score from standardized testing, 45 children qualified under the standardized testing criteria, and 55 did not.

  3. The child had a qualifying score from spontaneous language samples. First, we calculated the PGU and the mean length of utterance in words from Spanish and English language samples elicited through both story retell and story generation tasks using Frog stories. Then, we used the cutoff provided in the work of Hernandez et al. (2023), which combines grammaticality and utterance length in the best language with adequate diagnostic accuracy (sensitivity: 90.3%; specificity: 86.8%). For children who were older than 7 years, we did not have age-based cutoffs for PGU, so we used PGU under 80% as the cutoff in their best language (Restrepo, 1998). This approach resulted in 40 children with a qualifying score and 60 children who did not obtain a qualifying score.
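The decision steps described for the BESA standardized-testing criterion (the second criterion above) can be sketched as a small function. This is an illustrative reading of the text, not the authors' scoring code. The function name, the example cutoff of 85, and the example CIs are hypothetical. The text describes only children whose sentence repetition scores were at or below 7 in both languages, or above 7 in both; how a mixed pattern would be handled is not stated, so the sketch assumes that both scores must be 7 or less to qualify.

```python
def besa_qualifying_score(ci_low, ci_high, cutoff, sr_spanish=None, sr_english=None):
    """Return True if a child has a qualifying (low) standardized score.

    ci_low, ci_high: bounds of the 95% CI around the best-language standard score
    cutoff: age-based cutoff from the test manual
    sr_spanish, sr_english: sentence repetition scaled scores (M = 10, SD = 3),
        consulted only when the CI includes the cutoff
    """
    if ci_high < cutoff:
        return True   # the full CI is below the cutoff
    if ci_low > cutoff:
        return False  # the full CI is above the cutoff
    if sr_spanish is None or sr_english is None:
        raise ValueError("CI includes the cutoff: sentence repetition scores are needed")
    # Assumption: qualify only when both scaled scores are 7 or less.
    return sr_spanish <= 7 and sr_english <= 7


# Hypothetical cutoff and scores, for illustration only.
print(besa_qualifying_score(70, 82, cutoff=85))
print(besa_qualifying_score(90, 102, cutoff=85))
print(besa_qualifying_score(80, 92, cutoff=85, sr_spanish=6, sr_english=7))
print(besa_qualifying_score(80, 92, cutoff=85, sr_spanish=9, sr_english=10))
```

For the BESA-ME, which had no CI in its experimental version, the same logic would apply with the 78–92 borderline range standing in for the interval around the cutoff.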

Figure 1 depicts the overlap in qualifying criteria for the classification of DLD in this study. There were 41 children who did not meet any qualifying criteria and 16 children who met only one of the criteria (seven children with a qualifying score from language samples, four children with a qualifying score from standardized assessment, and five children who were receiving services). These 57 children were considered to have TD language skills. Recall that children had to meet at least two out of the three criteria to classify as DLD. There were 15 children who met two criteria (10 receiving services and a qualifying standardized score, two with a qualifying score in both standardized testing and language samples, and three with qualifying scores in language samples who were also receiving services). In addition, there were 28 children who qualified using all three criteria.

Figure 1.

A Venn diagram consisting of 3 circles. The circles are labeled Language Sample, Standardized Assessment, and Speech-Language Services and the blue numbers marked in these circles are 7, 4, and 5, respectively. A white number, 28, is marked in the region where all the 3 circles overlap. A white number, 2, is marked in the overlapping region of circles labeled Language Sample and Standardized Assessment. A white number, 10, is marked in the overlapping region of circles labeled Standardized Assessment, and Speech-Language Services. A white number, 3, is marked in the overlapping region of circles labeled Language Sample, and Speech-Language Services. A blue number, 41, is marked in the universal set.

Classification criteria. The Venn diagram depicts the classification categorization. White numbers represent children classified as children with developmental language disorder. Blue numbers represent children classified as typically developing.

This classification process resulted in 43 children classified as children with DLD (M age = 67.5, SD = 11.9 in months) and 57 children classified as children with TD (M age = 72.0, SD = 11.7 in months) at the onset of the study. There were 29 boys (67%) in the DLD group and 26 (46%) in the TD group. Forty-one children in the DLD group (95%) were receiving speech-language services at the onset of the study.
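The two-out-of-three rule and the resulting group sizes can be reproduced from the region counts reported for Figure 1. The sketch below is illustrative only: the function and variable names are hypothetical, and the counts are those given in the text for each region of the Venn diagram.

```python
from collections import Counter


def classify(services, standardized, language_sample):
    """Apply the converging evidence rule: DLD if at least two of three criteria are met."""
    criteria_met = sum([services, standardized, language_sample])
    return "DLD" if criteria_met >= 2 else "TD"


# Region counts from Figure 1, keyed by
# (receiving services, qualifying standardized score, qualifying language sample).
venn_counts = {
    (False, False, False): 41,
    (False, False, True): 7,
    (False, True, False): 4,
    (True, False, False): 5,
    (True, True, False): 10,
    (False, True, True): 2,
    (True, False, True): 3,
    (True, True, True): 28,
}

totals = Counter()
for flags, n_children in venn_counts.items():
    totals[classify(*flags)] += n_children

print(totals["DLD"], totals["TD"])
```

Summing the regions in this way yields the 43 children with DLD and 57 TD children reported above.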

Measures

Morphosyntax

To measure morphosyntax abilities longitudinally in both English and Spanish, we used standard scores in the Morphosyntax subtest of the BESA (Peña et al., 2018) and the BESA-ME field test version (Peña et al., 2016). The BESA/ME are standardized language assessments for Spanish–English bilingual children. The BESA is used to assess language skills in children ages 4;0–6;11 (years;months) while the BESA-ME is an experimental measure from the authors of the BESA used to assess language in children ages 7;0–9;11. Both assessments have been normed with Spanish–English bilingual children in the United States and assess language abilities in both Spanish and English. The Morphosyntax subtests of the BESA/ME consist of a cloze section and a sentence repetition section targeting a variety of grammatical morphemes and sentence structures that have been shown to be clinical markers for DLD in English and Spanish. In Spanish, the BESA cloze task examines articles, clitics, present progressive, and subjunctive mood. In English, the BESA cloze task examines possessive –s, third-person singular, plural nouns, past and present progressive, copula BE, negatives, and passives. In the English BESA-ME, the cloze task examines possessive –s, regular and irregular past, relative clauses, passives, and questions. In Spanish, the BESA-ME cloze task assesses irregular past, relative clauses, subjunctive, imperfect, and adjective agreement.

Since the children in this study ranged from 4;0 to 8;2 at the onset of the study, both versions of the assessment were administered at each timepoint as follows, Time 1: 85 children were administered the BESA and 15 children the BESA-ME; Time 2: 41 children were administered the BESA and 49 the BESA-ME; Time 3: 18 children were administered the BESA and 63 the BESA-ME. In this study, we followed the bilingual administration of the BESA/ME Morphosyntax subtests specified in the test manuals. The “best language” standard score and its 95% CI, along with the age-based cutoff criteria in the test manuals, served as one of the converging criteria to categorize participants as TD (typical development) or DLD. In contrast, the standard scores in each language were used to longitudinally estimate the growth of morphosyntactic skills in Spanish and English.

Procedure

Recruitment for this study was part of a larger longitudinal effort to examine how languages develop over time in Spanish–English bilingual children with and without DLD. The recruitment strategy was targeted at schools with bilingual programs in the greater Houston area. SLPs in the schools identified eligible bilingual children in their caseload and sent information about our study home with them. Eligible bilingual children from the same classrooms as the children receiving services also received information about the study. Parental consent documents and parent language questionnaires were sent to interested families via the regular school–home communication methods at Time 1. Parental consent was received to participate for the 2 years of the longitudinal study at Time 1. However, due to the COVID-19 pandemic, we had to stop in-person data collection after collecting Time 2 data for 38 of the participants. A communication was sent to all parents of children enrolled to inform them of the switch to remote online-only data collection for the remainder of the study. A new parental consent was collected via an online form from parents who wanted to continue in the study to allow for the collection of data for remaining participants for Time 2 and Time 3 with additional permission to audio- and video-record the entire remote session. The new parental consent form was mobile-friendly, was available in both Spanish and English, and detailed all the changes resulting from the shift to remote data collection. For the remainder of Time 2 participants who agreed to continue with the study, we administered sessions fully remotely approximately 15–18 months from Time 1. All the sessions for Time 3 took place remotely, approximately 24 months from Time 1 (±3 months).

At Time 1, each child was administered three assessment sessions in person: general cognitive skills, Spanish language skills, and English language skills. All sessions at Time 1 were conducted in person in quiet classrooms at the school or in a quiet room in the child's home. At Time 2 and Time 3, children were invited back for only the two language skills assessment sessions. Each session lasted approximately 50 min and took place on separate days within 2 weeks of each other. The order of administration for each session and the order of administration for assessments within a session were randomized for each participant. Native English and Spanish bilingual research assistants administered all sessions entirely in the language being evaluated. Research assistants were blinded to whether participants were receiving speech and language services in the school or at a speech-language clinic and to the results of the other sessions to minimize interviewer bias during data collection. The general cognitive skills session included the KBIT-2 for nonverbal IQ in addition to standardized and experimental assessments of executive function (e.g., working memory, attention). The language sessions included various assessments of language abilities, including the BESA/ME, the CELF sentence repetition, receptive vocabulary, and language samples.

Our remote testing procedures were conducted via video conference calls with participants using Zoom software (Zoom Video Communications, 2020). Stimuli for the BESA/ME Cloze section of the Morphosyntax subtest were presented using MS PowerPoint slides after obtaining authorization from the authors to do so. Examiners used screen sharing functionality in Zoom to show children the stimuli that were relevant for the Cloze section of the BESA/ME. To ensure that children maintained interest for the duration of the session, examiners used engaging conversation and a variety of backgrounds for the session. All remote sessions were video-recorded.

Data Analysis Plan

To examine growth trajectories of Spanish and English morphosyntactic skills using the Morphosyntax subtest of the BESA/ME, we employed growth curve modeling. We used an approach similar to that of Castilla-Earls et al. (2019) to examine Spanish and English scores simultaneously in a single measurement model. To do so, we structured the data in a hyper-univariate form, so that all outcome scores were stacked into a single variable that was identified by the child's ID, the language of the observation (Spanish or English), and the age of the child at the time of the observation. All models were estimated using the MIXED command in Stata Version 16 (StataCorp, 2019). Time-invariant covariates, such as maternal education, were replicated on each record for the child, while time-varying covariates, such as age, changed from one record to the next for a given child.
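The stacking step described above can be illustrated with a minimal sketch. The field names (`child_id`, `maternal_education`, `visits`, `age_months`) and the example values are hypothetical and are not taken from the study's data files; the sketch only shows the general idea of placing Spanish and English scores in one outcome column while replicating time-invariant covariates on every record.

```python
from typing import Dict, List


def stack_scores(children: List[Dict]) -> List[Dict]:
    """Stack Spanish and English scores into a single outcome column."""
    rows = []
    for child in children:
        for visit in child["visits"]:
            for language in ("Spanish", "English"):
                score = visit.get(language)
                if score is None:
                    # A missing observation simply contributes no record.
                    continue
                rows.append({
                    "child_id": child["child_id"],
                    "language": language,
                    # Time-varying covariate: changes from record to record.
                    "age_months": visit["age_months"],
                    # Time-invariant covariate: replicated on each record.
                    "maternal_education": child["maternal_education"],
                    "score": score,
                })
    return rows


# Hypothetical child with three visits and one missing English score.
example = [
    {
        "child_id": 1,
        "maternal_education": 12,
        "visits": [
            {"age_months": 60, "Spanish": 95, "English": 80},
            {"age_months": 72, "Spanish": 92, "English": 88},
            {"age_months": 84, "Spanish": 90, "English": None},
        ],
    },
]

long_rows = stack_scores(example)
for row in long_rows:
    print(row)
```

Each stacked record is identified by child, language, and age, which is the structure a single mixed model needs in order to estimate language-specific intercepts and slopes.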

In Model 1, our baseline model, we estimated a measurement model that (a) estimated separate intercepts for Spanish and English outcomes, (b) allowed the intercepts to vary by child and language, and (c) partitioned the variance separately for Spanish and English. The equation for Model 1 is

\[ \text{BESA/ME}_{itl} = \gamma_{00l} + u_{0il} + \epsilon_{itl}, \tag{1} \]

where \(\text{BESA/ME}_{itl}\) is the standard score in language l for person i at time t. Thus, l is an index designating the language of the outcome (Spanish or English); i is an index designating the individual, ranging from 1 to \(N_g\), the number of individuals in the group; and t is an index designating time, ranging from 1 to 3. Equation 1 defines a person-and-language–specific intercept, \(u_{0il}\), which varies randomly across people, and a random error residual, \(\epsilon_{itl}\), that captures the extent to which the score in language l for person i at time t deviates from the expected score in that language plus the person-specific residual in that language. In summary, Model 1 allows the estimation of predicted values of BESA/ME standard scores by child and language and allows the variance to differ by language.

Model 2 included a growth parameter, with age in months measuring time, and allowed the slope for age to vary randomly across children. Age was included in the model using two variables. The first variable, \(\overline{\text{age}}_{\text{child}}\), centered age at the age of the child at the onset of the study (Time 1) for each child. This variable allows the examination of age within child (i.e., the effect of age for a child in comparison to themselves). The second variable, \(\overline{\text{age}}_{\text{group}}\), centered age at the average age of all children at the onset of the study (Time 1), which allows the examination of the impact of age between children (i.e., the effect of age for a child in comparison with children of average age at Time 1, or the extent to which scores differ between children due to differences in average age). This model also included the interaction between \(\overline{\text{age}}_{\text{child}}\) and \(\overline{\text{age}}_{\text{group}}\) to account for the possibility that individual growth rates vary as a function of the age of the child at the first assessment. Therefore, the intercept in Model 2 is interpreted as the BESA/ME standard score at the onset of the study for a child of average age at the onset of the study. These age effects are estimated separately for Spanish and English. The equation for Model 2 is

\[ \text{BESA/ME}_{itl} = \gamma_{00l} + u_{0il} + \gamma_{10l}\,\overline{\text{age}}_{\text{child},it} + \gamma_{20l}\,\overline{\text{age}}_{\text{group},it} + \gamma_{30l}\,\overline{\text{age}}_{\text{child},it} \times \overline{\text{age}}_{\text{group},it} + u_{1il}\,\overline{\text{age}}_{\text{child},it} + \epsilon_{itl}. \tag{2} \]
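As a minimal sketch of the two age variables, the code below computes a within-child and a between-children age term plus their product. It rests on one reading of the description above, which is an assumption: the within-child variable is the child's current age minus that child's own Time 1 age, and the between-children variable is the child's Time 1 age minus the sample mean of Time 1 ages. The function name, child labels, and ages are hypothetical.

```python
from statistics import mean


def center_ages(ages_by_child):
    """ages_by_child maps a child ID to ages in months, Time 1 first."""
    # Average age of all children at the onset of the study (Time 1).
    mean_age_t1 = mean(ages[0] for ages in ages_by_child.values())
    rows = []
    for child_id, ages in ages_by_child.items():
        # Between-children term (assumed constant within a child).
        age_group = ages[0] - mean_age_t1
        for age in ages:
            # Within-child term: months elapsed since the child's Time 1.
            age_child = age - ages[0]
            rows.append({
                "child_id": child_id,
                "age_months": age,
                "age_child": age_child,
                "age_group": age_group,
                "interaction": age_child * age_group,
            })
    return rows


# Two hypothetical children tested three times, 12 months apart.
centered = center_ages({"A": [60, 72, 84], "B": [80, 92, 104]})
for row in centered:
    print(row)
```

Under this coding, both age terms are zero at Time 1 for a child of average age, which is why the model intercept can be read as the expected score for such a child at the onset of the study.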

Model 3 was designed to explain the variability between children in BESA/ME standard scores after accounting for age effects. In Model 3, we introduced a Level 2 variable: language ability (TD = 0, DLD = 1). The intercept in this model is then interpreted as the BESA/ME standard score in language l for a child of average age with typical language skills at the onset of the study. This variable allowed the estimation of the difference in intercepts, separately for Spanish and English, between a child of average age classified as DLD at the onset of the study and a child of average age with typical language skills at the onset of the study. Finally, Model 4 included an interaction between DLD and \(\overline{\text{age}}_{\text{child}}\) to estimate whether there was a differential effect of within-child age for children with DLD.

Models 2 through 4 allowed the estimation of the correlation between the intercept and slope random effects to assess the relationship between these two effects. Model fit was compared across models using the likelihood ratio test (LRT) and the Akaike information criterion (AIC). AIC is an indicator of model fit that penalizes the number of parameters added when conducting model comparisons. Smaller values of AIC indicate better model fit.
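The two fit indices can be sketched in a few lines. The log-likelihoods and parameter counts below are hypothetical placeholders, not values from the fitted models; the formulas are the standard definitions (AIC = 2k − 2 ln L, and the LRT statistic is twice the difference in log-likelihoods between nested models). The upper-tail chi-square probability is implemented only for 1 or 2 degrees of freedom, where simple closed forms exist.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion; smaller values indicate better fit."""
    return 2 * n_params - 2 * log_likelihood


def lrt_statistic(ll_reduced, ll_full):
    """Likelihood ratio statistic for two nested models."""
    return 2 * (ll_full - ll_reduced)


def chi2_sf(x, df):
    """Upper-tail chi-square probability (closed forms for df 1 and 2 only)."""
    if x < 0:
        raise ValueError("the statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    raise ValueError("this sketch supports df = 1 or df = 2 only")


# Hypothetical nested models: the full model adds one parameter.
ll_reduced, k_reduced = -2100.0, 6
ll_full, k_full = -2095.0, 7

stat = lrt_statistic(ll_reduced, ll_full)
p_value = chi2_sf(stat, df=k_full - k_reduced)
aic_reduced = aic(ll_reduced, k_reduced)
aic_full = aic(ll_full, k_full)
print(stat, p_value, aic_reduced, aic_full)
```

In this hypothetical comparison the added parameter improves the log-likelihood by enough to offset the AIC penalty, so both indices favor the fuller model.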

Results

Descriptive results for BESA/ME standard scores by age are shown in Table 2. Random and fixed effects are shown for all models in Tables 3 and 4. Correlations between BESA/ME variables in Spanish and English over time are included in Supplemental Material S1. The results of Model 1 suggested that there was enough variance in the model at Level 1, within child, and at Level 2, between children, to proceed with the measurement models. The variance between children was larger than the variance within child in terms of standard scores in the BESA/ME. The overall average BESA/ME standard score was 81 in Spanish and 87 in English for all children in the study across all time points. AIC in Model 1 was 4,360.29.
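The comparison of between-children and within-child variance can be made concrete with the Model 1 estimates reported in Table 4. The sketch below computes the share of total variance that lies between children; this proportion is a standard summary that we compute here for illustration, and it is not a statistic reported in the study itself.

```python
def between_child_share(level1_var, level2_var):
    """Proportion of total variance located between children (Level 2)."""
    return level2_var / (level1_var + level2_var)


# Model 1 variance estimates from Table 4 (Level 1 = within child,
# Level 2 = between children).
model1 = {
    "Spanish": {"level1": 110.89, "level2": 299.81},
    "English": {"level1": 78.48, "level2": 287.23},
}

shares = {
    language: between_child_share(v["level1"], v["level2"])
    for language, v in model1.items()
}
for language, share in shares.items():
    print(f"{language}: {share:.2f} of the variance is between children")
```

In both languages, most of the variability in standard scores reflects stable differences between children rather than change within a child over time.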

Table 2.

BESA/ME descriptive information by age.

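Cells show M (SD), n for each age in years.

| Language and group | Age 4 | Age 5 | Age 6 | Age 7 | Age 8 | Age 9 |
|---|---|---|---|---|---|---|
| Spanish, TD | 102.1 (14.3), 8 | 92.9 (20.6), 28 | 93.2 (16.1), 35 | 83.4 (22.1), 42 | 85.2 (22.4), 28 | 91.2 (18.8), 13 |
| Spanish, DLD | 71.4 (3.5), 12 | 74.3 (7.9), 27 | 75.1 (14.0), 32 | 64.3 (17.1), 22 | 60.7 (14.7), 16 | 52 (0), 3 |
| English, TD | 81.0 (15.6), 8 | 93.5 (19.0), 26 | 96.2 (16.0), 35 | 102.8 (14.2), 43 | 99.9 (12.7), 27 | 105.7 (7.8), 13 |
| English, DLD | 66.4 (6.2), 11 | 70.5 (10.7), 26 | 71.8 (12.1), 31 | 79.9 (13.5), 21 | 76.9 (17.1), 16 | 84.3 (21.3), 3 |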

Note. BESA-ME = Bilingual English–Spanish Assessment–Middle Extension; BESA/ME = BESA and BESA-ME; TD = typically developing; DLD = developmental language disorder.

Table 3.

Fixed effects.

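| Model and parameter | Spanish estimate | Spanish SE | Spanish Sig | English estimate | English SE | English Sig | Difference between Spanish & English estimates |
|---|---|---|---|---|---|---|---|
| Model 1 | | | | | | | |
| BESA/ME | 80.55 | 1.86 | *** | 86.77 | 1.80 | *** | |
| Model 2 | | | | | | | |
| BESA/ME | 82.58 | 1.77 | *** | 82.42 | 1.74 | *** | |
| \(\overline{\text{age}}_{\text{child}}\) | −0.18 | 0.07 | ** | 0.37 | 0.05 | *** | p < .001 |
| \(\overline{\text{age}}_{\text{group}}\) | < −0.01 | 0.15 | | 0.58 | 0.15 | *** | p = .005 |
| \(\overline{\text{age}}_{\text{child}} \times \overline{\text{age}}_{\text{group}}\) | −0.01 | < 0.01 | | −0.02 | < 0.01 | *** | p = .072 |
| Model 3 | | | | | | | |
| BESA/ME | 91.27 | 1.97 | *** | 92.39 | 1.72 | *** | |
| \(\overline{\text{age}}_{\text{child}}\) | −0.18 | 0.07 | ** | 0.38 | 0.05 | *** | p < .001 |
| \(\overline{\text{age}}_{\text{group}}\) | −0.17 | 0.13 | | 0.41 | 0.11 | | p < .001 |
| DLD | −20.25 | 3.03 | *** | −23.45 | 2.60 | *** | p = .420 |
| \(\overline{\text{age}}_{\text{child}} \times \overline{\text{age}}_{\text{group}}\) | −0.01 | 0.01 | | −0.02 | < 0.01 | *** | p = .077 |
| Model 4 | | | | | | | |
| BESA/ME | 91.28 | 1.98 | *** | 92.15 | 1.76 | *** | |
| \(\overline{\text{age}}_{\text{child}}\) | −0.19 | 0.09 | ** | 0.40 | 0.06 | *** | p < .001 |
| \(\overline{\text{age}}_{\text{group}}\) | −0.17 | 0.13 | | 0.41 | 0.11 | *** | p = .001 |
| DLD | −20.25 | 3.05 | *** | −22.88 | 2.72 | *** | p = .521 |
| DLD × \(\overline{\text{age}}_{\text{child}}\) | 0.01 | 0.14 | | −0.06 | 0.09 | | p = .719 |
| \(\overline{\text{age}}_{\text{child}} \times \overline{\text{age}}_{\text{group}}\) | −0.01 | 0.01 | | −0.02 | < 0.01 | *** | p = .086 |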

Note. Sig = significance; BESA-ME = Bilingual English–Spanish Assessment–Middle Extension; BESA/ME = BESA and BESA-ME; DLD = developmental language disorder.

** p < .01. *** p < .001.

Table 4.

Random effects.

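| Model and parameter | Spanish estimate | Spanish SE | English estimate | English SE | AIC |
|---|---|---|---|---|---|
| Model 1 | | | | | 4,360.29 |
| Level 1 | 110.89 | 12.07 | 78.48 | 8.68 | |
| Level 2 | 299.81 | 48.77 | 287.23 | 45.55 | |
| Model 2 | | | | | 4,276.25 |
| Level 1 | 97.50 | 10.64 | 49.13 | 7.71 | |
| Level 2 intercept | 228.43 | 44.20 | 253.13 | 43.08 | |
| Level 2 slope | 0.04 | 0.03 | 0.01 | 0.04 | |
| Cor int & slope | 0.99 | < 0.01 | 0.25 | 0.80 | |
| Model 3 | | | | | 4,182.24 |
| Level 1 | 94.87 | 10.29 | 48.82 | 7.65 | |
| Level 2 intercept | 134.99 | 32.68 | 129.57 | 25.85 | |
| Level 2 slope | 0.06 | 0.03 | 0.01 | 0.04 | |
| Cor int & slope | 1.00 | < 0.01 | 0.05 | 0.60 | |
| Model 4 | | | | | 4,271.94 |
| Level 1 | 94.87 | 10.30 | 48.51 | 7.59 | |
| Level 2 intercept | 134.99 | 35.32 | 129.84 | 25.81 | |
| Level 2 slope | 0.06 | 0.03 | 0.01 | 0.04 | |
| Cor int & slope | 1.00 | < 0.01 | 0.02 | 0.53 | |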

Note. AIC = Akaike Information Criterion; Cor int & slope = correlation intercept and slope.

In Model 2, the growth model, the average Spanish BESA/ME standard score was 83 at the onset of the study for children of average age. The slope for Spanish was negative, indicating a decrease of .18 standard score points per month (i.e., a decrease of 2.16 standard score points per year). For English, the average standard score at the onset of the study for children of average age was 82. In contrast to Spanish, the English slope was positive, indicating that English standard scores were predicted to increase by about .37 standard score points per month (i.e., an increase of 4.44 standard score points per year). The difference between the \(\overline{\text{age}}_{\text{child}}\) slopes in Spanish and English was statistically significant, indicating that developmental trajectories for English and Spanish differed. Between-children differences in age had a significant influence on the English intercept, but not on the Spanish intercept, in this model: average English scores at the onset of the study were greater by .58 points for each additional month of age at the onset of the study (i.e., older children scored higher on average and younger children scored lower on average). The effect of \(\overline{\text{age}}_{\text{group}}\) on the intercept in Spanish was significantly different from the effect on the intercept in English. In addition, the interaction between \(\overline{\text{age}}_{\text{child}}\) and \(\overline{\text{age}}_{\text{group}}\) was statistically significant at −0.02 for English, suggesting a very small reduction in the within-child growth rate for English outcomes for older children relative to younger children. This interaction was not significant for Spanish. A decrease in Level 1 and Level 2 residual variances was evident, as expected, since our age variables explained both within-child and between-children variance in scores. The inclusion of age in the model of standard scores explained 12% of the variance within children in Spanish and 37% of the variance within children in English. Model 2 had an AIC of 4,276.25, and the LRT comparing it with Model 1 was statistically significant, χ²(1) = 104.04, p < .001, indicating that Model 2 fit the data better than Model 1.
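The arithmetic behind these figures can be checked with a short sketch that uses only values reported in Tables 3 and 4. The function names are ours and purely illustrative: one converts a monthly slope to an annual change, and the other computes the proportional reduction in Level 1 (within-child) variance from Model 1 to Model 2.

```python
def per_year(slope_per_month):
    """Convert a slope in points per month to points per year."""
    return slope_per_month * 12


def variance_explained(var_baseline, var_model):
    """Proportional reduction in residual variance relative to a baseline."""
    return (var_baseline - var_model) / var_baseline


# Model 2 within-child slopes from Table 3 (points per month).
spanish_yearly = per_year(-0.18)
english_yearly = per_year(0.37)

# Level 1 variances from Table 4: Model 1 (baseline) versus Model 2.
spanish_within = variance_explained(110.89, 97.50)
english_within = variance_explained(78.48, 49.13)

print(f"Spanish: {spanish_yearly:.2f} points/year, "
      f"{spanish_within:.0%} of within-child variance explained")
print(f"English: {english_yearly:.2f} points/year, "
      f"{english_within:.0%} of within-child variance explained")
```

The same proportional-reduction calculation applies to Level 2 variances when a between-children predictor is added to a model.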

Model 3 included language ability (i.e., TD vs. DLD) at the onset of the study as a Level 2 predictor. The average BESA/ME standard score for children who were TD at the onset of the study was 91 in Spanish and 93 in English. For English outcomes, the results of this model suggested that children with DLD scored on average 23 standard score points below children who were TD. In Spanish, children with DLD at the onset of the study scored 20 standard score points below children who were TD. The difference between the effect of DLD in Spanish and English was not statistically significant. Examination of the random effects in Model 3 suggests that this model explained a significant portion of the variance between children in intercepts: 41% for Spanish and 48% for English. Model 3's AIC was 4,182.62, which is lower than the AIC in Model 2 (4,276.25), and the results from the LRT between the models were statistically significant, χ²(1) = 97.63, p < .001, suggesting that a model including DLD fit the data better than a model without it.

Model 4 examined the interaction between language ability and age. The results suggested that the interaction between \(\overline{\text{age}}_{\text{child}}\) and DLD was not significant in either Spanish or English, indicating that children with and without language disorders have similar slopes. Model 4's AIC was 4,271.94, which is higher than the AIC from Model 3; the LRT was not statistically significant, χ²(1) = 0.43, p = .805, suggesting that Model 4 did not fit the data better than Model 3. Therefore, Model 3 was considered the best model in this study to explain the data.

Figure 2 displays fixed effects results from Model 3. Figure 3 displays individual predicted growth curves (i.e., fixed effects plus random effects), also from Model 3. In Figure 3, the top left quadrant displays the results in Spanish for the children with typical language skills. Of note in this graph are the growth lines showing a fanning effect: Children who have higher scores at the onset of the study tend to have positive or neutral growth lines, while children who have lower scores tend to have negative growth lines. The same fanning pattern is observed in the bottom left quadrant in Spanish for children with DLD. The quadrants on the right side show the predicted growth lines for English scores. Importantly, both panels, TD children (top) and children with DLD (bottom), show that the increase in standard scores seems to be steep between the ages of 4 and 7 years but tends to flatten beyond age 7 years.
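To make the fixed-effects predictions concrete, the sketch below evaluates the Model 3 fixed part using the rounded coefficients reported in Table 3. Two caveats apply: the coefficients are rounded to two decimals (the interaction in particular), so the outputs are approximations, and the coding of the between-children age term as the child's Time 1 age minus the sample mean is our assumption. The function and dictionary names are hypothetical.

```python
# Rounded Model 3 fixed effects from Table 3.
MODEL3 = {
    "Spanish": {"intercept": 91.27, "age_child": -0.18, "age_group": -0.17,
                "dld": -20.25, "interaction": -0.01},
    "English": {"intercept": 92.39, "age_child": 0.38, "age_group": 0.41,
                "dld": -23.45, "interaction": -0.02},
}


def predict(language, months_since_t1, dld=False, age_group=0.0):
    """Predicted standard score from the fixed effects only.

    months_since_t1: within-child age term (months since Time 1).
    age_group: assumed between-children term (months above or below the
    average Time 1 age); 0.0 means a child of average age at onset.
    """
    c = MODEL3[language]
    return (c["intercept"]
            + c["age_child"] * months_since_t1
            + c["age_group"] * age_group
            + c["interaction"] * months_since_t1 * age_group
            + c["dld"] * (1 if dld else 0))


# Child of average age at onset, followed for 24 months.
for language in ("Spanish", "English"):
    for dld in (False, True):
        start = predict(language, 0, dld=dld)
        end = predict(language, 24, dld=dld)
        label = "DLD" if dld else "TD"
        print(f"{language} {label}: {start:.2f} -> {end:.2f}")
```

Because Model 3 has no DLD-by-age interaction, the predicted lines for the two groups are parallel within each language and differ only by the DLD coefficient.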

Figure 2.

A graph plotting the Bilingual English Spanish Assessment score and the Bilingual English Spanish Assessment Middle Extension score on the y axis and the age in years and months on the x axis. The y axis ranges from 55 to 100 in increments of 5. The x axis ranges from 4 years 6 months to 8 years 6 months in increments of 6 months. 2 solid blue and orange straight lines and 2 dashed blue and orange straight lines are plotted. The solid blue line linearly drops from a score of about 96 when the age is 4 years and 0 months to a score of 86 when the age is 8 years and 6 months. The solid orange line rises linearly from a value of about 84 when the age is 4 years and 0 months to a value of about 104 when the age is 8 years and 6 months. The dashed blue line drops linearly from a score of about 75 when the age is 4 years and 0 months to a score of about 65 when the age is 8 years and 6 months. The dashed orange line linearly rises from a score of about 63 when the age is 4 years and 0 months to a score of about 82 when the age is 8 years and 6 months. The legend for the graph is as follows. Solid blue line: Spanish Typically Developing Children. Solid orange line: English Typically Developing Children. Dashed blue line: Spanish children with Developmental Language Disorder. Dashed orange line: English children with Developmental Language Disorder. All values are estimated.

Predicted growth in BESA/ME standard scores by language over time. BESA-ME = Bilingual English–Spanish Assessment–Middle Extension; BESA/ME = BESA and BESA-ME; TD = typically developing children; DLD = children with developmental language disorder.

Figure 3.

4 graphs plotting the predicted Bilingual English Spanish Assessment score and Bilingual English Spanish Assessment Middle Extension score on the y axis and the age in months on the x axis for English and Spanish Typically Developing children and children with Developmental Language Disorder. In each graph the y axis ranges from 40 to 120 in increments of 20 and the x axis ranges from 40 to 120 in increments of 20. The first graph is for Spanish Typically Developing children. The score ranges from 95 to 110 when the age is about 45 and the score ranges from 45 to 110 when the age is 100. The second graph is for Spanish children with Developmental Language Disorder. The score ranges from about 70 to 80 when the age is about 45 and the score ranges from about 50 to 90 when the age is about 100. The third graph is for English typically developing children. The score ranges from about 75 to 105 when the age is 50 and the score ranges from about 65 to 110 when the age is about 110. The fourth graph is for English children with Developmental Language Disorder. The score ranges from about 55 to 70 when the age is about 50 and the score ranges from about 60 to 75 when the age is about 105. All values are estimated.

Individual predicted growth curves of BESA/ME standard scores by language over time. BESA-ME = Bilingual English–Spanish Assessment–Middle Extension; BESA/ME = BESA and BESA-ME; TD = typically developing children; DLD = children with developmental language disorder.

Discussion

The purpose of this study was to examine longitudinal changes in English and Spanish morphosyntactic standard scores in bilingual children. To do so, we used the Morphosyntax subtest of the BESA/ME, two standardized tests normed with Spanish–English bilingual children in the United States with good diagnostic accuracy for the identification of language disorders. In general, the results of this study suggest that there are distinct patterns of Spanish and English morphosyntactic growth. While the BESA/ME standard scores in English increased, the standard scores in Spanish decreased over time. Children with DLD showed persistent language difficulties in both Spanish and English over time in comparison to their peers, but their rate and trajectory of growth were similar to TD bilingual children. We discuss these results in detail below in each language separately first to examine language-specific patterns of growth and then both languages together to consider the changes in language dominance.

Growth of English Morphosyntactic Skills

The results of our modeling of growth in English suggest that at age 4 years, children with DLD were estimated to have, on average, a standard score of 61 in the BESA/ME while TD children had an average standard score of 84. This difference in performance between the two groups was maintained over time. In terms of growth, children both with and without DLD increased by about 4.4 standard score points on average per year. These results of English growth support findings from previous studies on English-speaking monolingual children showing that children with DLD have a persistently lower level of attainment but with similar rates of growth and growth trajectories in comparison to children with typical development (Conti-Ramsden et al., 2012; Norbury et al., 2017; Rice, 2013; Rice et al., 2006, 2010). These results are also in agreement with the results of Castilla-Earls et al. (2019), showing that bilingual children at risk of DLD show growth rates and trajectories in English similar to those of children not at risk.

In general, these findings are in agreement with current characterizations of DLD that include both strengths and weaknesses (see Rice, 2012, 2013). The bilingual children with DLD in this study followed the same track at the same rate as bilingual children without DLD, which is considered a strength in terms of learning mechanisms. Their weakness is the continued lower performance in comparison to the normative group in their English productive morphosyntax. Although learning of English morphosyntax is taking place as evident in their change of standard scores over time, their learning is, in general, not enough to catch up with their TD peers.

Notably, the findings in this study do not seem to support the idea that bilingual children with DLD learn English at a slower pace than children with typical development, as evidenced by the lack of interaction between age and DLD in Model 4 (i.e., there were no differences in the growth rates in standard scores between children with and without DLD). This observation is important for assessment approaches that attempt to differentiate typical from atypical development in bilingual children using growth or learning rates in English standardized scores. However, the results of this study suggest that continued lagged performance in English standard scores could serve as a potential indicator of DLD, although Spanish performance also needs to be taken into account.

An interesting observation from Figure 3 is that the individual predicted growth curves suggest that morphosyntactic growth in English seems to be steeper before the age of 7 years but more stable thereafter, particularly for TD children. First, it is important to note that we only modeled linear growth since we were limited to three measurement time points. However, the individual growth curves capturing the fixed and random effects together suggest that a quadratic growth trajectory may be more appropriate to model the growth of morphosyntactic standard scores in bilingual children, particularly for TD children. This potential quadratic growth trajectory might be a better fit to explain the growth in standard scores since children who are learning English will eventually reach a stable level of performance, and a continued increase in standard scores would not be expected. This potential quadratic effect observed in the individual growth curves seems to be aligned with the findings that performance in standardized testing in English monolingual children is more stable after a certain time frame (Norbury et al., 2017). Although this time frame is suggested to be around 6 years of age in monolingual children, more time (i.e., more exposure to English) would be required in bilingual children to reach this stability in performance.

Growth of Spanish Morphosyntactic Skills

The results of this study show that the morphosyntactic Spanish standard scores of the children in this study were declining over time relative to normative expectations. The fixed effects suggested that there was a decline in their Spanish standard scores at a rate of about 2.2 standard score points per year, with no difference in growth rate between children with and without DLD. In addition, there was a consistent difference between children with and without DLD of about 20 standard score points on the BESA/ME. Similar to what we found in English, these results are in agreement with studies with monolingual children showing that children with DLD show persistently lower scores but similar growth patterns compared to children without DLD (Conti-Ramsden et al., 2012; Norbury et al., 2017; Rice, 2013; Rice et al., 2006, 2010). Although, in this case, a decreasing overall pattern was observed, children with DLD showed the same growth pattern. However, the results of this study, in principle, differed from those of Castilla-Earls et al. (2019) in that we did not find different growth trajectories for children with and without DLD. Recall that Castilla-Earls et al. examined the growth of the PGU and found a decline for the children with typical development, but an initial increase followed by a plateau in children at risk of language disorders. A potential explanation for this difference in results may be the different measures used to evaluate language growth in the two studies. In the work of Castilla-Earls et al. (2019), a broad index of grammaticality was used, which is based on a child's production during a story elicitation task. In contrast, in this study, we used the BESA/ME, which specifically targets morphosyntactic structures that are problematic for children with DLD and more prone to variation in bilingual children (e.g., Bedore & Leonard, 2001, 2005; Goebel-Mahrle & Shin, 2020; Shin, 2018). It is likely that these measures, which differed by elicitation method (language samples vs. elicitation task and sentence repetition), errors included (all errors vs. clinical markers), and type of measure (percentage vs. standard score), yield different results in terms of language growth.

The finding that Spanish standard scores, on average, were declining over time does not directly imply that children were losing their morphosyntactic skills. The decline in standard scores instead indicates that the bilingual children in this study might have been experiencing a plateau in their Spanish morphosyntactic performance, which might impact their ranking in comparison to the bilingual children in the normative group. For example, a child could obtain the same raw scores at two time points, indicating similar performance on a task over time, but the standard scores would decrease because of different age expectations. Importantly, we did not directly examine, for example, whether a particular grammatical structure that was used before correctly was no longer produced correctly at the end of the study. Studies examining changes in specific grammatical structures are needed to ascertain whether bilingual children maintain, increase, or lose some aspects of grammar. Current efforts in our research laboratory are addressing this important gap in our knowledge regarding morphosyntactic development in bilingual children with and without language disorders.

Crucially, the decline in Spanish morphosyntactic standard scores does not signify that bilingual children cannot successfully learn two languages; rather, it is likely a manifestation of the sociolinguistic characteristics influencing bilingual children in a context with limited support for the development of Spanish (Lutz, 2008; Montrul & Potowski, 2007; Shin, 2018). Another potential explanation, suggested by Bedore et al. (2016), is that these children may suppress or neglect their Spanish while they turn their attention to learning English because it is the language of school and of peers. It is possible that a different growth pattern emerges during the adolescent years as the social perception of Spanish shifts with schooling and peer groups.

It is also crucial to highlight the results of the random effects, which suggest that children who started with higher Spanish standard scores were more likely to show stable performance than children who had lower Spanish scores at the onset of the study. This finding was also evident in the group of children with DLD: Children with higher scores within their group also tended to show more stable performance in Spanish, while children with lower scores showed a decline in their performance relative to the standardization sample. Although we characterize the average effect as a general decline in standard scores of about 2.2 standard score points per year, we observed that children with scores above 90 for the TD group and 70 for the DLD group tended to show a slight increase in standard scores or the same standard scores 2 years later, while the rest of the children showed a decrease that tended to be steeper for those with the lowest scores at the onset. We interpret this finding as children with stronger morphosyntactic skills at the onset of the study having a higher probability of maintaining their ranking within their group. Children both with and without DLD can show stability in their Spanish performance, and they are more likely to do so when they initially perform higher in comparison to the normative group.

Changes in Language Dominance for Children With and Without DLD

At age 4 years, children, on average, had higher morphosyntax standard scores in Spanish than in English. By age 6 years, these children tended to have similar scores in Spanish and English. However, by age 8 years, English standard scores surpassed Spanish standard scores by about 15 standard score points. These results are in agreement with a shift in language proficiency for speakers of minority languages in the United States (Castilla-Earls et al., 2019; Lutz, 2008; Montrul, 2010). Once these children start school, they tend to use Spanish less frequently than English, other than in limited contexts (i.e., home-only or bilingual classrooms in the early elementary years); for this reason, Spanish morphosyntactic development tends to lag behind development in the majority language, English. These results also support the findings from Castilla-Earls et al. (2019), who found a similar shift in language proficiency from Spanish to English using grammaticality measures from story retells during the early elementary years.

Importantly, this pattern of shift in language dominance was evident for children with and without DLD. Although the children with DLD have significantly lower scores in both languages at the onset of the study, English and Spanish morphosyntactic skills tended to grow and decline, respectively, at similar rates compared to their TD peers. These findings are in agreement with current views that bilingualism does not pose an additional risk for children with language disorder (e.g., Paradis, 2005, 2016).

The findings from this study seem to align with theoretical frameworks that define DLD as a variation on a continuum of language development abilities rather than a distinctly different class of development. In this study, children with DLD followed the same trajectory and rate of growth as children without DLD but persistently differed in their overall performance over time. In our view, the finding that the differences between groups are limited to only differences in performance, and not in growth trajectory or growth rate, is more consistent with the notion of DLD as a variation on a continuum of development rather than a distinctly different class of development (e.g., Dollaghan, 2004, 2011; Lancaster & Camarata, 2019).

A final point of discussion specific to children with DLD is that the majority of these children were receiving speech-language services at the onset of the study. It is important to note that we did not collect information regarding the language used during therapy, which might have impacted the outcomes in each language over time. That is, it is possible that more gains were seen in the language supported by therapy in school, as a result of both therapy directly and the implications of the value assigned to the language(s) in the child's context (Durán et al., 2016; Harvey et al., 2018). It is likely that English was the language of therapy for most children in the study, which might have offered support for English growth.

Contextual Considerations

Because of the impact that the COVID-19 pandemic had on in-person testing protocols, we modified our original design to conduct online testing at Time 2 for some children (those not yet tested in person when the COVID-19 pandemic affected in-person testing) and at Time 3 for all children. The decision to move to online testing at Time 2 was an adaptive response to the circumstances, but the decision to continue with online testing for all children at Time 3 was taken purposefully to ensure that all children in our sample had in-person data at the onset of the study and online data at the end of the study. If the testing approach (online vs. in person) had negatively affected performance on the BESA/ME, we would have expected to observe that same effect in both languages. In addition, the administration of the BESA/ME is most likely to be unaffected by the type of administration since this is a productive language task, and most studies suggest that assessment using telepractice approaches yields similar results to in-person assessment approaches (Ciccia et al., 2011; Manning et al., 2020; Waite et al., 2010).

However, a point worth considering is that the type of language support available to children during the pandemic might have affected the outcomes of this study. This study was not designed to test the potential effect of a pandemic, so it is crucial that we interpret the results considering that a pandemic was taking place during the second part of data collection. It is possible that language experiences at home changed while the restrictions on in-person classroom instruction were in place in 2020. This interruption was relatively short in Texas, where the vast majority of children were back to in-person instruction by August 2020. However, it is possible that these bilingual children experienced some changes in the language practices at home that might explain some of the changes observed in this study, as suggested by a reviewer.

It is important to highlight that the results of this study might not generalize to all bilingual contexts within the United States. To interpret the results of this study, we must take into consideration the sociolinguistic context of our sample. This study took place in Houston, a city in which about 40% of the population speaks Spanish and where bilingual education is widely available. Since the vast majority of educational settings in the United States are English-only instruction programs, it is crucial to determine the Spanish growth trajectories for children who have no access to bilingual education programs and who are in a context where fewer people speak Spanish.

It is also possible that the results of this study point to a difference between the sample in this study and the standardization sample. In principle, children in this study closely match the bilingual children in the standardization sample in terms of demographic information. However, it is important to highlight that standardization samples use a cross-sectional approach. Therefore, the performance over time of the bilingual children in this study is compared to different bilingual children at different ages at different time points. It is possible that the bilingual children in the standardization sample at later ages had higher Spanish skills than the children in this study, because they had to qualify as Spanish speakers for the standardization sample at that age. We think that this is, in fact, a possibility due to the heterogeneity in bilinguals' relative proficiency between their languages and the different trajectories we see in bilingual profiles over time, but we cannot determine that at this point. We could be comparing the bilingual children in this study, who are stronger in Spanish morphosyntax at the onset of the study but tend to decrease in scores and shift in proficiency over time, to bilingual children of different ages who are still proficient enough in Spanish (even at older ages) to qualify for the standardization sample. This is one of the many challenges facing bilingual standardization samples: there is variation both within children and between children that is difficult to account for when determining appropriate reference groups.
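
The reference-group concern can be made concrete with a small arithmetic sketch. It assumes the conventional standard-score scale (mean of 100, standard deviation of 15) and uses invented raw scores and norm values; none of these numbers come from the BESA, the BESA-ME, or the data in this study. It only illustrates how a child whose raw Spanish score improves could still lose standard-score points if the cross-sectional norm group at the older age is, on average, more proficient in Spanish.

```python
def standard_score(raw, norm_mean, norm_sd, scale_mean=100.0, scale_sd=15.0):
    """Convert a raw score to a standard score against an age-band norm group."""
    z = (raw - norm_mean) / norm_sd
    return scale_mean + scale_sd * z


# Hypothetical norm groups: different children are sampled at each age band.
norms = {6: {"mean": 20.0, "sd": 5.0}, 8: {"mean": 30.0, "sd": 5.0}}

# Hypothetical child followed longitudinally: the raw score grows from 20 to 25.
child_raw = {6: 20.0, 8: 25.0}

scores = {
    age: standard_score(child_raw[age], norms[age]["mean"], norms[age]["sd"])
    for age in child_raw
}
print(scores)  # {6: 100.0, 8: 85.0}
```

In this invented case, the child gains 5 raw-score points between ages 6 and 8 but drops from 100 to 85 in standard score, because the older norm group sits 10 raw-score points higher.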

Clinical Implications

The findings from this study provide important insights into the growth trajectories of morphosyntactic skills of bilingual children with and without DLD growing up in the United States, and these insights should be considered when assessing bilingual children. We observed that bilingual children start with higher morphosyntax standard scores in Spanish than in English, but, over time, there seems to be a shift in language dominance from Spanish to English. It is important for clinicians to evaluate skills in both languages, as bilingual children vary widely not only in their initial skills in each language but also in when this shift occurs. It does appear, based on the results of this study, that there is a period of rapid increase in English morphosyntax skills before age 7 years, after which scores tend to stabilize. Furthermore, although Spanish morphosyntax scores tend to decline on average over time, children who start with higher scores in Spanish morphosyntax tend to show stability in their Spanish standard scores. Important for clinical practice, however, is that these patterns of growth in each language are similar for children with DLD and their TD peers.
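
One way to picture the English pattern described here (rapid growth before age 7 years, then stabilization) is a two-piece linear trajectory joined at age 7. The sketch below is purely illustrative: the function name, the knot placement as a parameter, and all slope and intercept values are hypothetical, and it is not the growth curve model fitted in this study.

```python
def piecewise_score(age, score_at_knot, slope_before, slope_after, knot=7.0):
    """Expected score under a two-piece linear trajectory joined at `knot`.

    `score_at_knot` is the expected score at the knot age; `slope_before`
    and `slope_after` are points gained per year on each side of the knot.
    """
    years_before = min(age, knot) - knot  # zero or negative
    years_after = max(age - knot, 0.0)    # zero or positive
    return score_at_knot + slope_before * years_before + slope_after * years_after


# Hypothetical values: steep growth before age 7, near-flat growth afterward.
for age in (5, 6, 7, 8, 9):
    print(age, piecewise_score(age, score_at_knot=95.0, slope_before=8.0, slope_after=1.0))
```

With these invented values, the expected score rises by 8 points per year up to age 7 and by 1 point per year afterward. Two children with different starting levels but the same slopes would follow parallel lines, which is the kind of pattern reported here for children with and without DLD.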

Limitations

There are limitations of this study that are important to acknowledge. First, because children were allowed to enter the study at different ages, there are some age bands with a limited number of children. Our children with DLD were, on average, younger than our children with typical language skills at the onset of the study. Second, we used the experimental version of the BESA-ME. Results using the final published version, expected to be released soon, might differ from the results presented here. Third, we were unable to examine differences in specific grammatical structures or between the cloze task and the sentence repetition task. We used both the BESA and the experimental version of the BESA-ME. For the cloze and sentence repetition sections, the BESA provides scaled scores, but the experimental version of the BESA-ME provides standardized scores. Comparison of performance on these tasks will be an interesting area of future research, as the task presentation (cloze or repetition) offers opportunities to produce different types of morphosyntactic structures and taps into different linguistic and nonlinguistic skills. In addition, although the BESA and the BESA-ME are very similar tests, they do not test the same grammatical structures. For example, articles are included in the BESA but not the BESA-ME. Therefore, longitudinal examination of specific grammatical structures is not possible when using the BESA and the BESA-ME together.

Conclusions

In summary, this study provides initial insight into the developmental trajectories of Spanish and English morphosyntactic skills of bilingual children living in the United States. In English, despite significant variability of initial English morphosyntactic skills, children tended to increase in their English language skills. This increase was steeper before age 7 years, suggesting a rapid period of development of English language skills in early school years, and stabilized thereafter. In Spanish, children tended to decline in morphosyntax skills over time. Importantly, there was no significant difference in the rate of growth/decline of language skills between children with and without DLD, although, as expected, children with DLD tended to have significantly lower scores than their TD peers.

Data Availability Statement

The data set generated during and/or analyzed during this study is available from the corresponding author on reasonable request.

Supplementary Material

Supplemental Material S1. Correlations between BESA/ME in Spanish and English over time.

Acknowledgments

Research reported in this publication was supported by the National Institute on Deafness and Other Communication Disorders under Award Number K23DC015835 granted to Anny Castilla-Earls. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. We would like to thank the families and children who participated in this study and the speech-language pathologists who helped us with recruiting. In addition, we would like to thank Elizabeth Peña for allowing us to use the examination version of the BESA-ME for research purposes.

Funding Statement

Research reported in this publication was supported by the National Institute on Deafness and Other Communication Disorders under Award Number K23DC015835 granted to Anny Castilla-Earls. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

References

  1. Archibald, L. M. D., & Joanisse, M. F. (2009). On the sensitivity and specificity of nonword repetition and sentence recall to language and memory impairments in children. Journal of Speech, Language, and Hearing Research, 52(4), 899–914. 10.1044/1092-4388(2009/08-0099)
  2. Armon-Lotem, S., & Meir, N. (2016). Diagnostic accuracy of repetition tasks for the identification of specific language impairment (SLI) in bilingual children: Evidence from Russian and Hebrew. International Journal of Language & Communication Disorders, 51(6), 715–731. 10.1111/1460-6984.12242
  3. Artiles, A. J., Harry, B., Reschly, D. J., & Chinn, P. C. (2002). Over-identification of students of color in special education: A critical overview. Multicultural Perspectives, 4(1), 3–10. 10.1207/s15327892mcp0401_2
  4. Bedore, L. M., & Leonard, L. B. (2001). Grammatical morphology deficits in Spanish-speaking children with specific language impairment. Journal of Speech, Language, and Hearing Research, 44(4), 905–924. 10.1044/1092-4388(2001/072)
  5. Bedore, L. M., & Leonard, L. B. (2005). Verb inflections and noun phrase morphology in the spontaneous speech of Spanish-speaking children with specific language impairment. Applied Psycholinguistics, 26(2), 195–225. 10.1017/s0142716405050149
  6. Bedore, L. M., Peña, E. D., Griffin, Z. M., & Hixon, J. G. (2016). Effects of age of English exposure, current input/output, and grade on bilingual language performance. Journal of Child Language, 43(3), 687–706. 10.1017/s0305000915000811
  7. Bishop, D. V. M., Snowling, M. J., Thompson, P. A., Greenhalgh, T., & CATALISE consortium. (2016). CATALISE: A multinational and multidisciplinary Delphi consensus study. Identifying language impairments in children. PLOS ONE, 11(7), Article e0158753. 10.1371/journal.pone.0158753
  8. Bishop, D. V. M., Snowling, M. J., Thompson, P. A., Greenhalgh, T., & CATALISE-2 consortium. (2017). Phase 2 of CATALISE: A multinational and multidisciplinary Delphi consensus study of problems with language development: Terminology. The Journal of Child Psychology and Psychiatry, 58(10), 1068–1080. 10.1111/jcpp.12721
  9. Castilla-Earls, A., Bedore, L., Rojas, R., Fabiano-Smith, L., Pruitt-Lord, S., Restrepo, M. A., & Peña, E. (2020). Beyond scores: Using converging evidence to determine speech and language services eligibility for dual language learners. American Journal of Speech-Language Pathology, 29(3), 1116–1132. 10.1044/2020_AJSLP-19-00179
  10. Castilla-Earls, A., Francis, D., Iglesias, A., & Davidson, K. (2019). The impact of the Spanish-to-English proficiency shift on the grammaticality of English learners. Journal of Speech, Language, and Hearing Research, 62(6), 1739–1754. 10.1044/2018_JSLHR-L-18-0324
  11. Ciccia, A. H., Whitford, B., Krumm, M., & McNeal, K. (2011). Improving the access of young urban children to speech, language and hearing screening via telehealth. Journal of Telemedicine and Telecare, 17(5), 240–244. 10.1258/jtt.2011.100810
  12. Conti-Ramsden, G., St Clair, M. C., Pickles, A., & Durkin, K. (2012). Developmental trajectories of verbal and nonverbal skills in individuals with a history of specific language impairment: From childhood to adolescence. Journal of Speech, Language, and Hearing Research, 55(6), 1716–1735. 10.1044/1092-4388(2012/10-0182)
  13. Dollaghan, C. A. (2004). Taxometric analyses of specific language impairment in 3- and 4-year-old children. Journal of Speech, Language, and Hearing Research, 47(2), 464–475. 10.1044/1092-4388(2004/037)
  14. Dollaghan, C. A. (2011). Taxometric analyses of specific language impairment in 6-year-old children. Journal of Speech, Language, and Hearing Research, 54(5), 1361–1371. 10.1044/1092-4388(2011/10-0187)
  15. Durán, L. K., Hartzheim, D., Lund, E. M., Simonsmeier, V., & Kohlmeier, T. L. (2016). Bilingual and home language interventions with young dual language learners: A research synthesis. Language, Speech, and Hearing Services in Schools, 47(4), 347–371. 10.1044/2016_LSHSS-15-0030
  16. Ebert, K. D., Kohnert, K., Pham, G., Rentmeester Disher, J., & Payesteh, B. (2014). Three treatments for bilingual children with primary language impairment: Examining cross-linguistic and cross-domain effects. Journal of Speech, Language, and Hearing Research, 57(1), 172–186. 10.1044/1092-4388(2013/12-0388)
  17. Fulcher-Rood, K., Castilla-Earls, A., & Higginbotham, J. (2019). Diagnostic decisions in child language assessment: Findings from a case review assessment task. Language, Speech, and Hearing Services in Schools, 50(3), 385–398. 10.1044/2019_LSHSS-18-0044
  18. Fulcher-Rood, K., Castilla-Earls, A. P., & Higginbotham, J. (2018). School-based speech-language pathologists' perspectives on diagnostic decision making. American Journal of Speech-Language Pathology, 27(2), 796–812. 10.1044/2018_AJSLP-16-0121
  19. Goebel-Mahrle, T., & Shin, N. L. (2020). A corpus study of child heritage speakers' Spanish gender agreement. International Journal of Bilingualism, 24(5–6), 1088–1104. 10.1177/1367006920935510
  20. Goffman, L., & Leonard, J. (2000). Growth of language skills in preschool children with specific language impairment: Implications for assessment and intervention. American Journal of Speech-Language Pathology, 9(2), 151–161. 10.1044/1058-0360.0902.151
  21. Gusewski, S., & Rojas, R. (2017). Tense marking in the English narrative retells of dual language preschoolers. Language, Speech, and Hearing Services in Schools, 48(3), 183–196. 10.1044/2017_LSHSS-16-0093
  22. Harvey, H., Allaway, H., & Jones, S. (2018). The effectiveness of therapies for dual language children with developmental language disorder: A systematic review of interventional studies. International Journal of Bilingual Education and Bilingualism, 24(7), 1043–1064. 10.1080/13670050.2018.1536112
  23. Hernandez, M., Ronderos, J., & Castilla-Earls, A. (2023). Diagnostic accuracy of grammaticality and utterance length in bilingual children. Unpublished manuscript.
  24. Kaufman, A. S., & Kaufman, N. L. (2004). Kaufman Brief Intelligence Test–Second Edition. Pearson Assessments.
  25. Kohnert, K. (2010). Bilingual children with primary language impairment: Issues, evidence and implications for clinical actions. Journal of Communication Disorders, 43(6), 456–473. 10.1016/j.jcomdis.2010.02.002
  26. Lancaster, H. S., & Camarata, S. (2019). Reconceptualizing developmental language disorder as a spectrum disorder: Issues and evidence. International Journal of Language & Communication Disorders, 54(1), 79–94. 10.1111/1460-6984.12433
  27. Lazewnik, R., Creaghead, N. A., Smith, A. B., Prendeville, J. A., Raisor-Becker, L., & Silbert, N. (2019). Identifiers of language impairment for Spanish–English dual language learners. Language, Speech, and Hearing Services in Schools, 50(1), 126–137. 10.1044/2018_LSHSS-17-0046
  28. Leonard, L. B. (2014). Specific language impairment across languages. Child Development Perspectives, 8(1), 1–5. 10.1111/cdep.12053
  29. Lutz, A. (2008). Negotiating home language: Spanish maintenance and loss in Latino families. Latino(a) Research Review, 6, 37–64.
  30. Manning, B. L., Harpole, A., Harriott, E. M., Postolowicz, K., & Norton, E. S. (2020). Taking language samples home: Feasibility, reliability, and validity of child language samples conducted remotely with video chat versus in-person. Journal of Speech, Language, and Hearing Research, 63(12), 3982–3990. 10.1044/2020_JSLHR-20-00202
  31. Montrul, S. (2010). Current issues in heritage language acquisition. Annual Review of Applied Linguistics, 30, 3–23. 10.1017/S0267190510000103
  32. Montrul, S., & Potowski, K. (2007). Command of gender agreement in school-age Spanish–English bilingual children. International Journal of Bilingualism, 11(3), 301–328. 10.1177/13670069070110030301
  33. Morgan, P. L., Farkas, G., Hillemeier, M. M., Li, H., Pun, W. H., & Cook, M. (2017). Cross-cohort evidence of disparities in service receipt for speech or language impairments. Exceptional Children, 84(1), 27–41. 10.1177/0014402917718341
  34. Norbury, C. F., Vamvakas, G., Gooch, D., Baird, G., Charman, T., Simonoff, E., & Pickles, A. (2017). Language growth in children with heterogeneous language disorders: A population study. Journal of Child Psychology and Psychiatry and Allied Disciplines, 58(10), 1092–1105. 10.1111/JCPP.12793
  35. Paradis, J. (2005). Grammatical morphology in children learning English as a second language: Implications of similarities with specific language impairment. Language, Speech, and Hearing Services in Schools, 36(3), 172–187. 10.1044/0161-1461(2005/019)
  36. Paradis, J. (2016). The development of English as a second language with and without specific language impairment: Clinical implications. Journal of Speech, Language, and Hearing Research, 59(1), 171–182. 10.1044/2015_JSLHR-L-15-0008
  37. Peña, E., Gutiérrez-Clellen, V. F., Iglesias, A., & Goldstein, B. A. (2018). Bilingual English–Spanish Assessment (BESA). Brookes.
  38. Peña, E. D., Bedore, L. M., Gutiérrez-Clellen, V. F., Iglesias, A., & Goldstein, B. A. (2016). Bilingual English-Spanish Assessment–Middle Extension Field Test Version (BESA-ME). Unpublished manuscript.
  39. Peña, E. D., Bedore, L. M., Lugo-Neris, M. J., & Albudoor, N. (2020). Identifying developmental language disorder in school age bilinguals: Semantics, grammar, and narratives. Language Assessment Quarterly, 17(5), 541–558. 10.1080/15434303.2020.1827258
  40. Peña, E. D., Gillam, R. B., Iglesias, A., Bedore, L. M., & Bohman, T. M. (2011). Risk for poor performance on a language screening measure for bilingual preschoolers and kindergarteners. American Journal of Speech-Language Pathology, 20(4), 302–314. 10.1044/1058-0360(2011/10-0020)
  41. Restrepo, M. A. (1998). Identifiers of predominantly Spanish-speaking children with language impairment. Journal of Speech, Language, and Hearing Research, 41(6), 1398–1411. 10.1044/jslhr.4106.1398
  42. Rice, M. L. (2012). Toward epigenetic and gene regulation models of specific language impairment: Looking for links among growth, genes, and impairments. Journal of Neurodevelopmental Disorders, 4(1), Article 27. 10.1186/1866-1955-4-27
  43. Rice, M. L. (2013). Language growth and genetics of specific language impairment. International Journal of Speech-Language Pathology, 15(3), 223–233. 10.3109/17549507.2013.783113
  44. Rice, M. L., Redmond, S. M., & Hoffman, L. (2006). Mean length of utterance in children with specific language impairment and in younger control children shows concurrent validity and stable and parallel growth trajectories. Journal of Speech, Language, and Hearing Research, 49(4), 793–808. 10.1044/1092-4388(2006/056)
  45. Rice, M. L., Smolik, F., Perpich, D., Thompson, T., Rytting, N., & Blossom, M. (2010). Mean length of utterance levels in 6-month intervals for children 3 to 9 years with and without language impairments. Journal of Speech, Language, and Hearing Research, 53(2), 333–349. 10.1044/1092-4388(2009/08-0183)
  46. Rice, M. L., Wexler, K., & Hershberger, S. (1998). Tense over time: The longitudinal course of tense acquisition in children with specific language impairment. Journal of Speech, Language, and Hearing Research, 41(6), 1412–1431. 10.1044/jslhr.4106.1412
  47. Samson, J. F., & Lesaux, N. K. (2009). Language-minority learners in special education: Rates and predictors of identification for services. Journal of Learning Disabilities, 42(2), 148–162. 10.1177/0022219408326221
  48. Semel, E., Wiig, E., & Secord, W. (2003). Clinical Evaluation of Language Fundamentals–Fourth Edition (CELF-4). The Psychological Corporation/A Harcourt Assessment Company.
  49. Shin, N. (2018). Child heritage speakers' Spanish morphosyntax: Rate of acquisition and crosslinguistic influence. In K. Potowski (Ed.), Handbook of Spanish as a heritage language (pp. 235–253). Routledge. 10.4324/9781315735139-16
  50. StataCorp. (2019). Stata Statistical Software: Release 16.
  51. U.S. Census Bureau. (2021). American Community Survey. S1601 Language Spoken at Home. https://data.census.gov/table?t=Language+Spoken+at+Home&g=310XX00US14460&tid=ACSST1Y2021.S1601
  52. Waite, M. C., Theodoros, D. G., Russell, T. G., & Cahill, L. M. (2010). Internet-based telehealth assessment of language using the CELF-4. Language, Speech, and Hearing Services in Schools, 41(4), 445–458. 10.1044/0161-1461(2009/08-0131)
  53. Wiig, E., Semel, E., & Secord, W. (2013). Clinical Evaluation of Language Fundamentals–Fifth Edition (CELF-5). NCS Pearson, Inc.
  54. Zoom Video Communications. (2020). Zoom [Software].


