Journal of Speech, Language, and Hearing Research : JSLHR
2025 Jun 3;68(7):3204–3225. doi: 10.1044/2025_JSLHR-24-00587

Grammaticality of Tag Questions as a Longitudinal Morphosyntactic Marker of Children With Specific Language Impairment Compared to Peers Ages 5–18 Years

Mabel L Rice a, Kathleen Kelsey Earnest b, Lesa Hoffman c
PMCID: PMC12263191  PMID: 40460410

Abstract

Purpose:

Previous studies documenting longitudinal linguistic outcomes of children with specific language impairment (SLI) compared to their age peers focus on the property of obligatory finiteness marking in sentences across the age span of 5–18 years. This study evaluates tag questions as syntactically complex sentences that extend the demands of finiteness marking across clauses, requiring coordination of negation in the base sentence and tag question.

Method:

Five hundred eleven children (240 unaffected, 271 SLI affected) between 5 and 18 years of age participated, following a rolling recruitment longitudinal design, which included a total of 4,718 observations. The linguistic task was designed to evaluate four variations of tag questions, two of which targeted polarity requirements for tags and two of which were nonpolarity differences in the tag. Growth modeling methods were used to test hypotheses of group differences (SLI vs. unaffected) in understanding of tag questions over 5–18 years. Covariates were child nonverbal IQ, mother's education, and child sex.

Results:

Outcomes for children with SLI varied by age and item type. They performed below unaffected children across all tag outcomes at 10 years, scored correctly on nonpolarity items at 18 years (ceiling levels), and continued to lag unaffected children at 18 years on polarity items. Significant SLI effects on the outcomes were not moderated by the covariates.

Conclusions:

By 18 years, the SLI group performed the nonpolarity items correctly but continued to struggle with polarity items. Thus, polarity is of interest as a possible screener for SLI throughout the school years.


Children's early acquisition of morphosyntax reveals differences relative to adult-level grammatical representations; some components are relatively error-free, whereas others are omitted or used with overt errors. Finiteness marking in English-speaking children is the focus of this study, that is, the use of morphemes, such as third-person singular –s, regular past tense –ed, and DO and BE auxiliaries or main verbs, to mark tense and agreement in sentences. Finiteness is acquired relatively late by English-speaking children (as compared to children learning other languages), who are likely to omit finiteness marking sometimes in obligatory contexts, as if such forms were optional instead of obligatory (Wexler, 1994). This delay in finiteness marking is even more evident in children with specific language impairment (SLI), that is, children whose linguistic development is delayed although their hearing, nonverbal cognitive abilities, and general health are within normative age expectations. The condition of SLI can be considered a subset of the broader diagnostic category, developmental language disorder (https://www.nidcd.nih.gov/health/developmental-language-disorder). Such morphosyntactic delays in children with and without SLI can enhance our understanding of the patterns of language easily acquired by most children, as well as the morphosyntactic structures more difficult for children with SLI. Such late appearing components of grammar can be used as flags to identify affected children who might otherwise be undetected and overlooked for clinical services (Dale et al., 2018; Redmond, 2020; Rice et al., 2009, 2023; Rice & Wexler, 2001). This study is motivated by the need for better understanding of how grammar deficits in children with SLI can persist over time, as well as possible applications for identification of school-age children with SLI.

Our understanding of methods for measuring English morphosyntactic delays is strongest in young children in the preschool/early elementary school years (3–9 years). Assessments of English finiteness have focused on simple declarative sentences, such as “He (is) running away,” or “Now the bear want(s) a drink.” The parentheses indicate finiteness-marking sites/forms in the example sentences. These sentence forms are predominant in young children's spontaneous language samples early in language acquisition, with characteristic omissions of the forms, as indicated in parentheses in the example sentences. Spontaneous language samples are restricted by children's sometimes limited use of target sentence structures. By the late 1990s, experimental grammaticality judgment (GJ) tasks appeared, allowing for control of the morphosyntactic structures and frequency of examples of target forms in sentences designed to document age-related differences from the adult grammar in children's linguistic representations. A well-replicated outcome was that omission errors were accepted by children with SLI long after unaffected children had resolved this early stage of grammar (Rice et al., 1995, 1999; Rice & Wexler, 1996a, 1996b). To begin to map the knowledge of older children, a subsequent study (Rice et al., 2009) followed three groups of children (20 in the SLI group, 20 age controls, and 18 language-matched controls), from ages 6 to 15 years, to investigate their growth in understanding finiteness marking in single-clause questions requiring either the movement of finite forms, as in “(Is) the bug happy?” or insertion of auxiliary DO in wh-questions, such as “(Does) the bug like to eat?” The experimental task required children to make GJs for simple affirmative question sentences in which the target finiteness-marking forms, as shown in parentheses in the examples above, were present or omitted. 
The main findings were as follows: (a) The accuracy level for the affected group was persistently below that of the unaffected group throughout the age span. (b) The growth trajectories of the two groups did not differ. (c) The control children achieved near- or at-adult-level asymptote levels of performance throughout the age range. (d) The SLI group did not “catch up” to their age peers on these simple GJ tasks and instead maintained a generally flat level of performance throughout.

A recent study (Rice et al., 2023) extended our understanding of children's finiteness knowledge in longitudinal data from 5 to 18 years (in which children were assessed twice per year in the 5–8 years age range and annually for children ages 9–18 years). This study compared children with SLI and a control group of children without language impairments, in larger group samples (213 unaffected and 270 affected children), on a complex sentence GJ (GJ Complex) task. The base sentence for the GJ Complex task was a sentence composed of two underlying clauses: “When did the man say he painted a door?” Note that we can identify two underlying clauses because there are two locations requiring finiteness marking, as shown by “did” in the upper clause and “painted” in the lower clause. Of interest here, three sets of ungrammatical sentences were contrasted with the set of base sentences, where omitted forms are indicated by open underlines: (a) “When __ the man say he painted a door?”; (b) “When did the man say he paint__ a door?”; (c) “When __ the man say he paint__ a door?” The findings were consistent with the outcomes of the earlier studies: (a) Levels of performance for the SLI group were consistently lower than those of the unaffected group. (b) The growth trajectories (or rate of change for the slopes) were the same for the SLI and unaffected groups. Note that for children with SLI to close the performance gap (i.e., catch up to controls), they must have a significantly stronger rate of change across age than control children. As this was not the case, the low level of performance of children with SLI persisted across age. (c) The results held across the three tests of omissions in obligatory grammatical contexts. (d) As in our earlier studies, covariates of child nonverbal IQ, mother's education, and child sex did not significantly moderate these effects.

These previous studies provide conceptual and empirical strengths to build on. First, conceptually, there are strong guidelines established for characterizing the morphosyntax of finiteness in English sentences, a property well studied and documented by linguists studying adult grammar (Quirk et al., 1985) as well as child English (Pollock, 1989) and conceptually linked to acquisition of other languages as well (Guasti, 2016). Second, a series of empirical studies replicate patterns of group differences, comparing children with SLI to their age peers, over the span of 5–18 years, for their understanding of obligatory finiteness marking in simple and complex sentences. Children in the SLI group persistently performed at lower levels on finiteness marking in GJ tasks, as compared to their age peers, across sentence structures. Statistically significant group differences are replicated across studies, suggesting possible clinical applications for identification of unidentified children with SLI during their school-age years. Third, outcomes are linguistically interpretable, providing a pathway to identify other linguistic contexts likely to yield similar group differences and potentially additional tasks suitable for clinical studies.

However, conceptual and empirical limitations exist that need to be addressed. So far, GJ studies focusing on finiteness marking in complex sentences have focused on presence versus absence of finiteness in licensed sites across clauses within a sentence. Yet, children must master how finiteness requirements interact with other elements of grammar. Consider the need to align finite forms across clauses, as in “The girl likes cookies, doesn't she?” This sentence requires alignment of finiteness across two different finite sites within the sentence, that is, “likes,” following the subject, and “doesn't,” preceding the second subject, as well as coordinated use of negation in one but not both clauses. Such questions are known as “tag questions,” in reference to the “tag” relationship between the main clause and the “add-on” element that turns the sentence into a question. The complexity of tag questions could extend our understanding of children's knowledge of finiteness marking and the challenges for children with SLI.

Tag Questions

Four linguistic elements are evident in tag questions: (a) subject–verb agreement, (b) word order, (c) polarity (i.e., affirmative vs. negative), and (d) sociolinguistics (i.e., interpretation of the meaning of the tag). The present study, described below, focuses on tag questions in American English, where tag questions turn a statement into a question. The focus here is on the sociolinguistic usage of turning a statement into a question, checking information that the speaker assumes to be true.

Consider these examples:

  1. (a)  The girl is looking for a dog, isn't she?

    (b)  The girl is not looking for a dog, is she?

  2. (a)  She doesn't like dogs, does she?

    (b)  She likes dogs, doesn't she?

  3. (a)  They like dogs, don't they?

    (b)  They don't like dogs, do they?

  4. (a)  The boy has two dogs, doesn't he?

    (b)  The boy doesn't have two dogs, does he?

As shown in the examples, tag questions are made using an auxiliary verb and a subject pronoun. Negative question tags are usually contracted. If the main clause contains an auxiliary verb, as in Item 1, the same verb is used in the tag question. If there is no auxiliary verb, “do/does” or “did” are inserted in the tag (as in Item 2b). Note that the finiteness marking in the main clause must be coordinated with finiteness marking in the tag question, such that negation can appear once across the two possible sites, that is, double marking is not allowed. Word order of subject/verb must be coordinated across the two clauses, such that the finiteness marking slot precedes the subject of the tag question, whereas in the base statement, the subject precedes the finiteness marking slot. Finally, polarity, that is, affirmative versus negative, must alternate, such that if the base statement has an affirmative verb, the tag verb must be negative and vice versa.
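The rules just described can be sketched as a small toy function. This is only an illustration of the auxiliary, word order, and polarity requirements, not part of the study materials; the function name `build_tag` and the contraction table `NEGATIVE_FORMS` are hypothetical, and the caller is assumed to supply the correct finite form and subject pronoun.

```python
# Toy sketch of the tag-formation rules described above (not the study's materials).
# NEGATIVE_FORMS and build_tag are illustrative assumptions.
NEGATIVE_FORMS = {
    "is": "isn't", "are": "aren't", "was": "wasn't", "were": "weren't",
    "do": "don't", "does": "doesn't", "did": "didn't",
}


def build_tag(finite_form: str, pronoun: str, base_is_negative: bool) -> str:
    """Return the tag for a base statement.

    finite_form: the auxiliary from the base clause, or the do-support form
                 ("do", "does", "did") when the base clause has no auxiliary.
    pronoun: the subject pronoun that matches the base-clause subject.
    base_is_negative: True if the base clause already carries negation.
    """
    if finite_form not in NEGATIVE_FORMS:
        raise ValueError(f"unsupported finite form: {finite_form!r}")
    # Polarity alternates: negation appears exactly once across the two clauses.
    tag_verb = finite_form if base_is_negative else NEGATIVE_FORMS[finite_form]
    # Word order inverts in the tag: finite form first, then the subject pronoun.
    return f"{tag_verb} {pronoun}?"


print(build_tag("is", "she", base_is_negative=False))   # isn't she?
print(build_tag("does", "she", base_is_negative=True))  # does she?
```

In this sketch, a Double Affirmative or Double Negative error corresponds to passing the wrong value of `base_is_negative`, whereas Bad Agreement and Be Do Form Class errors correspond to passing the wrong `finite_form`.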

Although tag questions are potentially rich sources of information about children's language acquisition, precedents in the literature are relatively sparse. An early study (McGrath & Kunze, 1973) established that tag question elicitation could reveal developmental differences in 48 typically developing children distributed across ages 5–11 years, beyond the age of primary language acquisition. From most to least difficult were the following requirements: (a) addition or deletion of negation, (b) auxiliary verb selection, (c) pronoun selection, and (d) inversion of the pronoun and the auxiliary verb. A subsequent study (Dennis et al., 1982) of 50 typically developing children, ages 6–14 years, reported that tag production improved from 6 to 8 years, but not thereafter. The various rules (i.e., pronoun, verb, polarity, and inversion) were acquired at different ages, such that the polarity rule was mastered by only half the oldest subjects, whereas the inversion rule was well established in half of the youngest group. The conclusion was that “linguistic skills involving simultaneous manipulation of various surface–structure syntactic features are acquired late in language development” (p. 1254).

Using the same experimental task as Dennis et al. (1982), called the Tags Question Task, a follow-up study (Weckerly et al., 2004) reported outcomes in atypical populations in children with SLI and children with early focal lesions, with each clinical group compared to a typically developing age group. Children were 4–16 years, divided into three age groups: 4–7, 8–11, and 12–16 years. Across the three groups (typical, SLI, and focal lesions), the data supported a “delayed” development of language behavior for the two clinical groups. Of interest here, they hypothesized that children with SLI would show a profile of poorer performance on agreement and auxiliary compared to subject and polarity requirements of tag structures. They found the opposite pattern, that is, agreement and auxiliary selection were robust in this group, although polarity was at lower levels of accuracy, through 8–16 years. This is consistent with an earlier study of preschool children's sentence repetition, with outcomes that supported the conclusion that polarity requirements of tag questions are difficult for young children to acquire (Weeks, 1992).

Limitations of Previous Studies Addressed by This Study

Overall, previous studies, with promising outcomes despite a large gap in the research literature, point to the need for longitudinal studies of tag questions, with larger samples of children with and without SLI, across the full course of childhood. This study would parallel the outcomes of longitudinal GJs of tense marking in complex questions (Rice et al., 2023) and extend the evidence base for prolonged development of tag questions across the various potential morphosyntactic errors. Tag questions present opportunities to manipulate GJs of polarity. Consider four possible errors, shown as the “b” items below, from the GJ Tag task in the present study. These items show violation of the expectation of a single marker of negation coordinated across the two clauses of a tag question (see Table 1).

Table 1.

Grammaticality Judgment Tag task structure and example items.

Item type | Double Affirmative | Double Negative | Bad Agreement | Be Do Form Class | Total
Grammatical | 5 | 5 | 5 | 5 | 20
Ungrammatical | 5 | 5 | 5 | 5 | 20
Total | 10 | 10 | 10 | 10 | 40

Example items
Item type | Double Affirmative | Double Negative | Bad Agreement | Be Do Form Class
Grammatical | She is looking for her mom, isn't she? | You are not feeling better, are you? | They miss their moms, don't they? | He has a big head, doesn't he?
Ungrammatical | She is looking for her mom, is she? | You are not feeling better, aren't you? | They miss their moms, doesn't they? | He has a big head, isn't he?

Note. Task structure is shown at the top, and example items are shown at the bottom. Underlined words indicate changes relative to the grammatical control items. The experimental task and individual items within the task are copyrighted and reprinted with permission. Rice Grammaticality Judgment Task–Tag Questions, Copyright 2002 by Mabel L. Rice (first author).

  • 5.  (a)  Grammatical: She is looking for her mom, isn't she?

      (b)  Ungrammatical: She is looking for her mom, is she?

We will refer to this error as “Double Affirmative,” that is, violating the rule that requires negation in the tag if the base sentence is affirmative.

  • 6.  (a)  Grammatical: You are not feeling better, are you?

      (b)  Ungrammatical: You are not feeling better, aren't you?

We call this error “Double Negative,” that is, violating the rule that requires affirmative in the tag if the base sentence is negative.

  • 7.  (a)  Grammatical: They miss their moms, don't they?

      (b)  Ungrammatical: They miss their moms, doesn't they?

We call this error “Bad Agreement,” in which the tag verb agreement is not aligned with the subject of the tag.

  • 8.  (a)  Grammatical: He has a big head, doesn't he?

      (b)  Ungrammatical: He has a big head, isn't he?

This error is “Be Do Form Class,” in which the rules for Be versus Do in the tag clause must follow the rules for Be/Do alignment of the base and tag sentences.

These four types of errors are of central interest for the present study. Examples 5b and 6b are errors of polarity, that is, alignment of negation in the base clause with the tag. As shown in the review above, previous studies of typical children and children with SLI, in roughly the same age groups, report lower levels of performance on tag questions for the clinical group. We expect group differences such that the clinical group will be more likely to accept tag question errors of Double Affirmative (Example 5b) or Double Negative (Example 6b) than their age peers. On the other hand, we do not expect such errors to be the result of an inability to detect poorly formed tags of different types. We predict similar group performance on Bad Agreement (Example 7b) or Be Do Form Class (Example 8b) errors. It is unknown if these error patterns will persist throughout childhood (for either group) or if the control group will resolve the errors and the SLI group will persist in the predicted errors into adolescence.

Goals of the Present Study

The primary goal of this study was to model the developmental outcomes of growth over time in SLI-affected and -unaffected children in four types of grammatical judgments to tag questions listed above: (a) Double Affirmative, (b) Double Negative, (c) Bad Agreement, and (d) Be Do Form Class. We predicted that SLI-affected children would score lower than unaffected children on all four GJ Tag outcomes and that these differences would be stable between 5 and 18 years of age. Next, we examined the extent to which covariates of child nonverbal IQ, maternal education, and child sex predicted overall performance level and growth over time in the four outcomes. We expected that SLI would remain a significant predictor of performance across all four outcomes even after adjusting for these covariates.

Following the precedent of our earlier study (Rice et al., 2023), our analyses examined the following questions for each of the four GJ Tag outcomes, providing descriptive outcomes per group and group comparisons, an examination of covariate effects on the grammatical judgment outcomes, and formal growth curve modeling of the outcomes.

  1. What is the overall baseline growth trajectory for each outcome across all children?

  2. Do SLI-affected children differ from unaffected children in their level of performance at the intercept (age of 10 years) and in their growth over time for each outcome?

  3. Do covariates of child nonverbal IQ, maternal education, and child sex significantly predict level of performance and rate of change in each outcome?

  4. Do the effects of SLI on the outcome trajectories differ by levels of the covariates?

Method

Ethics

This study was approved and carried out in accordance with the rules and regulations of the University of Kansas Institutional Review Board (Protocol 8223). Prior to study participation, parents provided written informed consent, and minor children provided verbal assent. Small reimbursements for effort were provided to participants, including toys for young children, gift cards for older children, and check payments to parents.

Participants

Participants were part of a longitudinal family-based study of children with SLI and their siblings, as well as control children and their siblings, that spanned 25 years of data collection, ending in the year 2020 (Rice et al., 2009, 2010, 2023; Rice & Hoffman, 2015). The study generated an archival database referred to here as the GJ Tag data set. The family-based design allows for pedigree-based genetic studies not reported in this article. As reported previously (Rice et al., 2023; Rice & Hoffman, 2015), SLI-affected children were recruited from speech-language pathologists in public schools (> 100 schools and attendance centers in the Midwestern United States), unaffected children were recruited from the same schools, and siblings were recruited into the study following the target SLI-affected and unaffected children. All children who met the following criteria were included: (a) monolingual native speakers of English in homes using General American English (GAE) as a dialect (Oetting, 2020; children with suspected dialectal variation were screened using the Diagnostic Evaluation of Language Variation [Seymour et al., 2003] and were not included if dialectal variation was confirmed); (b) passed a hearing screening (collected off-site) using 25 dB (30 dB in noisy environments) at 1000, 2000, and 4000 Hz; (c) no diagnosis of autism, intellectual, behavioral, or social impairments; and (d) typical intellectual functioning, defined as a score of 85 or above on an age-appropriate test of nonverbal IQ (see the Measures section) for target SLI-affected and unaffected children (who were screened at entry). For siblings or extended family members of the target children (who were enrolled regardless of their initial nonverbal IQ scores), the following criteria were used for nonverbal IQ: (a) Their score at the initial assessment (ages 5+ years) was 83 or above, or (b) their scores averaged across the first three measurement occasions (ages 5+ years) were 83 or above.

The GJ Tag data set included 511 individuals (296 males, 215 females), assessed longitudinally between 5 and 18 years of age, for a total of 4,718 observations in the analysis.1 The number of measurement occasions per person ranged from 3 to 19 (M = 9.23, SD = 3.95); children with fewer than three measurement occasions were excluded from analysis. On average, children were 7 years 7 months of age at their first GJ Tag assessment (range: 5;0–16;7 [years;months]). Children came from 256 families, which included some extended family members, with an average of two children per family participating in this study (range: 1–11). Based on omnibus language scores at study entry, 240 children were classified as unaffected (47%), and 271 were classified as SLI affected (53%). The race and ethnicity percentages were consistent with regional census data: 80.43% White, 1.37% Black, 5.09% American Indian, 11.74% multiracial, and 1.37% unknown or not reported; 7.24% reported as Hispanic. This distribution of race and ethnicity is representative of the Midwestern region from which the sample was drawn.

Procedure

In preliminary data collection, starting in 1995, we piloted items to verify local dialectal assumptions, by presenting items and asking respondents to judge them as “right” or “not so good.” The respondents were local residents, adults and children. The task was finalized in 1997 and subsequently entered data collection protocols. As described extensively in past reports on the archival longitudinal database (Rice et al., 2023; Rice & Hoffman, 2015), data collection followed an accelerated longitudinal design with rolling recruitment in which children varied continuously in age at first assessment, resulting in multiple age cohorts. Between December of 1997 and March of 2020, the GJ Tag task was administered every 6 months from 5 through 8 years of age and at 12-month intervals thereafter. To lessen demands on the families and to encourage ongoing participation, trained examiners traveled to the participants' home or school and collected data in vans customized for mobile testing. Assessments were administered by an individual examiner to an individual participant in sessions of approximately 15 min as part of a longer study protocol.

Measures

Exclusionary and Inclusionary Assessments for SLI Affectedness

Nonverbal IQ. Nonverbal IQ was measured at initial assessment. The nonverbal IQ measures included the Columbia Mental Maturity Scale (Burgemeister et al., 1972) for ages 5;0–5;11, the Wechsler Intelligence Scale for Children–Third Edition (Wechsler, 1991) for ages 6;0–15;11, and the Wechsler Adult Intelligence Scale–Third Edition (Wechsler, 1997) for those starting at 16 years (the oldest starting age in the analysis). All children in the present study met inclusion criteria for an initial nonverbal IQ score of 83 or higher.

Omnibus language. Children were classified as SLI affected or unaffected at study entry based on their performance on one of two age-based omnibus language measures: (a) Test of Language Development–Primary: Second Edition (Newcomer & Hammill, 1988) from 4 to 5;11 and (b) the Clinical Evaluation of Language Fundamentals–Third Edition (Semel et al., 1995) from 6+ years. Participants with a standard score of ≤ 85 (i.e., ≤ 16th percentile) were classified as SLI affected, and those scoring at 86 or above were classified as unaffected. The SLI-affected group also included children recruited into the study based on performance in the affected range on the Test of Early Language Development–Third Edition (Hresko et al., 1999) from ages 2 to 3;11 or on the Test of Early Grammatical Impairment (Rice & Wexler, 2001) from 2;6 to 8;11, where performance at least 1 SD below the mean was considered affected. Using these criteria, of the full sample of N = 511 participants, 53% were classified as SLI affected and 47% were classified as unaffected.

Family Questionnaire

Maternal education was obtained from a family questionnaire at the first time of measurement, with the following categories: some high school, high school graduate or general equivalency diploma (GED), some college, bachelor's degree, some graduate school, and graduate degree.

Experimental GJ Tag Task

Table 1 shows the overall structure of the GJ Tag task as well as example items. The child listened to tag questions and responded with their judgment as to whether the sentence sounded “right” (grammatical) or “not so good” (ungrammatical). The task is composed of seven practice items that can be repeated if necessary for understanding, followed by 40 test items. The 40 test items include four categories of tag questions, each containing 10 items (five grammatical and five ungrammatical). The 40 GJ Tag questions (20 grammatical and 20 ungrammatical) were presented in the same randomized fashion across participants, with no more than three consecutive ungrammatical items. The task was administered via headphones using an audio recording of the sentences with a female voice in natural prosody. The examiner recorded the participant's oral judgments on a hard copy of the assessment document. Each session was video-recorded to ensure data integrity and to provide a backup of the participant's verbal response/judgments, if needed.
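The ordering constraint on the 40 test items can be illustrated with a short sketch. The study used one fixed item order for all participants, and the procedure used to create that order is not described here; the rejection-sampling shuffle below, along with the names `build_items`, `longest_ungrammatical_run`, and `constrained_order`, is an assumption made only to show what "no more than three consecutive ungrammatical items" means in practice.

```python
import random

CATEGORIES = ["Double Affirmative", "Double Negative", "Bad Agreement", "Be Do Form Class"]


def build_items():
    """Return 40 (category, is_grammatical) pairs: 5 grammatical + 5 ungrammatical per category."""
    return [(cat, gram) for cat in CATEGORIES for gram in (True, False) for _ in range(5)]


def longest_ungrammatical_run(items):
    """Length of the longest streak of consecutive ungrammatical items."""
    longest = current = 0
    for _, is_grammatical in items:
        current = 0 if is_grammatical else current + 1
        longest = max(longest, current)
    return longest


def constrained_order(seed=0, max_run=3):
    """Reshuffle until no more than max_run ungrammatical items are adjacent.

    Hypothetical procedure: the seed fixes one order that could then be
    reused for every participant.
    """
    rng = random.Random(seed)
    items = build_items()
    while True:
        rng.shuffle(items)
        if longest_ungrammatical_run(items) <= max_run:
            return items


order = constrained_order()
print(len(order), longest_ungrammatical_run(order) <= 3)
```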

Training and Reliability

Details regarding training and reliability are the same as those reported for the GJ Complex task (Rice et al., 2023). As described in that article, the examiners consisted of doctoral students and full-time laboratory staff who were not blind to the affectedness status of the children, but they did not report on the “correctness” of the items at the time of data collection (to avoid giving unintended cues about the accuracy of the child's response). Examiners received extensive training in administration and scoring of the standardized and experimental measures (including the GJ Tag task), followed by ongoing monitoring, all under the supervision of a lead examiner and study principal investigator (PI; the first author). As part of their training, examiners (a) read the test manuals for standardized measures and internal documentation on the GJ Tag task; (b) viewed videos of test administration on study children; (c) practiced administration of all tasks on an experienced laboratory staff member until they were knowledgeable about, and at ease, in administering the tasks; (d) practiced administration on at least three typically developing children not enrolled in the study while the lead examiner and PI observed and provided feedback afterwards; (e) viewed an experienced examiner administer the full protocol to a study child in the field; and (f) administered the full protocol to a study child in the field under the supervision of an experienced examiner. Training steps were repeated if necessary. Regular validity checks were performed by trained research assistants who viewed videos of data collection and noted any deviations from the standard protocol. Sessions were regularly observed in the field by the lead examiner to ensure consistency in administration across examiners. 
For the GJ Tag task, each test form was reviewed for completion by the examiner who collected the measure and confirmed by a second examiner, as only the participant's response (and not the accuracy) was noted on the form. If an item response was missing, the examiner reviewed video footage from the session to obtain the participant's verbal response. Undergraduate student workers entered participant responses into SPSS as “R” or “N” for each item. Regular data audits were performed by the data manager to monitor and ensure integrity of the data.

Results

Calculation of A′ Dependent Variables for the GJ Tag Task

In our lab, following precedents in the literature on GJ tasks, the outcome variables are calculated according to a formula for A′ (Linebarger et al., 1983): $A' = 0.5 + \frac{(y - x)(1 + y - x)}{4y(1 - x)}$, in which x is the proportion of false alarms (i.e., number of “right” responses to ungrammatical items divided by the number of “right” responses to both ungrammatical and grammatical items) and y is the proportion of hits (i.e., number of “right” responses to grammatical items divided by number of “right” responses to both ungrammatical and grammatical items). A′ is preferred because it adjusts for the tendency of children to provide affirmative responses and is interpreted as the proportion correct in a two-alternative (i.e., grammatical vs. ungrammatical) forced-choice procedure (Green, 1964; Grier, 1971). For validity of the assumptions, it is important that an equal number of grammatical and ungrammatical items be entered into the formula above, and complete responses to all items are necessary; we met these assumptions by only working with complete data. The highest possible value is 1.00, indicating perfect discrimination of grammatical from ungrammatical items. An A′ of .50 could indicate responses of “right” to both item types, and an A′ of less than .50 could indicate predominately “not so good” responses to both item types.
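As a minimal sketch, the A′ calculation can be written as a function of the two proportions, with y as the proportion of hits and x as the proportion of false alarms, both taken as already computed. The function name `a_prime` and the handling of edge cases (returning .50 when y equals x, and raising an error when the denominator is zero) are assumptions for illustration, not the scoring procedure used in the study.

```python
def a_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """A' = 0.5 + (y - x)(1 + y - x) / (4y(1 - x)), y = hits, x = false alarms."""
    y, x = hit_rate, false_alarm_rate
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("proportions must lie in [0, 1]")
    if y == x:
        # The numerator is zero, so A' is .50 (no discrimination).
        # Assumption: also return .50 when y = x = 0 or y = x = 1,
        # where the denominator is zero as well.
        return 0.5
    denominator = 4.0 * y * (1.0 - x)
    if denominator == 0.0:
        # Assumption: treat y = 0 or x = 1 (with y != x) as undefined.
        raise ValueError("A' is undefined when y = 0 or x = 1")
    return 0.5 + (y - x) * (1.0 + y - x) / denominator


print(a_prime(1.0, 0.0))            # 1.0
print(round(a_prime(0.8, 0.2), 3))  # 0.875
```

For example, with y = .80 and x = .20, the numerator is (.60)(1.60) = .96 and the denominator is 4(.80)(.80) = 2.56, giving A′ = .50 + .375 = .875.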

A′ outcomes for this study are reported in Table 2, per item type and child group (unaffected/SLI affected). There were 4,718 administrations of the GJ Tag task across the four A′ outcomes; this resulted in a total of 18,872 data points. Due to the longitudinal testing, children enter the table calculations at least 3 times, with an average of nine occasions each. The younger age groups contain the highest number of participants, peaking at 8 years of age and diminishing thereafter.

Table 2.

Group means and standard deviations for each grammaticality judgment (GJ) A′ outcome per age level.

Columns, left to right: GJ outcome; age level (years;months); unaffected group (children n = 240): n, M, SD; SLI-affected group (children n = 271): n, M, SD.
Double Affirmative 5–5;11 179 0.52 0.24 145 0.49 0.22
Double Affirmative 6–6;11 229 0.57 0.21 234 0.52 0.20
Double Affirmative 7–7;11 251 0.60 0.23 308 0.52 0.20
Double Affirmative 8–8;11 272 0.71 0.24 326 0.53 0.24
Double Affirmative 9–9;11 175 0.82 0.19 217 0.58 0.23
Double Affirmative 10–10;11 157 0.85 0.19 201 0.63 0.21
Double Affirmative 11–11;11 156 0.88 0.16 191 0.69 0.24
Double Affirmative 12–12;11 154 0.90 0.15 182 0.72 0.23
Double Affirmative 13–13;11 150 0.92 0.13 166 0.77 0.21
Double Affirmative 14–14;11 126 0.94 0.10 150 0.79 0.23
Double Affirmative 15–15;11 116 0.94 0.10 125 0.82 0.20
Double Affirmative 16–16;11 95 0.97 0.06 110 0.84 0.19
Double Affirmative 17–17;11 83 0.97 0.08 91 0.83 0.19
Double Affirmative 18–18;11 63 0.95 0.10 66 0.84 0.21
Double Negative 5–5;11 179 0.54 0.27 145 0.53 0.25
Double Negative 6–6;11 229 0.63 0.25 234 0.52 0.27
Double Negative 7–7;11 251 0.70 0.26 308 0.58 0.25
Double Negative 8–8;11 272 0.79 0.22 326 0.61 0.26
Double Negative 9–9;11 175 0.82 0.24 217 0.67 0.22
Double Negative 10–10;11 157 0.84 0.23 201 0.68 0.24
Double Negative 11–11;11 156 0.86 0.22 191 0.67 0.25
Double Negative 12–12;11 154 0.88 0.19 182 0.70 0.29
Double Negative 13–13;11 150 0.89 0.21 166 0.72 0.29
Double Negative 14–14;11 126 0.89 0.21 150 0.71 0.29
Double Negative 15–15;11 116 0.94 0.12 125 0.73 0.29
Double Negative 16–16;11 95 0.93 0.16 110 0.77 0.28
Double Negative 17–17;11 83 0.96 0.12 91 0.75 0.28
Double Negative 18–18;11 63 0.93 0.17 66 0.75 0.28
Bad Agreement 5–5;11 179 0.66 0.27 145 0.59 0.24
Bad Agreement 6–6;11 229 0.85 0.23 234 0.66 0.26
Bad Agreement 7–7;11 251 0.93 0.14 308 0.80 0.24
Bad Agreement 8–8;11 272 0.97 0.10 326 0.90 0.17
Bad Agreement 9–9;11 175 0.98 0.09 217 0.91 0.18
Bad Agreement 10–10;11 157 0.98 0.07 201 0.94 0.14
Bad Agreement 11–11;11 156 0.99 0.04 191 0.95 0.12
Bad Agreement 12–12;11 154 0.99 0.05 182 0.95 0.15
Bad Agreement 13–13;11 150 0.99 0.05 166 0.96 0.09
Bad Agreement 14–14;11 126 1.00 0.02 150 0.98 0.08
Bad Agreement 15–15;11 116 0.99 0.03 125 0.97 0.10
Bad Agreement 16–16;11 95 1.00 0.02 110 0.98 0.06
Bad Agreement 17–17;11 83 0.99 0.03 91 0.97 0.08
Bad Agreement 18–18;11 63 0.99 0.04 66 0.98 0.04
Be Do Form Class 5–5;11 179 0.60 0.29 145 0.49 0.25
Be Do Form Class 6–6;11 229 0.82 0.23 234 0.62 0.26
Be Do Form Class 7–7;11 251 0.92 0.16 308 0.75 0.26
Be Do Form Class 8–8;11 272 0.96 0.12 326 0.85 0.19
Be Do Form Class 9–9;11 175 0.98 0.08 217 0.88 0.19
Be Do Form Class 10–10;11 157 0.98 0.09 201 0.91 0.16
Be Do Form Class 11–11;11 156 0.99 0.04 191 0.93 0.15
Be Do Form Class 12–12;11 154 0.99 0.05 182 0.92 0.18
Be Do Form Class 13–13;11 150 0.99 0.06 166 0.96 0.10
Be Do Form Class 14–14;11 126 0.99 0.03 150 0.97 0.09
Be Do Form Class 15–15;11 116 0.99 0.03 125 0.96 0.10
Be Do Form Class 16–16;11 95 0.99 0.03 110 0.98 0.06
Be Do Form Class 17–17;11 83 0.99 0.04 91 0.97 0.13
Be Do Form Class 18–18;11 63 1.00 0.02 66 0.97 0.10

Note. Children were tested twice per year through 8 years of age and annually thereafter. Calculations are per age level, with children entering the table more than once due to longitudinal testing. SLI = specific language impairment.

Descriptively, the unaffected group shows higher average A′ scores across age for all four outcomes when compared to the SLI-affected group. The A′ means for the unaffected group are at ceiling level (A′ = 1.00) at ages 14 and 16 years for Bad Agreement and 18 years for Be Do Form Class. For the Double Affirmative and Double Negative outcomes, the A′ means approach, but do not quite reach, 1.00 at ages 16 and 17 years. Conversely, the SLI group is consistently at lower levels of performance across the full age range for Double Affirmative and Double Negative items, per our expectations. For Bad Agreement and Be Do Form Class, the SLI group trails the unaffected group at younger ages but appears to catch up by 16 years of age. These descriptive differences are shown in Figure 1, in which the means per outcome were estimated via saturated means models.

Figure 1.

Line graph of A′ (0.0 to 1.0) against age (0 to 18 years). Eight lines are plotted: Bad Agreement, Be Do Form Class, Double Affirmative, and Double Negative, each for the unaffected group and for the specific language impairment group. All lines start at age 5 years with A′ between 0.4 and 0.7, rise, and end at age 18 years with A′ between 0.8 and 1.0.

A′ performance across age for unaffected versus SLI-affected groups for the four grammatical judgment tag outcomes. A′ values are saturated means generated from separate statistical models placed in the same figure for visual comparison. SLI = specific language impairment; Unaff = unaffected.

Predicting Grammatical Judgment Outcomes

The predictors of A′ scores included child age and group (i.e., SLI vs. unaffected group) and the covariates (nonverbal IQ scores, maternal education, child sex), following those from previous reports on the 25-year longitudinal study (Rice et al., 2023; Rice & Hoffman, 2015). First, the time-varying predictor of children's exact age at each occasion, hereafter referred to as “time,” was log transformed and centered at log-age 10 years. This allowed for the linear age slope to approximate the observed exponential-type growth trends (Singer & Willett, 2003). Second, because children in this accelerated longitudinal design ranged in age from 5;0 to 16;9 at their first GJ assessment, 64% of the variation in log-age reflected cross-sectional age differences. Accordingly, a time-invariant continuous predictor, hereafter referred to as age cohort (centered at log-age 8 years), was used to model the differential effects of cross-sectional age. Finally, other time-invariant predictors included SLI affectedness (0 = unaffected, 1 = affected) and covariates of child nonverbal IQ (a standard score centered at 100), child sex (0 = girl, 1 = boy), and maternal education (some high school, high school graduate or GED, some college, bachelor's degree, some graduate school, or graduate degree). Given preliminary analyses showing no improvement in prediction as a categorical predictor, maternal education was modeled as an interval predictor with a linear slope in the models reported below (centered such that 0 = high school graduate or GED).
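The centering of the two age predictors can be sketched as follows. This is an illustration only: the natural logarithm is an assumption (the text says only "log transformed"), and the helper names `age_from_years_months` and `center_log_age` are hypothetical.

```python
import math


def age_from_years_months(label: str) -> float:
    """Convert a 'years;months' label such as '16;9' to decimal years."""
    years, months = label.split(";")
    return int(years) + int(months) / 12.0


def center_log_age(age_years: float, center_years: float) -> float:
    """Log-age centered at a reference age (ASSUMPTION: natural log)."""
    if age_years <= 0 or center_years <= 0:
        raise ValueError("ages must be positive")
    return math.log(age_years) - math.log(center_years)


if __name__ == "__main__":
    # Time-varying predictor ("time"): exact age at each occasion, centered at 10 years.
    for age in (5.0, 10.0, 18.0):
        print(age, round(center_log_age(age, 10.0), 3))

    # Time-invariant predictor (age cohort): age at first GJ assessment, centered at 8 years.
    first_age = age_from_years_months("16;9")
    print(first_age, round(center_log_age(first_age, 8.0), 3))
```

With this coding, a value of 0 on time corresponds to age 10 years and a value of 0 on age cohort corresponds to study entry at age 8 years, which is what gives the model intercepts their interpretation.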

Table 3 shows predicted means, standard errors, and confidence intervals for the time-invariant predictors at the initial assessment for the 240 unaffected and 271 SLI-affected children. The means and group comparisons were provided via Stata Mixed Version 17 (StataCorp, 2021) using models (estimated with residual maximum likelihood and Satterthwaite denominator degrees of freedom) that included a random intercept variance component for the relatedness of children within families and a single predictor for SLI affectedness. SLI-affected children scored significantly lower on omnibus language, consistent with the definition of affectedness here (omnibus language ≤ 85 and nonverbal IQ > 82), and as reported previously also scored lower on nonverbal IQ and had lower levels of maternal education (Rice et al., 2023). More precise estimates of longitudinal group differences were then obtained from multilevel models of growth across age in the four types of grammatical judgments to tag questions, as described next.

Table 3.

Predicted means (M), standard errors (SE), and confidence intervals (CI) for key variables for unaffected and specific language impairment (SLI)–affected participants on initial assessment.

Variable | Unaffected (n = 240): M | SE | 95% CI LL | 95% CI UL | SLI affected (n = 271): M | SE | 95% CI LL | 95% CI UL | Group comparisons: t | df | Cohen's d
First GJ age years (a) | 7.52 | 0.17 | 7.19 | 7.86 | 7.55 | 0.16 | 7.24 | 7.87 | 0.12 | 506 | 0.01
Omnibus language (b) | 100.46 | 0.62 | 99.24 | 101.67 | 78.56 | 0.59 | 77.41 | 79.72 | 26.78 | 506 | −2.38
Nonverbal IQ (c) | 104.07 | 0.74 | 102.62 | 105.52 | 98.93 | 0.70 | 97.54 | 100.31 | 5.49 | 491 | −0.50
Maternal education (d) | 3.49 | 0.09 | 3.32 | 3.66 | 3.34 | 0.09 | 3.17 | 3.51 | 3.46 | 277 | −0.42

Note. Bold values indicate p < .01. LL = lower limit; UL = upper limit; GJ = grammaticality judgment.

a

Age in years at first GJ Tag assessment.

b

First available of Test of Language Development–Primary: Second Edition or Clinical Evaluation of Language Fundamentals–Third Edition.

c

Columbia Mental Maturity Scale or Wechsler Intelligence Scale for Children–Third Edition or Wechsler Adult Intelligence Scale–Third Edition (first available starting at age 5 years).

d

Coded where 1 = some high school, no diploma; 2 = high school graduate, diploma, or general equivalency diploma; 3 = some college, no degree; 4 = bachelor's degree; 5 = some graduate work; 6 = graduate degree.

Model Specification

Examination of Ceiling Effects and Residual Variances

Given that preliminary descriptive statistics suggested the presence of ceiling effects, we examined the percentage of observations with an A′ at ceiling (i.e., A′ = 1.00) across each age level and outcome. At age 10 years, the Bad Agreement and Be Do Form Class outcomes showed the highest percentage of observations at ceiling: 78%–83%. Ceiling effects were present, although less pronounced, for Double Affirmative and Double Negative outcomes: 23%–34%. Consequently, given that the use of a standard prediction model may result in expected outcomes that exceed the possible range of the A′ scale, we followed the advice given by Wang et al. (2009) to use a Tobit model instead.
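The ceiling check described above amounts to counting the share of observations with A′ = 1.00 within each age level and outcome. A minimal sketch, using hypothetical scores and a hypothetical function name:

```python
from typing import Sequence


def percent_at_ceiling(scores: Sequence[float], ceiling: float = 1.0,
                       tol: float = 1e-9) -> float:
    """Percentage of observations at (or numerically at) the ceiling value."""
    if len(scores) == 0:
        raise ValueError("scores must not be empty")
    n_at_ceiling = sum(1 for s in scores if s >= ceiling - tol)
    return 100.0 * n_at_ceiling / len(scores)


if __name__ == "__main__":
    # Hypothetical A' scores for one outcome at one age level.
    example = [1.0, 1.0, 0.94, 1.0, 0.88, 1.0]
    print(round(percent_at_ceiling(example), 1))
```

A large share of observations at the ceiling is what motivates treating the outcome as censored from above rather than as an ordinary continuous variable.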

Model Selection

Two-level Tobit longitudinal models in which time (Level 1) was nested within persons (Level 2) were used to examine growth over time in each of the four outcomes of grammatical judgments to tag questions. Models were estimated in Mplus Version 8.8 (Muthén & Muthén, 1998–2017) using full-information robust marginal maximum likelihood with 7 points of numeric integration per random effect dimension. Start values were used to aid in convergence as needed. Given this complexity of estimation, a clustered sampling correction was used to account for the dependency of children from the same family. The four grammatical judgment measures were each predicted as censored outcomes in separate models using Tobit regression, in which the continuous dependent variable was predicted on its original scale as censored from above (for the A′ ceiling of 1.00). Under the assumption of a conditional normal distribution (for the time-specific residuals here), the Tobit model uses the proportion of observations past the censoring point to inform what the model parameters would be if the outcome were not censored by limitations of measurement (Long, 1997; Twisk & Rijmen, 2009; Wang et al., 2009). The significance of individual fixed effects was evaluated via their Wald test p values, whereas the significance of multiple fixed effects or of random effects variances and covariances was evaluated via −2ΔLL tests (i.e., likelihood ratio tests using degrees of freedom equal to the difference in the number of estimated parameters).
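To make the censoring logic concrete, the sketch below gives the log-likelihood of a deliberately simplified, single-level Tobit model censored from above at 1.00, together with the expected observed value implied by a latent mean. It is not the two-level model estimated in Mplus: there are no random effects, and the function names and the residual standard deviation used in the example are hypothetical.

```python
import math
from typing import Sequence


def normal_pdf(z: float) -> float:
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_loglik_upper(y: Sequence[float], mu: Sequence[float],
                       sigma: float, ceiling: float = 1.0) -> float:
    """Log-likelihood for outcomes censored from above at `ceiling`."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    if len(y) != len(mu):
        raise ValueError("y and mu must have the same length")
    total = 0.0
    for y_i, mu_i in zip(y, mu):
        if y_i >= ceiling:
            # Censored: contributes P(latent outcome >= ceiling).
            z = (ceiling - mu_i) / sigma
            total += math.log(normal_cdf(-z))
        else:
            # Uncensored: contributes the normal density at y_i.
            z = (y_i - mu_i) / sigma
            total += math.log(normal_pdf(z)) - math.log(sigma)
    return total


def expected_censored_mean(mu: float, sigma: float, ceiling: float = 1.0) -> float:
    """E[min(latent, ceiling)] for a normal latent outcome."""
    z = (ceiling - mu) / sigma
    return mu * normal_cdf(z) - sigma * normal_pdf(z) + ceiling * normal_cdf(-z)


if __name__ == "__main__":
    # Hypothetical values: a latent mean above 1.00 still implies an
    # expected observed score below the ceiling.
    print(round(expected_censored_mean(mu=1.31, sigma=0.30), 3))
```

Because the fixed effects describe the latent, uncensored outcome, an estimated intercept above 1.00 is not a contradiction under this model; the expected observed score remains bounded by the ceiling.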

Total R2 was calculated for each model as the squared correlation between the actual outcomes and the outcomes predicted by the model fixed effects (i.e., analogous to R2 in a single-level regression; Hoffman, 2015). These were calculated using predicted outcomes from models estimated using Stata METOBIT, which are equivalent to the Tobit models estimated within Mplus. Similarly, figures showing predicted outcomes from the Tobit models were generated using the MARGINS predict(ystar(.,1)) postestimation command in Stata to obtain estimates at each age, with predicted A′ values truncated at the censoring point of 1.00 for the purpose of visual display.
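Total R2 as defined here is the squared Pearson correlation between observed outcomes and the outcomes predicted from the fixed effects. A minimal sketch with hypothetical data and a hypothetical function name:

```python
from typing import Sequence


def total_r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Squared Pearson correlation between actual and predicted outcomes."""
    n = len(actual)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0.0 or var_p == 0.0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return (cov * cov) / (var_a * var_p)


if __name__ == "__main__":
    # Hypothetical observed A' scores and fixed-effect predictions.
    observed = [0.52, 0.61, 0.70, 0.85, 0.93]
    fitted = [0.55, 0.60, 0.74, 0.80, 0.90]
    print(round(total_r_squared(observed, fitted), 3))
```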

To supplement total R2 as a general effect size, we also provide partial Cohen's d in a standardized mean difference metric for significant group differences (e.g., SLI affected vs. unaffected), as well as partial eta (η) in a correlation metric for significant slopes for continuous predictors (e.g., nonverbal IQ). “Partial” refers to the unique contribution of each predictor after controlling for its overlap with other predictors in the model relative to the amount of unexplained variance. Effect sizes were calculated from the test statistics for the fixed effects (i.e., each estimate divided by its standard error) after approximating the denominator degrees of freedom (df) as the number of participants minus the number of fixed effects, using the formulas below (Darlington & Hayes, 2016).

$d = \frac{2 \times t_{\text{value}}}{\sqrt{df}}$
$\eta = \frac{t_{\text{value}}}{\sqrt{t_{\text{value}}^{2} + df}}$
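The two effect-size conversions, d = 2t / √df and η = t / √(t² + df), can be sketched as below. The function names are hypothetical, and the example t value and the count of 19 fixed effects are assumptions chosen for illustration rather than values reported by the authors.

```python
import math


def approx_df(n_participants: int, n_fixed_effects: int) -> int:
    """Approximate denominator df: participants minus fixed effects."""
    df = n_participants - n_fixed_effects
    if df <= 0:
        raise ValueError("df must be positive")
    return df


def partial_d(t_value: float, df: float) -> float:
    """Partial Cohen's d from a t statistic."""
    return 2.0 * t_value / math.sqrt(df)


def partial_eta(t_value: float, df: float) -> float:
    """Partial eta (correlation metric) from a t statistic."""
    return t_value / math.sqrt(t_value ** 2 + df)


if __name__ == "__main__":
    # ASSUMPTION: 511 participants and 19 fixed effects; t = -11.0 is hypothetical.
    df = approx_df(511, 19)
    print(df, round(partial_d(-11.0, df), 2), round(partial_eta(-11.0, df), 2))
```

Both conversions preserve the sign of the test statistic, so a negative group difference yields a negative d and a negative η.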

The modeling sequence included the following steps for each of the four A′ outcomes: (1) an empty means, random intercept only model; (2) add fixed and random linear and quadratic effects of time-varying log-age as time, forming an unconditional growth model; (3) add fixed effects of age cohort to the unconditional growth model; (4) add SLI group and its interactions with time, forming a conditional growth model; (5) add the time-invariant covariates of initial child nonverbal IQ, maternal education, and child sex, as well as their interactions with time; and (6) add interactions between SLI and each covariate, as well as SLI by covariate by time, to examine potential moderation of the SLI effects by the covariates. Total R2 was calculated for each model as well as R2 change between successive models.

On Steps 5 and 6 above, the models for the Bad Agreement outcome initially struggled to converge, and the results indicated that the random intercept–quadratic time slope covariance was approaching a correlation of −1.00. To maintain a positive definite solution, we used MODEL CONSTRAINT to calculate correlations for each of the three random effect covariances and to constrain the intercept–quadratic correlation to be > −.99. The models converged thereafter without issues.

Question 1: What Is the Overall Baseline Growth Trajectory for Each Outcome Across All Children?

Baseline (Unconditional) Growth Models

The A′ means across age for each grammatical judgment outcome were first modeled for descriptive purposes, separately per outcome, using a saturated means, random intercept model on age integer instead of exact age, using Stata Mixed. The average A′ score approached ceiling levels for Bad Agreement and Be Do Form Class relatively early (i.e., by 9–10 years of age), as compared to Double Affirmative and Double Negative outcomes, in which growth is visible across the full age span and the predicted average A′ score remains below ceiling at age 18 years.

Unconditional growth models were then estimated for each outcome. Fixed and random higher order linear and quadratic effects of time were examined and retained when significant, followed by effects of age cohort. The total R2 and percentage of additional variance in total R2 after adding predictors to each successive model are shown in Table 4. Results from the final unconditional growth models are shown in Table 5. As shown, effects of age cohort on the intercept, linear, and quadratic time slopes, as well as a quadratic age cohort effect on the intercept, were examined. The only significant cohort moderation effect was for the Double Affirmative outcome, in which there was an interaction between linear age cohort and the quadratic time slope. In this model, the quadratic time slope for children at the age cohort centering point (study entry at 8 years) was small and not statistically significant. This slope was moderated by the linear age cohort effect, such that children who were older at study entry showed greater deceleration of change over time. As shown in Table 4 (Model 3), the effects of age cohort on total R2 were minimal, accounting for less than or equal to 1.10% of additional variance in the A′ outcomes. These outcomes constitute the base models for the subsequent evaluation of the contributions of predictors, as reported below.

Table 4.

Total R2 and R2 change (Δ) for unconditional and conditional longitudinal models.

Modeling sequence | Double Affirmative: Total R2 | ΔR2 | Double Negative: Total R2 | ΔR2 | Bad Agreement: Total R2 | ΔR2 | Be Do Form Class: Total R2 | ΔR2
1. Empty means, random intercept-only model | | | | | | | |
2. Unconditional growth: add fixed and random effects of time | .266 | | .113 | | .286 | | .310 |
3. Unconditional growth: add cohort effects | .277 | 1.10% (2 vs. 3) | .122 | 0.90% (2 vs. 3) | .285 | −0.10% (2 vs. 3) | .311 | 0.10% (2 vs. 3)
4. Conditional growth: add SLI and SLI × Time | .358 | 8.10% (3 vs. 4) | .198 | 7.60% (3 vs. 4) | .311 | 2.60% (3 vs. 4) | .351 | 4.00% (3 vs. 4)
5. Conditional growth: add covariates and Covariates × Time | .365 | 0.70% (4 vs. 5) | .201 | 0.30% (4 vs. 5) | .320 | 0.90% (4 vs. 5) | .359 | 0.80% (4 vs. 5)
6. Conditional growth: add SLI × Covariates and SLI × Covariates × Time | .366 | 0.10% (5 vs. 6) | .205 | 0.40% (5 vs. 6) | .317 | −0.30% (5 vs. 6) | .360 | 0.10% (5 vs. 6)

Note. Model 1 is shown above since it is the first in the sequence, but there is no R2, as it is an empty model. ΔR2 = change in R2 between successive models in the sequence, multiplied by 100 to show the percentage change in total R2 after adding predictors to each model. SLI = specific language impairment.

Table 5.

Unconditional growth model parameters.

Columns, left to right: parameter; Double Affirmative (Total R2 = .277): Est, SE, 95% CI LL, 95% CI UL, p <; Double Negative (Total R2 = .122): Est, SE, 95% CI LL, 95% CI UL, p <.
Fixed effects
Intercept (age 10 years) 0.72 0.01 0.70 0.75 .001 0.77 0.02 0.73 0.81 .001
Linear time slope 0.62 0.04 0.54 0.70 .001 0.49 0.05 0.39 0.58 .001
 Quadratic time slope 0.07 0.08 −0.08 0.22 .357 0.02 0.11 −0.18 0.23 .830
 Linear age cohort on intercept −0.03 0.05 −0.12 0.07 .588 −0.04 0.05 −0.14 0.07 .486
 Linear age cohort on linear time −0.08 0.13 −0.33 0.18 .554 −0.03 0.16 −0.35 0.29 .862
Linear age cohort on quadratic time −0.73 0.18 −1.09 −0.38 .001 0.11 0.25 −0.38 0.60 .651
 Quadratic age cohort on intercept 0.15 0.13 −0.11 0.42 .247 0.33 0.17 −0.01 0.66 .054
Random effects: variance components
L1 residual 0.04 0.00 0.04 0.04 .001 0.07 0.00 0.06 0.07 .001
L2 intercept 0.04 0.00 0.04 0.05 .001 0.07 0.01 0.05 0.08 .001
L2 linear time 0.11 0.01 0.08 0.14 .001 0.16 0.02 0.12 0.20 .001
L2 quadratic time 0.23 0.05 0.13 0.33 .001 0.30 0.08 0.14 0.45 .001
L2 intercept–linear time 0.04 0.01 0.03 0.05 .001 0.07 0.01 0.05 0.09 .001
L2 intercept–quadratic time −0.08 0.01 −0.10 −0.05 .001 −0.07 0.02 −0.10 −0.03 .001
 L2 linear–quadratic time 0.00 0.02 −0.04 0.03 .845 0.04 0.04 −0.03 0.11 .249
Columns, left to right: parameter; Bad Agreement (Total R2 = .285): Est, SE, 95% CI LL, 95% CI UL, p <; Be Do Form Class (Total R2 = .311): Est, SE, 95% CI LL, 95% CI UL, p <.
Fixed effects
Intercept (age 10 years) 1.31 0.02 1.27 1.36 .001 1.24 0.03 1.19 1.29 .001
Linear time slope 0.81 0.06 0.70 0.93 .001 0.85 0.06 0.74 0.96 .001
Quadratic time slope −1.00 0.12 −1.24 −0.75 .001 −0.85 0.13 −1.11 −0.60 .001
 Linear age cohort on intercept −0.01 0.08 −0.16 0.14 .912 0.05 0.08 −0.12 0.21 .587
 Linear age cohort on linear time 0.30 0.18 −0.06 0.65 .098 0.06 0.17 −0.28 0.39 .745
 Linear age cohort on quadratic time −0.54 0.29 −1.11 0.04 .067 −0.33 0.35 −1.01 0.35 .340
 Quadratic age cohort on intercept 0.06 0.18 −0.29 0.41 .745 0.49 0.22 0.06 0.92 .027
Random effects: variance components
L1 residual 0.08 0.00 0.07 0.08 .001 0.07 0.00 0.06 0.08 .001
L2 intercept 0.09 0.01 0.07 0.12 .001 0.11 0.01 0.09 0.13 .001
L2 linear time 0.07 0.02 0.03 0.10 .001 0.09 0.02 0.05 0.14 .001
L2 quadratic time 0.27 0.07 0.14 0.41 .001 0.39 0.08 0.24 0.55 .001
 L2 intercept–linear time 0.01 0.01 −0.01 0.04 .186 0.03 0.01 0.01 0.05 .015
L2 intercept–quadratic time −0.15 0.03 −0.21 −0.10 .001 −0.19 0.03 −0.25 −0.14 .001
 L2 linear–quadratic time −0.02 0.03 −0.08 0.03 .453 −0.01 0.03 −0.06 0.04 .577

Note. The time-varying predictor of age (“time”) and the effects of age cohort were log-transformed. Time was centered at log-age 10 years, so the intercept represents the average A′ outcome at 10 years of age. The estimates for age cohort represent the cross-sectional effects of children's exact age at study entry, centered at log-age 8 years. Bold values indicate p < .01. Covariances are indicated with dash (−). SE = standard error; CI = confidence interval; LL = lower limit; UL = upper limit; Est = estimate; L1 = Level 1: variance over time and within persons; L2 = Level 2: variance across persons within families.

Questions 2 and 3 (Conditional Models)

2. Do SLI-Affected Children Differ From Unaffected Children in Their Level of Performance at the Intercept (Age 10 Years) and in Their Growth Over Time for Each Outcome?

Next, the main effect of SLI and its interactions with time were added to each model, resulting in increases in total R2 ranging from 2.60% to 8.10% (see Table 4, Model 4). As shown in Figure 2, the main effect of SLI group was a significant predictor of the intercept for all four grammatical judgment outcomes, such that SLI-affected children at age 10 years had lower A′ by approximately .29 on average. Affected children also had a significantly weaker linear time slope for Double Affirmative and Double Negative outcomes at age 10 years, although for Double Affirmative, this slope became significantly more positive with age. Even so, the affected group was still predicted to be significantly lower than the unaffected group at age 18 years on the Double Affirmative and Double Negative outcomes, as noted above. For Bad Agreement and Be Do Form Class, affected children had a nonsignificantly stronger linear time slope at age 10 years than unaffected children. This linear time slope became significantly more positive with age, resulting in a closing of the gap between the groups at the older ages for the Bad Agreement and Be Do Form Class outcomes, as noted earlier.

Figure 2.

Four line graphs with error bars, each plotting A′ (0.0 to 1.0) against age (5 to 18 years). Graph 1, Double Affirmative: the unaffected line begins at (5, 0.25), rises, and ends at (18, 0.95); the SLI-affected line begins at (5, 0.35), rises, and ends at (18, 0.85). Graph 2, Double Negative: the unaffected line begins at (5, 0.4), rises, and ends at (18, 0.9); the SLI-affected line begins at (5, 0.45), rises, and ends at (18, 0.75). Graph 3, Bad Agreement: the unaffected line begins at (5, 0.4), rises, and ends at (18, 0.97); the SLI-affected line begins at (5, 0.23), rises, and ends at (18, 0.96). Graph 4, Be Do Form Class: the unaffected line begins at (5, 0.35), rises, and ends at (18, 0.95); the SLI-affected line begins at (5, 0.25), rises, and ends at (18, 0.95).

Effects of SLI affectedness on the four grammatical judgment A′ outcomes: predicted means and standard errors. SLI = specific language impairment.

3. Do Covariates of Child Nonverbal IQ, Maternal Education, and Child Sex Significantly Predict Level of Performance and Rate of Change in Each Outcome?

Table 6 shows the final model results after adding covariates of initial child nonverbal IQ, maternal education, and child sex, along with their interactions with linear and quadratic time to each grammatical judgment model. Figure 3 displays the growth trajectories for SLI-affected and unaffected children per outcome from the final models. As shown, the SLI group effects on the intercept, linear, and quadratic time slopes were unchanged. The covariates and their interactions with time accounted for a relatively small amount of additional variance (i.e., 0.90% or less; see Table 4, Model 5). As shown by the Cohen's d effect sizes in Table 7, after accounting for all other predictors in the model, the average predicted A′ in the SLI-affected group at log-age 10 years was 0.66–0.99 SDs lower than the average predicted A′ in the unaffected group.

Table 6.

Conditional growth model parameters.

Columns, left to right: parameter; Double Affirmative (Total R2 = .365): Est, SE, 95% CI LL, 95% CI UL, p <; Double Negative (Total R2 = .201): Est, SE, 95% CI LL, 95% CI UL, p <.
Fixed effects
Intercept (age 10 years) 0.84 0.02 0.80 0.88 .001 0.94 0.03 0.88 1.00 .001
Linear time slope 0.70 0.05 0.60 0.81 .001 0.69 0.07 0.56 0.81 .001
 Quadratic time slope −0.04 0.12 −0.27 0.19 .722 −0.05 0.16 −0.35 0.26 .769
 Linear age cohort on intercept 0.00 0.04 −0.09 0.09 .996 −0.03 0.06 −0.14 0.09 .644
 Linear age cohort on linear time 0.01 0.13 −0.24 0.26 .930 0.09 0.16 −0.22 0.41 .567
Linear age cohort on quadratic time −0.82 0.18 −1.18 −0.46 .001 −0.02 0.26 −0.52 0.49 .953
 Quadratic age cohort on intercept −0.01 0.12 −0.25 0.23 .925 0.09 0.17 −0.23 0.42 .583
SLI on intercept −0.21 0.02 −0.25 −0.17 .001 −0.24 0.03 −0.30 −0.19 .001
SLI on linear time −0.21 0.04 −0.29 −0.12 .001 −0.30 0.05 −0.40 −0.20 .001
SLI on quadratic time 0.31 0.09 0.14 0.48 .001 0.12 0.11 −0.10 0.35 .287
 NVIQ on intercept 0.00 0.00 0.00 0.00 .062 0.00 0.00 0.00 0.00 .490
 NVIQ on linear time 0.00 0.00 0.00 0.01 .035 0.00 0.00 −0.01 0.00 .953
 NVIQ on quadratic time 0.00 0.00 −0.01 0.01 .440 0.00 0.01 −0.01 0.01 .760
 M-Ed on intercept 0.02 0.01 0.00 0.04 .025 0.02 0.01 0.00 0.04 .042
 M-Ed on linear time 0.01 0.02 −0.02 0.05 .411 0.02 0.02 −0.02 0.06 .313
 M-Ed on quadratic time −0.05 0.03 −0.12 0.01 .103 −0.05 0.04 −0.13 0.02 .144
BvG on intercept −0.04 0.02 −0.08 0.00 .053 −0.07 0.03 −0.12 −0.02 .005
 BvG on linear time 0.03 0.05 −0.06 0.11 .575 −0.07 0.05 −0.16 0.03 .161
 BvG on quadratic time 0.01 0.09 −0.17 0.19 .921 0.06 0.10 −0.14 0.25 .575
Random effects: variance components
L1 residual 0.04 0.00 0.04 0.04 .001 0.07 0.00 0.06 0.07 .001
L2 intercept 0.03 0.00 0.02 0.03 .001 0.05 0.01 0.04 0.06 .001
L2 linear time 0.10 0.01 0.07 0.12 .001 0.12 0.02 0.09 0.16 .001
L2 quadratic time 0.19 0.05 0.10 0.28 .001 0.26 0.08 0.12 0.41 .001
L2 intercept–linear time 0.02 0.00 0.02 0.03 .001 0.05 0.01 0.03 0.06 .001
L2 intercept–quadratic time −0.05 0.01 −0.07 −0.03 .001 −0.05 0.02 −0.08 −0.02 .001
 L2 linear–quadratic time 0.02 0.02 −0.02 0.05 .323 0.05 0.03 −0.02 0.11 .145
Columns, left to right: parameter; Bad Agreement (Total R2 = .320): Est, SE, 95% CI LL, 95% CI UL, p <; Be Do Form Class (Total R2 = .359): Est, SE, 95% CI LL, 95% CI UL, p <.
Fixed effects
Intercept (age 10 years) 1.44 0.03 1.38 1.51 .001 1.44 0.04 1.36 1.51 .001
Linear time slope 0.76 0.07 0.63 0.88 .001 0.84 0.08 0.68 1.01 .001
Quadratic time slope −1.27 0.08 −1.43 −1.11 .001 −1.12 0.18 −1.46 −0.77 .001
 Linear age cohort on intercept 0.07 0.07 −0.07 0.22 .324 0.14 0.08 −0.01 0.29 .061
 Linear age cohort on linear time 0.33 0.14 0.05 0.60 .019 0.01 0.17 −0.32 0.35 .938
Linear age cohort on quadratic time −0.72 0.19 −1.08 −0.35 .001 −0.43 0.33 −1.08 0.22 .196
 Quadratic age cohort on intercept −0.11 0.15 −0.41 0.18 .455 0.33 0.20 −0.07 0.72 .110
SLI on intercept −0.24 0.03 −0.31 −0.18 .001 −0.31 0.04 −0.38 −0.24 .001
 SLI on linear time 0.04 0.05 −0.07 0.14 .475 0.06 0.05 −0.05 0.16 .304
SLI on quadratic time 0.31 0.11 0.10 0.52 .005 0.53 0.14 0.26 0.80 .001
NVIQ on intercept 0.01 0.00 0.00 0.01 .001 0.01 0.00 0.00 0.01 .001
 NVIQ on linear time 0.00 0.00 −0.01 0.00 .067 0.00 0.00 −0.01 0.00 .327
 NVIQ on quadratic time −0.01 0.01 −0.02 0.00 .059 −0.01 0.01 −0.03 0.00 .018
 M-Ed on intercept 0.03 0.01 0.00 0.05 .023 0.01 0.01 −0.01 0.04 .265
 M-Ed on linear time 0.02 0.02 −0.03 0.06 .467 −0.01 0.02 −0.05 0.04 .826
 M-Ed on quadratic time 0.02 0.04 −0.06 0.09 .663 0.01 0.05 −0.08 0.09 .891
 BvG on intercept −0.03 0.03 −0.09 0.04 .402 −0.04 0.03 −0.11 0.03 .265
 BvG on linear time 0.02 0.05 −0.07 0.11 .668 −0.07 0.05 −0.16 0.03 .192
 BvG on quadratic time 0.03 0.10 −0.17 0.22 .786 −0.04 0.14 −0.31 0.23 .777
Random effects: Variance components
L1 residual 0.08 0.00 0.07 0.08 .001 0.07 0.00 0.06 0.08 .001
L2 intercept 0.07 0.01 0.05 0.09 .001 0.08 0.01 0.06 0.09 .001
L2 linear time 0.05 0.01 0.03 0.08 .001 0.09 0.02 0.05 0.13 .001
L2 quadratic time 0.22 0.04 0.13 0.30 .001 0.29 0.07 0.16 0.42 .001
L2 intercept–linear time 0.02 0.01 0.00 0.04 .039 0.03 0.01 0.02 0.05 .001
L2 intercept–quadratic time −0.12 0.02 −0.16 −0.08 .001 −0.13 0.02 −0.18 −0.09 .001
 L2 linear–quadratic time −0.05 0.02 −0.08 −0.01 .011 −0.03 0.03 −0.08 0.02 .281

Note. The time-varying predictor of age (“time”) and effects of age cohort were log-transformed. The estimates for age cohort represent the cross-sectional effects of children's exact age at study entry, centered at log-age 8 years. The intercept for each GJ outcome reflects the expected A′ score for a 10-year-old girl, without SLI, who entered the study at 8 years of age with average-level performance on nonverbal IQ and whose mother has a high school education or GED. The Est for SLI reflects the difference in A′ score for SLI-affected compared to unaffected children. The Est for NVIQ reflects the increase in A′ score for each 1-unit increase in IQ score. The Est for M-Ed reflects the increase in A′ score for each 1-unit increase in maternal education. The Est for BvG reflects the difference in A′ score for boys when compared to girls. Bold values indicate p < .01. Covariances were indicated with dash (−). Est = estimate; SE = standard error; CI = confidence interval; LL = lower limit; UL = upper limit; SLI = specific language impairment (0 = unaffected); NVIQ = nonverbal IQ (centered at 100); M-Ed = maternal education (0 = high school/GED); BvG = boys versus girls (0 = girls); L1 = Level 1: variance over time and within persons; L2 = Level 2: variance across persons within families; GJ = grammaticality judgment; GED = general equivalency diploma.
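As a reading aid for the fixed effects, the sketch below combines the Double Negative intercept, time slopes, and SLI effects from Table 6 into a predicted value for the reference child described in the note (all other predictors at 0). Several points are assumptions: the natural log is used for age, the quadratic term is taken as time squared, the coefficients are the two-decimal rounded values from the table, and the result is on the latent (uncensored) Tobit scale, so it will not reproduce the plotted predictions exactly. The function and dictionary names are hypothetical.

```python
import math

# Rounded Double Negative estimates copied from Table 6.
DOUBLE_NEGATIVE = {
    "intercept": 0.94, "linear": 0.69, "quadratic": -0.05,
    "sli_intercept": -0.24, "sli_linear": -0.30, "sli_quadratic": 0.12,
}


def latent_prediction(age_years: float, sli: int, coef: dict) -> float:
    """Latent-scale prediction with cohort and covariates held at 0."""
    if sli not in (0, 1):
        raise ValueError("sli must be 0 (unaffected) or 1 (affected)")
    time = math.log(age_years) - math.log(10.0)  # ASSUMPTION: natural log
    base = coef["intercept"] + coef["linear"] * time + coef["quadratic"] * time ** 2
    gap = (coef["sli_intercept"] + coef["sli_linear"] * time
           + coef["sli_quadratic"] * time ** 2)
    return base + sli * gap


def sli_gap(age_years: float, coef: dict) -> float:
    """Predicted SLI-minus-unaffected difference at a given age."""
    return latent_prediction(age_years, 1, coef) - latent_prediction(age_years, 0, coef)


if __name__ == "__main__":
    for age in (10.0, 14.0, 18.0):
        print(age, round(sli_gap(age, DOUBLE_NEGATIVE), 3))
```

At age 10 years the time terms are 0, so the gap equals the SLI effect on the intercept; at other ages the SLI effects on the linear and quadratic slopes also contribute.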

Figure 3.

Four line graphs with error bars, each plotting A′ (0.0 to 1.0) against age (5 to 18 years). Graph 1, Double Affirmative: the unaffected line begins at (5, 0.33), rises, and ends at (18, 0.9); the SLI-affected line begins at (5, 0.42), rises, and ends at (18, 0.85). Graph 2, Double Negative: the unaffected line begins at (5, 0.42), rises, and ends at (18, 0.9); the SLI-affected line begins at (5, 0.45), rises, and ends at (18, 0.75). Graph 3, Bad Agreement: the unaffected line begins at (5, 0.32), rises, remains constant, and ends at (18, 0.95); the SLI-affected line begins at (5, 0.2), rises, remains constant, and ends at (18, 0.95). Graph 4, Be Do Form Class: the unaffected line begins at (5, 0.31), rises, remains constant, and ends at (18, 0.95); the SLI-affected line begins at (5, 0.21), rises, and ends at (18, 0.95).

Effects of SLI affectedness on the four grammatical judgment A′ outcomes in conditional models with covariates: predicted means and standard errors. SLI = specific language impairment.

Table 7.

Partial Cohen's d and partial eta (η) effect sizes for fixed effects in the conditional growth models.

Columns, left to right: parameter; Double Affirmative (Total R2 = .365): d, η; Double Negative (Total R2 = .201): d, η; Bad Agreement (Total R2 = .320): d, η; Be Do Form Class (Total R2 = .359): d, η.
Linear time slope 1.17 .51 0.95 .43 1.05 .46 0.92 .42
Quadratic time slope −0.03 −.02 −0.03 −.01 −1.40 −.57 −0.58 −.28
Linear age cohort on intercept 0.00 .00 −0.04 −.02 0.09 .04 0.17 .08
Linear age cohort on linear time 0.01 .00 0.05 .03 0.21 .11 0.01 .00
Linear age cohort on quadratic time −0.40 −.20 −0.01 .00 −0.35 −.17 −0.12 −.06
Quadratic age cohort on intercept −0.01 .00 0.05 .02 −0.07 −.03 0.14 .07
SLI on intercept −0.99 −.44 −0.81 −.38 −0.66 −.31 −0.80 −.37
SLI on linear time −0.44 −.21 −0.54 −.26 0.06 .03 0.09 .05
SLI on quadratic time 0.33 .16 0.10 .05 0.26 .13 0.35 .17
NVIQ on intercept 0.18 .09 0.09 .05 0.45 .22 0.27 .13
NVIQ on linear time 0.14 .07 0.00 .00 −0.18 −.09 −0.09 −.05
NVIQ on quadratic time −0.07 −.03 0.04 .02 −0.16 −.08 −0.21 −.10
M-Ed on intercept 0.21 .10 0.18 .09 0.20 .10 0.10 .05
M-Ed on linear time 0.07 .04 0.09 .05 0.06 .03 −0.02 −.01
M-Ed on quadratic time −0.15 −.07 −0.13 −.06 0.04 .02 0.01 .01
BvG on intercept −0.18 −.09 −0.25 −.12 −0.08 −.04 −0.10 −.05
BvG on linear time 0.05 .03 −0.13 −.06 0.04 .02 −0.12 −.06
BvG on quadratic time 0.01 .00 0.05 .03 0.02 .01 −0.03 −.01

Note. Effect sizes shown in bold indicate significant fixed effects at p < .01 in the conditional growth models. “Partial” refers to the unique contribution of each predictor after controlling for its overlap with other predictors in the model. SLI = specific language impairment: unaffected (coded as 0) versus SLI affected (coded as 1); NVIQ = nonverbal IQ; M-Ed = maternal education; BvG = boys versus girls: girls (coded as 0) versus boys (coded as 1).
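
The paired d and η columns in Table 7 can be cross-checked against each other. The sketch below assumes the standard conversion between a standardized mean difference and a correlation-type effect size, η = d / √(d² + 4). This formula is not stated in the text here, so it should be read as an assumption rather than as the authors' computation; the function name `d_to_eta` and the selection of rows are illustrative. Because the tabled d values are rounded to two decimals, agreement is checked only to within ±.01.

```python
import math


def d_to_eta(d: float) -> float:
    """Convert a standardized mean difference d to a correlation-type effect size.

    Assumed conversion (not stated in the source): eta = d / sqrt(d^2 + 4).
    """
    return d / math.sqrt(d * d + 4.0)


# (label, d, eta) triples copied from Table 7; the choice of rows is arbitrary.
TABLE7_ROWS = [
    ("Linear time slope, Double Affirmative", 1.17, 0.51),
    ("Quadratic time slope, Bad Agreement", -1.40, -0.57),
    ("SLI on intercept, Double Affirmative", -0.99, -0.44),
    ("NVIQ on intercept, Bad Agreement", 0.45, 0.22),
    ("NVIQ on intercept, Be Do Form Class", 0.27, 0.13),
]

# Largest gap between the converted d and the tabled eta across the rows.
max_abs_diff = max(abs(d_to_eta(d) - eta) for _, d, eta in TABLE7_ROWS)

if __name__ == "__main__":
    for label, d, eta in TABLE7_ROWS:
        print(f"{label}: d = {d:+.2f}, tabled eta = {eta:+.2f}, "
              f"converted = {d_to_eta(d):+.3f}")
    print(f"max |converted - tabled| = {max_abs_diff:.3f}")
```

For the rows listed, the converted values agree with the tabled η to within rounding, which suggests (but does not prove) that the two columns are related in this way.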

Regarding individual covariate effects, initial child nonverbal IQ (see Table 6) was a significant predictor of A′ at the intercept for the Bad Agreement and Be Do Form Class outcomes. Growth trajectories for three levels of nonverbal IQ (−1 SD = 85, average = 100, +1 SD = 115) are shown in Figure 4 for all outcomes. At log-age 10 years, each 1-unit increase in nonverbal IQ was associated with a significant but small increase in A′ of 0.01 for Bad Agreement and Be Do Form Class; effects of nonverbal IQ on the linear and quadratic log-age slopes were not significant. Effect sizes for the main effects of initial nonverbal IQ on Bad Agreement and Be Do Form Class were η = .22 and η = .13, respectively (see Table 7). Maternal education (see Table 6) was not a significant predictor for any of the four A′ outcomes. Child sex (see Table 6) was a significant predictor of the Double Negative outcome: boys performed lower than girls by .07 at age 10 years, and this effect was consistent across time (i.e., nonsignificant interactions with linear and quadratic time). The effect size for the difference between boys and girls on Double Negative A′ was d = −0.25 (see Table 7). Overall, as predicted, these variables had limited predictive power. After adding all the covariates and their interactions with time, the total R² increased by less than 1% for these GJ Tag outcomes.

Figure 4.

Four line graphs with error bars. Graph 1 of age ranges from 5 to 18 versus double affirmative A prime ranges from 0.0 to 1.0. The lines of NVIQ equals 85, NVIQ equals 100, and NVIQ equals 115 begin at age 5 between 0.2 and 0.5, rise, and end at age 18 between 0.9 and 1.0. Graph 2 of age ranges from 5 to 18 versus double negative A prime ranges from 0.0 to 1.0. The lines of NVIQ equals 85, NVIQ equals 100, and NVIQ equals 115 begin at age 5 between 0.3 and 0.6, rise, and end at age 18 between 0.85 and 1.0. Graph 3 of age ranges from 5 to 18 versus bad agreement A prime ranges from 0.0 to 1.0. The lines of NVIQ equals 85, NVIQ equals 100, and NVIQ equals 115 begin at age 5 between 0.1 and 0.5, rise, and end at age 18 between 0.9 and 1.0. Graph 4 of age ranges from 5 to 18 versus be do form class A prime ranges from 0.0 to 1.0. The lines of NVIQ equals 85, NVIQ equals 100, and NVIQ equals 115 begin at age 5 between 0.2 and 0.5, rise, and end at age 18 between 0.9 and 1.0.

Effects of nonverbal IQ on the four grammatical judgment A′ outcomes in conditional models with covariates: predicted means and standard errors. Lines in each figure represent the expected A′ scores for three levels of nonverbal IQ: −1 SD (IQ = 85), average IQ (IQ = 100), and + 1 SD (IQ = 115). NVIQ = nonverbal IQ.

Question 4: Do the Effects of SLI on the Outcome Trajectories Differ by Levels of the Covariates?

Interactions between SLI and all covariates, including interactions with linear and quadratic time, were then added to the final models to determine whether any of the covariates moderated the effects of SLI on the A′ outcomes. The results from these models are shown in Table 8, and effect sizes are shown in Table 9. The addition of these interactions accounted for a minimal amount of additional variance (≤ 0.40%; see Table 4, Model 6). None of the interactions between SLI and the covariates was significant at the p < .01 criterion. Thus, the previously discussed significant effects of SLI were not moderated by the covariates of child nonverbal IQ, maternal education, or child sex across the four GJ Tag outcomes.

Table 8.

Conditional growth model parameters including interactions between specific language impairment, covariates, and time.

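Outcomes (left to right): Double Affirmative (Total R² = .366), Double Negative (Total R² = .205)
Columns per outcome: Est, SE, 95% CI lower limit (LL), 95% CI upper limit (UL), p <
Parameter Est SE LL UL p < Est SE LL UL p <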
Fixed effects
Intercept (age 10 years) 0.82 0.03 0.77 0.87 .001 0.92 0.04 0.85 0.99 .001
Linear time slope 0.71 0.06 0.59 0.83 .001 0.62 0.07 0.47 0.76 .001
 Quadratic time slope 0.03 0.13 −0.22 0.28 .808 −0.20 0.18 −0.56 0.15 .255
 Linear age cohort on intercept 0.00 0.04 −0.09 0.09 .989 −0.03 0.06 −0.14 0.09 .639
 Linear age cohort on linear time 0.00 0.13 −0.25 0.26 .980 0.09 0.16 −0.22 0.40 .572
Linear age cohort on quadratic time −0.79 0.18 −1.15 −0.43 .001 0.00 0.25 −0.48 0.48 .997
 Quadratic age cohort on intercept 0.01 0.12 −0.24 0.25 .959 0.11 0.16 −0.21 0.43 .508
SLI on intercept −0.17 0.03 −0.23 −0.11 .001 −0.25 0.04 −0.33 −0.16 .001
SLI on linear time −0.22 0.08 −0.39 −0.06 .008 −0.20 0.09 −0.37 −0.03 .020
SLI on quadratic time 0.22 0.14 −0.06 0.49 .118 0.52 0.19 0.14 0.90 .008
 NVIQ on intercept 0.00 0.00 0.00 0.01 .202 0.00 0.00 0.00 0.01 .256
 NVIQ on linear time 0.00 0.00 0.00 0.01 .606 0.00 0.00 −0.01 0.01 .788
 NVIQ on quadratic time −0.01 0.01 −0.02 0.00 .211 −0.01 0.01 −0.02 0.00 .128
 NVIQ × SLI on intercept 0.00 0.00 0.00 0.00 .941 0.00 0.00 −0.01 0.00 .220
 NVIQ × SLI on linear time 0.01 0.00 0.00 0.01 .148 0.00 0.00 −0.01 0.01 .764
NVIQ × SLI on quadratic time 0.01 0.01 −0.01 0.02 .209 0.02 0.01 0.01 0.04 .011
 M-Ed on intercept 0.03 0.01 0.01 0.06 .011 0.03 0.02 0.00 0.06 .026
 M-Ed on linear time 0.02 0.02 −0.02 0.07 .320 0.06 0.03 0.01 0.11 .031
 M-Ed on quadratic time −0.06 0.05 −0.16 0.03 .175 0.00 0.05 −0.11 0.10 .980
 M-Ed × SLI on intercept −0.02 0.01 −0.05 0.00 .093 −0.02 0.02 −0.06 0.01 .176
 M-Ed × SLI on linear time −0.01 0.03 −0.08 0.05 .711 −0.06 0.03 −0.13 0.00 .055
 M-Ed × SLI on quadratic time 0.02 0.06 −0.09 0.13 .731 −0.11 0.07 −0.25 0.03 .125
 BvG on intercept −0.03 0.03 −0.09 0.03 .305 −0.11 0.04 −0.19 −0.02 .015
 BvG on linear time 0.00 0.06 −0.11 0.12 .995 −0.06 0.08 −0.21 0.09 .446
 BvG on quadratic time −0.05 0.13 −0.30 0.21 .732 0.28 0.16 −0.03 0.60 .074
 BvG × SLI on intercept −0.01 0.04 −0.08 0.07 .834 0.07 0.05 −0.03 0.17 .169
 BvG × SLI on linear time 0.05 0.09 −0.13 0.22 .609 −0.01 0.11 −0.23 0.20 .912
 BvG × SLI on quadratic time 0.09 0.16 −0.22 0.41 .566 −0.47 0.22 −0.90 −0.04 .032
Random effects: variance components
L1 residual 0.04 0.00 0.04 0.04 .001 0.07 0.00 0.06 0.07 .001
L2 intercept 0.03 0.00 0.02 0.03 .001 0.05 0.01 0.04 0.06 .001
L2 linear time 0.10 0.01 0.07 0.12 .001 0.12 0.02 0.09 0.16 .001
L2 quadratic time 0.19 0.05 0.10 0.28 .001 0.23 0.07 0.09 0.36 .001
L2 intercept–linear time 0.02 0.00 0.02 0.03 .001 0.05 0.01 0.03 0.06 .001
L2 intercept–quadratic time −0.05 0.01 −0.07 −0.03 .001 −0.05 0.02 −0.08 −0.02 .001
 L2 linear–quadratic time 0.02 0.02 −0.02 0.05 .374 0.04 0.03 −0.02 0.10 .225
Outcomes (left to right): Bad Agreement (Total R² = .317), Be Do Form Class (Total R² = .360)
Columns per outcome: Est, SE, 95% CI lower limit (LL), 95% CI upper limit (UL), p <
Parameter Est SE LL UL p < Est SE LL UL p <
Fixed effects
Intercept (age 10 years) 1.45 0.04 1.38 1.53 .001 1.45 0.05 1.36 1.54 .001
Linear time slope 0.79 0.08 0.63 0.95 .001 0.82 0.09 0.65 0.99 .001
Quadratic time slope −1.31 0.10 −1.51 −1.10 .001 −1.21 0.18 −1.56 −0.85 .001
 Linear age cohort on intercept 0.07 0.07 −0.07 0.22 .335 0.14 0.07 −0.01 0.28 .068
 Linear age cohort on linear time 0.34 0.15 0.05 0.63 .020 0.04 0.17 −0.29 0.38 .812
Linear age cohort on quadratic time −0.73 0.19 −1.10 −0.36 .001 −0.44 0.32 −1.07 0.19 .173
 Quadratic age cohort on intercept −0.13 0.15 −0.43 0.16 .377 0.30 0.20 −0.10 0.70 .145
SLI on intercept −0.26 0.05 −0.35 −0.16 .001 −0.32 0.06 −0.43 −0.21 .001
 SLI on linear time 0.00 0.09 −0.17 0.17 .968 0.11 0.10 −0.08 0.31 .248
SLI on quadratic time 0.38 0.17 0.06 0.71 .021 0.67 0.22 0.24 1.10 .002
NVIQ on intercept 0.01 0.00 0.00 0.01 .002 0.01 0.00 0.00 0.01 .038
 NVIQ on linear time −0.01 0.00 −0.01 0.00 .084 0.00 0.00 −0.01 0.00 .195
 NVIQ on quadratic time −0.02 0.01 −0.03 0.00 .022 −0.02 0.01 −0.03 0.00 .022
 NVIQ × SLI on intercept 0.00 0.00 −0.01 0.00 .675 0.00 0.00 −0.01 0.01 .695
 NVIQ × SLI on linear time 0.00 0.01 −0.01 0.01 .574 0.00 0.01 −0.01 0.01 .492
 NVIQ × SLI on quadratic time 0.01 0.01 −0.01 0.03 .270 0.01 0.01 −0.01 0.03 .462
 M-Ed on intercept 0.03 0.02 0.00 0.06 .049 0.01 0.02 −0.02 0.05 .484
 M-Ed on linear time 0.01 0.03 −0.05 0.07 .760 0.01 0.03 −0.05 0.07 .687
 M-Ed on quadratic time 0.03 0.05 −0.06 0.12 .457 0.07 0.06 −0.05 0.19 .262
 M-Ed × SLI on intercept −0.01 0.02 −0.05 0.04 .766 0.00 0.03 −0.05 0.05 .915
 M-Ed × SLI on linear time 0.01 0.04 −0.06 0.09 .728 −0.03 0.04 −0.10 0.05 .508
 M-Ed × SLI on quadratic time −0.06 0.07 −0.20 0.08 .413 −0.12 0.08 −0.28 0.04 .135
 BvG on intercept −0.06 0.05 −0.14 0.03 .225 −0.04 0.05 −0.14 0.05 .377
 BvG on linear time 0.00 0.08 −0.15 0.15 .981 −0.05 0.08 −0.20 0.10 .527
 BvG on quadratic time 0.05 0.14 −0.23 0.33 .734 −0.06 0.17 −0.40 0.29 .747
 BvG × SLI on intercept 0.04 0.06 −0.07 0.16 .440 0.01 0.06 −0.12 0.13 .891
 BvG × SLI on linear time 0.03 0.10 −0.16 0.23 .751 −0.05 0.11 −0.25 0.16 .676
 BvG × SLI on quadratic time 0.01 0.19 −0.37 0.38 .980 0.04 0.25 −0.45 0.52 .882
Random effects: variance components
L1 residual 0.08 0.00 0.07 0.08 .001 0.07 0.00 0.06 0.08 .001
L2 intercept 0.07 0.01 0.05 0.09 .001 0.08 0.01 0.06 0.09 .001
L2 linear time 0.05 0.01 0.03 0.08 .001 0.09 0.02 0.05 0.13 .001
L2 quadratic time 0.21 0.04 0.13 0.30 .001 0.27 0.06 0.15 0.39 .001
L2 intercept–linear time 0.02 0.01 0.00 0.04 .047 0.03 0.01 0.02 0.05 .001
L2 intercept–quadratic time −0.12 0.02 −0.16 −0.09 .001 −0.13 0.02 −0.17 −0.09 .001
L2 linear–quadratic time −0.04 0.02 −0.08 −0.01 .008 −0.03 0.02 −0.07 0.02 .263

Note. The time-varying predictor of age (“time”) and effects of age cohort were log-transformed. The estimates for age cohort represent the cross-sectional effects of children's exact age at study entry, centered at log-age 8 years. The intercept for each GJ outcome reflects the expected A′ score for a 10-year-old girl, without SLI, who entered the study at 8 years of age with average-level performance on nonverbal IQ and whose mother has a high school education or GED. The Est for SLI reflects the difference in A′ score for SLI-affected compared to unaffected children. The Est for NVIQ reflects the increase in A′ score for each 1-unit increase in IQ score. The Est for M-Ed reflects the increase in A′ score for each 1-unit increase in maternal education. The Est for BvG reflects the difference in A′ score for boys when compared to girls. Bold values indicate p < .01. Covariances are indicated with dash (−). Est = estimate; SE = standard error; CI = confidence interval; LL = lower limit; UL = upper limit; SLI = specific language impairment (0 = unaffected); NVIQ = nonverbal IQ (centered at 100); M-Ed = maternal education (0 = high school/GED); BvG = boys versus girls (0 = girls); L1 = Level 1: variance over time and within persons; L2 = Level 2: variance across persons within families; GJ = grammaticality judgment; GED = general equivalency diploma.
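
To illustrate how the fixed effects in Table 8 combine, the sketch below computes the model-implied difference between SLI-affected and unaffected children for Double Affirmative at several ages. Covariates are held at their reference values (NVIQ = 100, maternal education = high school/GED, girl), so the SLI × covariate terms drop out. It assumes that “time” is the natural log of age centered at 10 years, that is, t = ln(age) − ln(10); the note says only that age was log-transformed, so the log base and the helper names are assumptions. The result is on the model's estimated scale and is not bounded to the observed 0 to 1 range of A′.

```python
import math

# Table 8, Double Affirmative: SLI effects on intercept, linear time, quadratic time.
SLI_INTERCEPT = -0.17
SLI_LINEAR = -0.22
SLI_QUADRATIC = 0.22


def centered_log_age(age_years: float) -> float:
    """Time metric assumed here: natural log of age, centered at 10 years."""
    return math.log(age_years) - math.log(10.0)


def sli_gap(age_years: float) -> float:
    """Model-implied SLI minus unaffected difference at a given age.

    Covariates are at their reference values, so only the three SLI terms apply.
    """
    t = centered_log_age(age_years)
    return SLI_INTERCEPT + SLI_LINEAR * t + SLI_QUADRATIC * t * t


if __name__ == "__main__":
    for age in (5, 10, 14, 18):
        print(f"age {age:2d}: implied SLI difference = {sli_gap(age):+.3f}")
```

At age 10 the time terms vanish, so the difference equals the SLI-on-intercept estimate (−0.17), which matches the interpretation of the intercept-level effects given in the note.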

Table 9.

Partial Cohen's d and partial eta (η) effect sizes for fixed effects in conditional growth models including interactions between specific language impairment, covariates, and time.

Outcomes (left to right): Double Affirmative (Total R² = .366), Double Negative (Total R² = .205), Bad Agreement (Total R² = .317), Be Do Form Class (Total R² = .360)
Parameter d η d η d η d η
Linear time slope 1.04 .46 0.76 .36 0.88 .40 0.85 .39
Quadratic time slope 0.02 .01 −0.10 −.05 −1.16 −.50 −0.60 −.29
Linear age cohort on intercept 0.00 .00 −0.04 −.02 0.09 .04 0.17 .08
Linear age cohort on linear time 0.00 .00 0.05 .03 0.21 .11 0.02 .01
Linear age cohort on quadratic time −0.39 −.19 0.00 .00 −0.35 −.17 −0.12 −.06
Quadratic age cohort on intercept 0.00 .00 0.06 .03 −0.08 −.04 0.13 .07
SLI on intercept −0.52 −.25 −0.51 −.25 −0.48 −.23 −0.53 −.26
SLI on linear time −0.24 −.12 −0.21 −.11 0.00 .00 0.10 .05
SLI on quadratic time 0.14 .07 0.24 .12 0.21 .10 0.28 .14
NVIQ on intercept 0.18 .09 0.09 .05 0.27 .14 0.23 .11
NVIQ on linear time 0.05 .02 −0.03 −.02 −0.18 −.09 −0.12 −.06
NVIQ on quadratic time −0.11 −.05 −0.14 −.07 −0.20 −.10 −0.20 −.10
NVIQ × SLI on intercept 0.00 .00 −0.14 −.07 −0.03 −.02 0.03 .02
NVIQ × SLI on linear time 0.15 .08 0.02 .01 0.05 .03 0.05 .03
NVIQ × SLI on quadratic time 0.12 .06 0.23 .12 0.10 .05 0.07 .03
M-Ed on intercept 0.23 .11 0.20 .10 0.18 .09 0.07 .03
M-Ed on linear time 0.09 .05 0.20 .10 0.03 .01 0.04 .02
M-Ed on quadratic time −0.12 −.06 0.00 .00 0.07 .03 0.10 .05
M-Ed × SLI on intercept −0.15 −.07 −0.12 −.06 −0.03 −.01 0.01 .01
M-Ed × SLI on linear time −0.03 −.02 −0.18 −.09 0.03 .02 −0.06 −.03
M-Ed × SLI on quadratic time 0.03 .02 −0.14 −.07 −0.07 −.04 −0.14 −.07
BvG on intercept −0.09 −.05 −0.22 −.11 −0.11 −.06 −0.08 −.04
BvG on linear time 0.00 .00 −0.07 −.03 0.00 .00 −0.06 −.03
BvG on quadratic time −0.03 −.02 0.16 .08 0.03 .02 −0.03 −.01
BvG × SLI on intercept −0.02 −.01 0.13 .06 0.07 .03 0.01 .01
BvG × SLI on linear time 0.05 .02 −0.01 −.01 0.03 .01 −0.04 −.02
BvG × SLI on quadratic time 0.05 .03 −0.19 −.10 0.00 .00 0.01 .01

Note. Effect sizes shown in bold indicate significant fixed effects at p < .01 in the final conditional growth models. “Partial” refers to the unique contribution of each predictor after controlling for its overlap with other predictors in the model. SLI = specific language impairment: unaffected (coded as 0) versus SLI affected (coded as 1); NVIQ = nonverbal IQ; M-Ed = maternal education; BvG = boys versus girls: girls (coded as 0) versus boys (coded as 1).
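
The change in total R² associated with the SLI × covariate interactions can be read off by comparing the Total R² values in the headers of Table 7 (models without these interactions) and Table 9 (models with them). The sketch below performs that subtraction. The variable and function names are illustrative, and it assumes the two sets of Total R² values are directly comparable, which the text does not state explicitly.

```python
# Total R² by outcome, copied from the headers of Table 7 and Table 9.
R2_WITHOUT_INTERACTIONS = {
    "Double Affirmative": 0.365,
    "Double Negative": 0.201,
    "Bad Agreement": 0.320,
    "Be Do Form Class": 0.359,
}
R2_WITH_INTERACTIONS = {
    "Double Affirmative": 0.366,
    "Double Negative": 0.205,
    "Bad Agreement": 0.317,
    "Be Do Form Class": 0.360,
}


def r2_change_points(outcome: str) -> float:
    """Change in total R², in percentage points, after adding the interactions."""
    delta = R2_WITH_INTERACTIONS[outcome] - R2_WITHOUT_INTERACTIONS[outcome]
    # Round to two decimals to suppress floating-point noise in the subtraction.
    return round(100.0 * delta, 2)


R2_CHANGES = {name: r2_change_points(name) for name in R2_WITHOUT_INTERACTIONS}

if __name__ == "__main__":
    for name, change in R2_CHANGES.items():
        print(f"{name}: {change:+.1f} percentage points")
```

Under that assumption, the largest increase is 0.4 percentage points (Double Negative), in line with the minimal additional variance (≤ 0.40%) reported in the text; the Bad Agreement value is slightly lower in Table 9 than in Table 7.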

Discussion

This study extended the generalizations obtained in previous studies from our lab, which followed the same two-group (children with and without SLI), long-term longitudinal design, in this case across the entire school-age range of 5–18 years, to a new linguistic outcome: tag questions. Tag questions are potentially informative linguistic structures because they require coordination of finiteness marking across a base sentence and an add-on tag question, as in “She is looking for her mom, isn't she?” The task used in this study compared tag questions requiring coordination of negativity or “polarity,” such as “She is looking for her mom, isn't she?” and “You are not feeling better, aren't you?” with tag questions not structured with polarity coordination, such as “They miss their moms, don't they?” and “They miss their moms, doesn't they?” As far as we can tell, this is the first time these structures have been studied to test for possible differences among item structures in acquisition trajectories over the school-age span.

The key outcome was documentation of persistently lower performance by the SLI group relative to age peers over the school-age range on the experimental items assessing polarity, but not on the two comparison versions of tag questions, whose errors were more likely to be detected by all children in the study. Although children with SLI performed below their same-age peers at the intercept point of 10 years of age on the error types that did not include polarity, the difference was narrow: the observed A′ mean difference was .04 for Bad Agreement and .07 for Be Do Form Class. Furthermore, they caught up to their unaffected peers by 18 years of age.

For the polarity error types, on the other hand, the corresponding differences were .22 for Double Affirmative and .16 for Double Negative, and the SLI group scored consistently below their age peers throughout the 5- to 18-year age span. The persistent difficulty of the SLI group with polarity violations was not a general problem of comparing the tag with the base sentence, because the children showed that they can handle this requirement in the Bad Agreement and Be Do Form Class test items. Instead, a problem specific to polarity comparisons, as in the Double Affirmative and Double Negative items, remains unresolved into adolescence.

Thus, the polarity grammatical violation is the first identified candidate in our literature for a grammatical marker that might effectively identify children with SLI from school entry to completion of high school, although this possibility would need to be verified in further studies. Such a wide age span could heighten our awareness of how the likely avoidance of tag questions in conversation by children with SLI could influence how their peers and teachers perceive their social skills. For example, “You want a cookie, don't you?” could be perceived as more polite or considerate than “You want a cookie?” In turn, this awareness could help ensure that unidentified older children with SLI continue to receive services. Studies documenting the role of tag questions in adolescent conversations could reveal ways in which children with SLI are at a social disadvantage because of such subtle gaps in their linguistic options. Another outcome of clinical interest is that, across the full age range, the significant effects of SLI were not moderated by the covariates of child nonverbal IQ, maternal education, or child sex across the four GJ Tag outcomes, a finding consistent with our other studies of GJ task outcomes in children with or without SLI. This replicated finding implies that children's performance on grammatical judgment tasks, such as those marking finiteness or the required alignment of grammar and polarity across clauses in tag questions, is not likely to be aligned with levels of nonverbal IQ (in the performance range of the participants in our studies), maternal education, or child sex. Such grammar tasks are to some extent independent of children's general cognitive development within or above “normal” expectations, of the advantages of higher maternal education, and of differences between boys and girls. It remains to be seen if children with overlapping deficits in language acquisition and cognitive development could have a broader profile of difficulties with GJ tasks of the sort tested in this study.

Limitations and Future Directions

It is possible that receipt of speech and language services in the SLI-affected group during the time of the study could have influenced the linguistic outcomes for some children; recall that most of the children in the SLI group were referred by speech-language pathologists for consideration as participants in the study. However, from other data in our longitudinal database, we know that almost none of the children with SLI continued in speech-language services in the schools beyond the early elementary grades. Furthermore, recall that the SLI-affected group did not close the gap with age peers on the polarity items, indicating that even where such services were continued, they did not eliminate that gap.

Another possible limitation is the focus on monolingual speakers of English who use GAE as a dialect. Additional studies using the GJ Tag task are needed in diverse populations and across languages and dialects, following conventions developed for particular languages and dialects (see Oetting, 2020; Oetting & McDonald, 2002; Rice et al., 2020). As noted earlier, regions of the United States vary widely in proportions of households with multilingual experiences. It is important to adjust the methods of this study for future investigations that could validly generalize across languages and ethnicities. Further investigation of these tag questions is also needed for children with below-average nonverbal IQ, with or without language impairments and/or neurological disorders. As noted above, one early study investigated tag questions for children with early focal lesions (Weckerly et al., 2004).

Overall, this study is the first to document a persistent gap for children with SLI in their mastery of morphosyntax over the full school-age period of development, using the grammatical requirement of polarity coordination across elements of tag questions. The outcomes of this study addressed and answered the four questions posed for evaluation, identifying the polarity requirements of some kinds of tag questions as particularly and persistently problematic for children with SLI. This finding has immediate clinical relevance and supports the use of a potential linguistic screening task to meet the needs of unidentified children with SLI over the school-age range. Research on children's understanding of tag questions has been sparse for some time, but the findings of this study suggest potential benefits from systematic studies of these linguistic structures across grammatical contrasts and ages of children. Furthermore, there is a need for formal studies of how best to teach such linguistic structures, and potentially related structures, to help children with SLI improve their performance on such commonplace conversational uses of language.

Data Availability Statement

The data sets generated and/or analyzed during this study are not publicly available due to pending arrangements for deposit in a public repository.

Acknowledgments

This study was supported by Grants R01DC001803 (August 1, 1993, to May 31, 2023) and T32DC000052 (July 1, 1995, to June 30, 2023; Rice, principal investigator for both awards). The authors acknowledge and thank the participants and their families who participated in the study, the trained examiners who collected the data, the day care providers and schools who supported onsite data collection, and the lab assistants who entered raw data into databases.

Footnote

1. It is important to note that although the sample sizes for the SLI-affected and unaffected groups are similar between this article and the recent GJ Complex article (Rice et al., 2023), the samples are not an exact match. The GJ Tag was incorporated in the longitudinal protocol approximately 4 years prior to the GJ Complex task; thus, there are 18 children (of 483) in the GJ Complex article who were not included in the present study and 46 children (of 511) in the present study who were not included in the GJ Complex article.

References

  1. Burgemeister, B. B., Blum, L. H., & Lorge, I. (1972). The Columbia Mental Maturity Scale. The Psychological Corporation.
  2. Dale, P. S., Rice, M. L., Rimfeld, K., & Hayiou-Thomas, M. E. (2018). Grammar clinical marker yields substantial heritability for language impairments in 16-year-old twins. Journal of Speech, Language, and Hearing Research, 61(1), 66–78. https://doi.org/10.1044/2017_JSLHR-L-16-0364
  3. Darlington, R. B., & Hayes, A. F. (2016). Regression analysis and linear models: Concepts, applications, and implementation. Guilford Press.
  4. Dennis, M., Sugar, J., & Whitaker, H. A. (1982). The acquisition of tag questions. Child Development, 53(5), 1254–1257. https://doi.org/10.2307/1129014
  5. Green, D. M. (1964). General prediction relating yes-no and forced choice results. The Journal of the Acoustical Society of America, 36(Suppl. 5), Article 1042. https://doi.org/10.1121/1.2143339
  6. Grier, J. B. (1971). Nonparametric indexes for sensitivity and bias: Computing formulas. Psychological Bulletin, 75(6), 424–429. https://doi.org/10.1037/h0031246
  7. Guasti, M. T. (2016). Language acquisition: The growth of grammar (2nd ed.). MIT Press.
  8. Hoffman, L. (2015). Longitudinal analysis: Modeling within-person fluctuation and change. Routledge. https://doi.org/10.4324/9781315744094
  9. Hresko, W. P., Reid, D. K., & Hammill, D. D. (1999). Test of Early Language Development–Third Edition. Pro-Ed.
  10. Linebarger, M., Schwartz, M., & Saffran, E. (1983). Sensitivity to grammatical structure in so-called agrammatic aphasics. Cognition, 13(3), 361–392. https://doi.org/10.1016/0010-0277(83)90015-X
  11. Long, S. J. (1997). Limited outcomes: The Tobit model. In Regression models for categorical and limited dependent variables (pp. 187–216). Sage.
  12. McGrath, C. O., & Kunze, L. H. (1973). Development of phrase structure rules involved in tag questions elicited from children. Journal of Speech and Hearing Research, 16(3), 498–512. https://doi.org/10.1044/jshr.1603.498
  13. Muthén, L. K., & Muthén, B. O. (1998–2017). Mplus user's guide (8th ed.). Muthén & Muthén.
  14. Newcomer, P. L., & Hammill, D. D. (1988). Test of Language Development–Primary: Second Edition. Pro-Ed.
  15. Oetting, J. B. (2020). From my perspective/opinion: General American English as a dialect: A call for change. The ASHA LeaderLive. https://leader.pubs.asha.org/do/10.1044/leader.FMP.25112020.12/full/
  16. Oetting, J. B., & McDonald, J. L. (2002). Methods for characterizing participants' nonmainstream dialect use in child language research. Journal of Speech, Language, and Hearing Research, 45(3), 505–518. https://doi.org/10.1044/1092-4388(2002/040)
  17. Pollock, J. (1989). Verb movement, universal grammar, and the structure of IP. Linguistic Inquiry, 20(3), 365–424.
  18. Quirk, R., Greenbaum, S., Leech, G., & Svartvik, J. (1985). A comprehensive grammar of the English language. Longman.
  19. Redmond, S. M. (2020). Clinical intersections among idiopathic language disorder, social (pragmatic) communication disorder, and attention-deficit/hyperactivity disorder. Journal of Speech, Language, and Hearing Research, 63(10), 3263–3276. https://doi.org/10.1044/2020_JSLHR-20-00050
  20. Rice, M. L., Earnest, K. K., & Hoffman, L. (2023). Longitudinal grammaticality judgments of tense marking in complex questions in children with and without specific language impairment, ages 5–18 years. Journal of Speech, Language, and Hearing Research, 66(10), 3882–3906. https://doi.org/10.1044/2023_JSLHR-22-00507
  21. Rice, M. L., & Hoffman, L. (2015). Predicting vocabulary growth in children with and without specific language impairment: A longitudinal study from 2;6 to 21 years of age. Journal of Speech, Language, and Hearing Research, 58(2), 345–359. https://doi.org/10.1044/2015_JSLHR-L-14-0150
  22. Rice, M. L., Hoffman, L., & Wexler, K. (2009). Judgments of omitted BE and DO in questions as extended finiteness clinical markers of specific language impairment (SLI) to 15 years: A study of growth and asymptote. Journal of Speech, Language, and Hearing Research, 52(6), 1417–1433. https://doi.org/10.1044/1092-4388(2009/08-0171)
  23. Rice, M. L., Smolik, F., Perpich, D., Thompson, T., Rytting, N., & Blossom, M. (2010). Mean length of utterance levels in 6-month intervals for children 3 to 9 years with and without language impairments. Journal of Speech, Language, and Hearing Research, 53(2), 333–349. https://doi.org/10.1044/1092-4388(2009/08-0183)
  24. Rice, M. L., Taylor, C. L., Zubrick, S. R., Hoffman, L., & Earnest, K. K. (2020). Heritability of specific language impairment and nonspecific language impairment at ages 4 and 6 years across phenotypes of speech, language, and nonverbal cognition. Journal of Speech, Language, and Hearing Research, 63(3), 793–813. https://doi.org/10.1044/2019_JSLHR-19-00012
  25. Rice, M. L., & Wexler, K. (1996a). A phenotype of specific language impairment: Extended optional infinitives. In M. L. Rice (Ed.), Toward a genetics of language (pp. 215–237). Erlbaum.
  26. Rice, M. L., & Wexler, K. (1996b). Toward tense as a clinical marker of specific language impairment in English-speaking children. Journal of Speech and Hearing Research, 39(6), 1239–1257. https://doi.org/10.1044/jshr.3906.1239
  27. Rice, M. L., & Wexler, K. (2001). Rice/Wexler Test of Early Grammatical Impairment. The Psychological Corporation.
  28. Rice, M. L., Wexler, K., & Cleave, P. L. (1995). Specific language impairment as a period of extended optional infinitive. Journal of Speech and Hearing Research, 38(4), 850–863. https://doi.org/10.1044/jshr.3804.850
  29. Rice, M. L., Wexler, K., & Redmond, S. M. (1999). Grammaticality judgments of an extended optional infinitive grammar: Evidence from English-speaking children with specific language impairment. Journal of Speech, Language, and Hearing Research, 42(4), 943–961. https://doi.org/10.1044/jslhr.4204.943
  30. Semel, E., Wiig, E., & Secord, W. (1995). Clinical Evaluation of Language Fundamentals–Third Edition. The Psychological Corporation.
  31. Seymour, H. N., Roeper, T. W., & de Villiers, J. (2003). Diagnostic Evaluation of Language Variation (DELV). The Psychological Corporation.
  32. Singer, J., & Willett, J. (2003). Applied longitudinal data analysis: Modeling change and event occurrence. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780195152968.001.0001
  33. StataCorp. (2021). Stata statistical software.
  34. Twisk, J., & Rijmen, F. (2009). Longitudinal Tobit regression: A new approach to analyze outcome variables with floor or ceiling effects. Journal of Clinical Epidemiology, 62(9), 953–958. https://doi.org/10.1016/j.jclinepi.2008.10.003
  35. Wang, L., Zhang, Z., McArdle, J. J., & Salthouse, T. A. (2009). Investigating ceiling effects in longitudinal data analysis. Multivariate Behavioral Research, 43(3), 476–496. https://doi.org/10.1080/00273170802285941
  36. Wechsler, D. (1991). Wechsler Intelligence Scale for Children–Third Edition. The Psychological Corporation.
  37. Wechsler, D. (1997). Wechsler Adult Intelligence Scale–Third Edition. The Psychological Corporation.
  38. Weckerly, J., Wulfeck, B., & Reilly, J. (2004). The development of morphosyntactic ability in atypical populations: The acquisition of tag questions in children with early focal lesions and children with specific-language impairment. Brain and Language, 88(2), 190–201. https://doi.org/10.1016/S0093-934X(03)00098-1
  39. Weeks, L. A. (1992). Preschoolers' production of tag questions and adherence to the polarity-contrast principle. Journal of Psycholinguistic Research, 21(1), 31–40. https://doi.org/10.1007/BF01068307
  40. Wexler, K. (1994). Finiteness and head movement in early child grammars. In D. Lightfoot & N. Hornstein (Eds.), Verb movement (pp. 305–350). Cambridge University Press. https://doi.org/10.1017/CBO9780511627705.016

Articles from Journal of Speech, Language, and Hearing Research : JSLHR are provided here courtesy of American Speech-Language-Hearing Association
