Abstract
Purpose
This study examined the efficacy of the Vocabulary Acquisition and Usage for Late Talkers (VAULT) treatment in a version that manipulated the length of clinician utterance in which a target word was presented (dose length). The study also explored ways to characterize treatment responders versus nonresponders.
Method
Nineteen primarily English-speaking late-talking toddlers (aged 24–34 months at treatment onset) received VAULT and were quasirandomly assigned to have target words presented in grammatical utterances matching one of two lengths: brief (four words or fewer) or extended (five words or more). Children were measured on their pre- and posttreatment production of (a) target and control words specific to treatment and (b) words not specific to treatment. Classification and Regression Tree (CART) analysis was used to classify responders versus nonresponders.
Results
VAULT was successful as a whole (i.e., treatment effect sizes of greater than 0), with no difference between the brief and extended conditions. Despite the overall significant treatment effect, the treatment was not successful for all participants. CART results (using participants from the current study and a previous iteration of VAULT) provided a dual-node decision tree for classifying treatment responders versus nonresponders.
Conclusions
The input-based VAULT treatment protocol is efficacious and offers some flexibility in terms of utterance length. When VAULT works, it works well. The CART decision tree uses pretreatment vocabulary levels and performance in the first two treatment sessions to provide clinicians with promising guidelines for who is likely to be a nonresponder and thus might need a modified treatment plan.
Supplemental Material
The current study explored the efficacy of an expressive vocabulary treatment protocol for late-talking toddlers when the length of clinician utterances was manipulated. The general protocol, Vocabulary Acquisition and Usage for Late Talkers (VAULT), is a one-on-one, in-person therapy that leverages principles of implicit statistical learning (i.e., learning that occurs without conscious awareness or effort; Plante & Gómez, 2018) in a therapeutic context. VAULT capitalizes on the principles of regularity and variability (Plante & Gómez, 2018) by using high dose input rates and cross-situational learning opportunities (i.e., different physical and linguistic contexts), which increase the saliency of target words in the input. The VAULT protocol's feasibility was explored in Alt et al. (2014), and its efficacy was demonstrated in Alt et al. (2020). In the current study, we expanded upon this previous work by continuing to explore the parameters of clinician input that may enhance the expressive vocabulary of late talkers. The primary goals of this study were (a) to compare the efficacy of two different doses, that is, whether late talkers made greater gains when target words were presented in brief (four words or fewer) or extended (five words or more) utterances and (b) to replicate Alt et al.'s (2020) findings regarding the efficacy of the general VAULT protocol. Our secondary goal was to explore which individual characteristics best classified participants as treatment responders versus nonresponders. Consolidated Standards of Reporting Trials for Social and Psychological Interventions (Grant et al., 2018) guidelines were used to report details of the current study (see Appendix A).
Late Talkers
Toddlers who have significantly smaller expressive vocabularies than their peers with typical development are known as late talkers. A 24-month-old late talker, for example, might have an expressive vocabulary of 50 words or fewer, not produce multiword utterances, or both (Capone Singleton, 2018; Rescorla, 1989). Late talkers have no frank neurological impairments, sensory or motor deficits (e.g., hearing loss), or other diagnoses (e.g., autism spectrum disorder) that might otherwise account for their language deficits (Capone Singleton, 2018). Some late talkers show persistent language difficulties and are later diagnosed with developmental language disorder (American Speech-Language-Hearing Association [ASHA], n.d.), while others develop language skills that fall in the average range but remain below the skills of peers with typical early language development (Rescorla, 2005, 2009). Regardless, it is important to attend to late talkers' communication needs. Not being able to communicate negatively impacts family relationships. Parents are faced with raising children with whom they cannot effectively talk, and young children frequently develop negative behaviors in lieu of oral communication. For example, late talkers tend to have more frequent and severe tantrums than other toddlers, causing significant disruptions in home and day care settings (Manning et al., 2019). In addition, the late onset of talking and slower rate of word learning are early risk factors for lifelong problems with language, both oral and written (e.g., Hammer et al., 2017). These family, behavioral, and language factors drive the need for early treatment.
Treatment Parameters
Although intervention is generally effective for late talkers (Cable & Domsch, 2011), the specific treatment parameters that improve outcomes, such as how frequently a treatment is administered, remain largely unknown. In light of calls for more systematic approaches to intervention research (e.g., Warren et al., 2007) and following the Template for Intervention Description and Replication checklist (Hoffmann et al., 2014; see Appendix B for an adapted version of the Template for Intervention Description and Replication checklist), Alt et al. (2020) began the systematic investigation of VAULT. Alt et al. identified multiple parameters that could affect treatment outcomes (e.g., dose rate or treatment context) and investigated the effects of two of these parameters: number of target words and number of doses per target word. The current study continued this investigation of treatment parameters by manipulating the dose, specifically, the length of the utterance in which a target word was presented.
Utterance Length. In VAULT, a single dose is defined as a clinician's verbal model of a target word in a grammatical utterance, but we do not know if the length of this utterance affects dose efficacy. In naturalistic settings, longer parental utterances have been positively associated with children's vocabulary outcomes (Baker et al., 2015; Hoff, 2003; Hoff & Naigles, 2002), but shortened parental input may have a positive impact as well (Brent & Siskind, 2001). Experimental evidence is similarly ambiguous. For example, in a single-subject case study, Wolfe and Heilmann (2010) manipulated input length in an intervention for a late talker; the toddler made gains in both length conditions. Of note, these conditions differed in length and grammatical complexity. In the shorter length condition, agrammatical utterances such as “Yes, look dog!” were used. However, length and grammatical complexity need not go hand in hand. The target word “dog,” for instance, can be used in an utterance such as “The dog barks,” which is of equal length to the shorter utterances in Wolfe and Heilmann, yet is grammatical.
To our knowledge, no intervention study has directly examined the effect of input utterance length on the expressive vocabulary of late talkers while controlling for confounds related to grammatical complexity. The current study aimed to fill this gap by comparing treatment conditions in which target words were presented to late talkers in grammatical utterances of either brief or extended length.
Brief utterances: Working memory capacity. Given that there is limited research on utterance length, it is important to consider reasons that length might be relevant to treatment outcomes. One reason that shorter utterances might result in better expressive vocabulary outcomes for late talkers is the role of phonological working memory.
Phonological working memory is a capacity-limited resource that allows for maintenance and manipulation of verbal information over a short period of time (Adams et al., 2018). It is associated with existing vocabulary knowledge in children as young as 2 years old (Newbury et al., 2015; Stokes & Klee, 2009b; Stokes et al., 2017) and may directly support learners in forming phonological representations of new words (Baddeley et al., 1998; Gathercole, 2006; Montgomery et al., 2010). Children with developmental language disorder often demonstrate deficits in working memory (Alt, 2011; Graf Estes et al., 2007; Kapa & Erikson, 2019; Montgomery et al., 2010) and word learning (Kan & Windsor, 2010). There is emerging evidence that late talkers also demonstrate phonological working memory deficits relative to peers with typical language development (Marini et al., 2017; Stokes & Klee, 2009a).
This work suggests that it may be important to account for phonological working memory in a vocabulary intervention. Specifically, there is a strong likelihood that children who require vocabulary intervention may have limited phonological working memory skills. One way to compensate for limited phonological working memory is to shorten the input that the child receives. Long utterances that exceed the capacity of a child's phonological working memory may be truncated during encoding, meaning that target words and other supporting linguistic information could be lost. We know that children with language impairment are more prone to interference when initially encoding words and tend to focus more on word-initial phonemes than on word-final phonemes (Alt & Suddarth, 2012). In practice, if a child with a limited phonological working memory system was exposed to input such as “The cat is chasing the ball under the bed,” they might encode only a fragment of the utterance (e.g., “the cat is” or “cat chase ball”). If the child's target word was “bed,” such input would likely be useless. In contrast, embedding target words in brief, grammatical utterances (e.g., “There's his bed”) could help support word learning because brief utterances are more likely to be fully maintained and processed in working memory. This could help learners, especially those with limited phonological working memory, process and retain target words.
Extended utterances: Linguistic cues and linguistic variability. Although extended utterances have the potential to exceed a late talker's working memory capacity, extended utterances could lead to greater vocabulary gains because they offer more linguistic cues and opportunities for linguistic variability. Linguistic cues are morphosyntactic, semantic, or prosodic features that provide information about a word (e.g., part of speech or animacy). They may benefit learners by guiding their attention to the salient components of the input. For example, morphosyntactic cues have been found to direct toddlers' eye gaze toward target stimuli (Paquette-Smith & Johnson, 2016). Previous research suggests that linguistic cues aid vocabulary development in children with typical language development (Arnon & Clark, 2011; Bloom & Kelemen, 1995; Brady & Goodman, 2014; Ferguson et al., 2018; Gelman & Raman, 2003; Naigles, 1990; Rice et al., 2000). The presence of multiple linguistic cues may specifically support toddlers' vocabulary development (Kouider et al., 2006). Some evidence suggests that children with language-learning difficulties can also benefit from linguistic cues to support lexical acquisition, although perhaps to a lesser extent than their typically developing peers (Rice et al., 2000). Extended utterances could therefore benefit late talkers because they allow more linguistic cues to be included in the input.
Additionally, extended utterances provide opportunities for increased linguistic variability. They can contain a greater number of linguistic elements (e.g., adjectives, adverbs, or dependent clauses). They also allow target words to occur in linguistic contexts with more variability in sentence structure and adjacent words. Increasing the linguistic variability around a target supports learning by increasing the target's regularity and saliency, supporting both the regularity and variability principles of implicit learning (Plante & Gómez, 2018). Implicit or statistical learning can be enhanced in individuals with language delays or disorders by increasing the linguistic variability around a target (Alt et al., 2014; Plante et al., 2014; Torkildsen et al., 2013). Because extended utterances provide more opportunities to increase the linguistic variability around target words, they could result in enhanced vocabulary gains for late talkers.
Predicting Who Will Respond to Treatment
Based on the preceding evidence, there are theoretical reasons that both conditions could benefit learners. That said, given the variability of human beings, every clinical protocol includes responders (those who achieve the expected treatment outcomes) and nonresponders (those who do not respond to the treatment). For example, in Alt et al. (2020), we had an overall positive treatment effect, but about one third of the children were considered nonresponders. This is consistent with other language treatment work. For example, Plante et al.'s (2014) high-variability condition resulted in improved morphosyntactic production for preschool children with developmental language disorder. However, of the nine children in the high-variability group, three (one third) did not have higher accuracy on their target morphemes posttreatment. Response may also differ by situation. Peters-Sanders et al. (2020) found that their vocabulary intervention for preschoolers worked for all 17 participants but only in roughly 75% of the situations (i.e., specific books, specific vocabulary), meaning that children did not respond to the treatment in about 25% of the learning contexts.
While it is excellent to discover that a treatment works for the majority of children or circumstances, it would be ideal to identify children for whom a given treatment is not likely to work. There is some evidence that points to which late-talking toddlers may go on to be diagnosed with a language disorder, for example, presence of issues in areas other than language (Schachinger-Lorentzon et al., 2018), demographic risks such as lower socioeconomic status (Rescorla, 2011), or expressive vocabulary size (Fisher, 2017). However, to our knowledge, there is no information available to help determine which late talkers will respond positively to treatment. In order to analyze treatment responses, one needs a sufficiently large sample size. We planned to analyze this question if it was statistically appropriate to do so.
The Current Study
The VAULT protocol uses focused stimulation as the dose form. We maintained the same total dose number (270) and rate (nine doses per minute) found to be effective in the previous VAULT efficacy study (Alt et al., 2020). However, we changed the number of targets to 4, based on parent and clinician feedback (see Method for justification). In this protocol, a single dose is a spoken model of a target word embedded in a grammatical utterance, and we varied the dose in terms of the utterance length. Our research questions were as follows:
Do late talkers make more expressive vocabulary gains when doses are presented in brief (four words or fewer) or extended (five words or more) utterances?
Is the current version of VAULT efficacious?
Can we identify which, if any, individual toddler characteristics best classify who will respond to VAULT?
We predicted that both treatment conditions would produce positive treatment outcomes based on the mixed findings in the literature and that this version of VAULT would be an efficacious treatment. We were unsure if we would be able to classify responders versus nonresponders but thought this was an important question to explore because it could yield useful information for clinicians and families seeking effective treatment.
Method
Participants
We recruited participants using a flyer approved by the University of Arizona's Institutional Review Board. We distributed the flyer to local pediatricians, libraries, and nonprofits serving families of young children (e.g., the “Parents as Teachers” parent training program at Casa de los Niños and Easterseals Blake Foundation). We also recruited participants online (using the institutional review board–approved language, a PDF of the flyer, or both) through local e-newsletters targeting Tucson-area parents. The study occurred from spring 2018 to spring 2020, including the period of initial recruitment and final participant follow-up.
Study participants were late-talking toddlers, aged 24–34 months at the start of treatment. A total of 38 children were consented to participate in the study. Of these, 23 were allocated to one of two treatment conditions. However, we will discuss the study in terms of the 19 children whose data we ultimately analyzed. (See Appendix C for a modified Consolidated Standards of Reporting Trials flowchart with information regarding the path from consent to data analysis.) Children could not have other diagnoses (e.g., autism spectrum disorder or hearing impairment) or receive other speech-language treatment during the study. To participate, families had to be able and willing to bring their children to assessment and treatment sessions and regularly complete paperwork (e.g., questionnaires). Per parent report, all participants were from primarily English-speaking households, with 10 participants receiving minimal exposure to an additional language in the home (e.g., teaching colors or numbers in Spanish). When we determined that a new participant met eligibility criteria, a senior lab member quasirandomly assigned them to one of two treatment conditions. (We did not use true randomization because, with our sample size, there would be a high probability of obtaining unequal groups, making meaningful comparisons impossible.) First, a senior lab member checked to see if the new participant matched an existing participant in terms of (a) sex and (b) age (within 3 months). If there was a match, the new participant was assigned to the condition opposite that of their matched participant, resulting in equal allocation of participants to each condition. If the new participant did not match an existing participant, they were randomly assigned to one of the two treatment conditions. Assignment was based on a list of random numbers generated at the beginning of the study using a random number generator site (Random.org, n.d.). We assigned children to the brief condition if the next number on the list was odd and the extended condition if the next number was even. Once a number was used, it was crossed off the list. Parents were blind to their child's treatment condition. Demographic data for the participants in each condition are provided in Table 1.
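The quasirandom assignment procedure described above can be illustrated with a short Python sketch. It is a hypothetical reconstruction, not the software or worksheet used in the study. The names `Participant`, `assign`, and `opposite` are invented for illustration. Two details are assumptions that the text does not specify: "within 3 months" is treated as inclusive, and each enrolled child can serve as a match only once.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Participant:
    pid: str
    sex: str                          # "M" or "F"
    age_months: int
    condition: Optional[str] = None   # "brief" or "extended"
    matched_to: Optional[str] = None  # pid of the matched partner, if any


def opposite(condition: str) -> str:
    """Return the other treatment condition."""
    return "extended" if condition == "brief" else "brief"


def assign(new: Participant, enrolled: List[Participant],
           numbers: Iterator[int]) -> Participant:
    """Assign a new participant by matching first, random list second."""
    for p in enrolled:
        same_sex = p.sex == new.sex
        close_age = abs(p.age_months - new.age_months) <= 3  # assumed inclusive
        if p.matched_to is None and same_sex and close_age:
            # Matched: take the condition opposite the partner's.
            new.condition = opposite(p.condition)
            new.matched_to, p.matched_to = p.pid, new.pid
            break
    else:
        # No match: consume ("cross off") the next pregenerated number.
        n = next(numbers)
        new.condition = "brief" if n % 2 == 1 else "extended"
    enrolled.append(new)
    return new


# Example with made-up participants and a made-up random number list.
random_list = iter([7, 4, 9])
roster: List[Participant] = []
first = assign(Participant("P01", "M", 25), roster, random_list)
second = assign(Participant("P02", "M", 27), roster, random_list)
third = assign(Participant("P03", "F", 30), roster, random_list)
```

In this example, the first child has no possible match and draws an odd number, the second child matches the first and receives the opposite condition, and the third child has no match and draws the next unused number.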
Table 1.
Demographic information for participants by treatment condition.
| Characteristic | Brief condition | Extended condition |
|---|---|---|
| n (male, female) | 10 (6, 4) | 9 (6, 3) |
| Age in months, M (SD) | 28 (3) | 27 (3) |
| Standard score on the Bayleyᵃ, M (SD) | 97.5 (11.60) | 96.66 (11.45) |
| No. of words produced on the initial MCDIᵇ, M (SD) | 35 (24) | 31 (44) |
| No. of words understood on the initial MCDI, M (SD) | 248 (153) | 406 (139) |
| Race | 7 White, 2 more than one race, 1 no response | 9 White |
| Ethnicity | 2 Hispanic, 8 Non-Hispanic | 2 Hispanic, 7 Non-Hispanic |
| Maternal education | | |
| High school | 0 | 1 |
| Associate degree or some college | 2 | 2 |
| Bachelor's degree | 6 | 4 |
| Graduate degree | 2 | 2 |
Note. Bayley = Bayley Scales of Infant and Toddler Development–Third Edition (Bayley, 2006); MCDI = MacArthur–Bates Communicative Development Inventories: Words and Sentences (Fenson et al., 2007).
ᵃ Administration of the Bayley was discontinued before a ceiling was reached for three participants in the brief condition and three participants in the extended condition, so these scores are likely underestimates of the children's abilities.
ᵇ All but three participants fell below the 5th percentile on the MCDI, relative to their age and sex. These three were in the 5th–10th percentile range. Of these three, one participant was over 30 months old, and norms are only available through 30 months of age; this participant fell below the 5th percentile on the MCDI-III (for children 30 months of age or older; Dale, 2007).
Inclusionary and Exclusionary Criteria. We determined subject eligibility through multiple criteria at different stages of the study, as outlined in Table 2, which shows the number of children excluded per criterion. One of the authors contacted families who were not included in the study and provided information on speech and language resources (e.g., local service providers, state-funded services, and language nutrition techniques; Head Zauche et al., 2016).
Table 2.
Inclusionary criteria and number of excluded participants.
| Criterion | Purpose | Participants excluded (n) |
|---|---|---|
| Preevaluation phase | | |
| Between 24 and 47 months old at the start of treatment | To ensure participants fell within age range for late talkers | 0 |
| No diagnoses/concerns other than language delay (reported via parent interview)ᵃ | To rule out influence of other diagnoses on treatment outcomes | 2 |
| Primary home language of English (reported via parent interview) | To rule out influence of bilingualism on treatment outcomes | 0 |
| No outside speech or language therapy during study (reported via parent interview) | To rule out influence of nonstudy treatment | 3 |
| Score below 10th percentile reported on MCDI for children through 30 months or MCDI-III for children 30 months or olderᵇ | To establish expressive language delay and meet criteria for evaluation phase | 3 predelay; 2 postdelay; 3 no data |
| Evaluation phase | | |
| Standard score of 75 or above on Bayley | To establish normal nonverbal IQ | 0 |
| Pass on informal vision assessmentᶜ | To establish functional near vision | 0 |
| Pass on informal hearing assessmentᶜ | To establish functional hearing | 0 |
| Sufficient receptive vocabulary inventoryᵈ (determined on an individual basis during target/control word selection process) | To ensure that 20 words (targets and controls) could be selected for treatment | 2 |
Note. Bayley = Bayley Scales of Infant and Toddler Development–Third Edition (Bayley, 2006); MCDI = MacArthur–Bates Communicative Development Inventories: Words and Sentences (Fenson et al., 2007); MCDI-III = MacArthur–Bates Communicative Development Inventory-III (Dale, 2007).
ᵃ Some children were excluded during the evaluation phase due to clinician concerns about more global needs or delays.
ᵇ For participants with a delay period, this criterion was established twice. Scores below the 10th percentile were first established prior to the evaluation, and again after the postevaluation delay (prior to the start of treatment).
ᶜ Evaluators were not able to complete formal vision and hearing screenings (e.g., 20/40 near vision screenings and play audiometry) due to participant fatigue, inattention, and/or headphone intolerance. Instead, participants received a “functional pass” if evaluators determined they were able to respond to visual and auditory stimuli.
ᵈ Determined using the receptive vocabulary checklist created for this study based on the MCDI.
Materials and Procedures
Pretreatment. Interested families who contacted the lab learned about basic exclusionary criteria, that is, no other diagnosis (e.g., autism spectrum disorder or hearing impairment), no outside speech and/or language services during the study, and a primarily monolingual English household environment. Families who met these criteria received packets containing a MacArthur–Bates Communicative Development Inventories: Words and Sentences (MCDI; Fenson et al., 2007), a MacArthur–Bates Communicative Development Inventory-III (MCDI-III; Dale, 2007) for children at least 30 months old, a checklist for noting words their child understood on the MCDI and MCDI-III, a form for listing 50 words they would like their child to learn to say, and a consent form. The MCDI and MCDI-III are parent-reported measures of expressive vocabulary with norms available by sex for children through 30 months of age (for the MCDI) or from 30 to 37 months of age (for the MCDI-III). The MCDIs are lists of words from various categories (e.g., animals, clothing, or food) with instructions for the parent to mark each word that their child says. We used these same words to create a separate checklist for the parent to indicate words that their child understood. Once a packet was returned, the child received a participant number, and the research team reviewed their MCDI (and MCDI-III, if applicable). If their score fell below the 10th percentile, the child was determined to have an expressive vocabulary delay, and the parent was invited to a phone interview. In the phone interview, a senior member of the research team asked a standardized list of questions in areas including developmental history, other diagnoses, parental concerns, family history of speech and/or language impairment (to better describe our sample), and bilingual or multilingual language exposure. Families were invited to an in-person evaluation if their child met all initial criteria (see Table 2 for criteria established prior to an evaluation).
At the in-person evaluation, a licensed, certified speech-language pathologist (SLP) used a combination of formal and informal measures to determine that the participant had functional hearing and vision (i.e., sufficient for interacting with the clinician and materials), as well as nonverbal intelligence within normal limits. We established a pretreatment delay period for a subset of 15 participants to determine if these children improved without treatment, due to maturation. Before treatment began, parents of these children provided updated vocabulary data by filling out a second MCDI and, for children at least 30 months old, an MCDI-III. Our goal was an 8-week delay period between the first and second MCDIs; the actual delay ranged from 3.0 to 18.6 weeks. We assumed that participants who exceeded the cut-point of the 10th percentile on the postdelay MCDI were improving based on maturation alone; accordingly, such participants were disqualified from the study.
Target Selection. We chose 20 MCDI words for each child (10 target words and 10 control words). Target words (hereafter referred to as targets) were words that we planned to use in treatment. Control words (hereafter referred to as controls) were words that would not be treated but would be monitored to see if the child produced them. Targets and controls were selected and paired using a variety of criteria. Most importantly, we selected words that the child reportedly understood but did not say. Whenever possible, we incorporated words from the parent's list of 50 desired words. Targets and controls were paired based on similar attributes including MCDI category (e.g., Clothing or Food and Drink), grammatical class (e.g., noun or verb), number of syllables, and age of acquisition trajectories based on the Stanford Wordbank (Frank et al., 2016). For example, we paired “cheese” and “milk” because both are one-syllable nouns from the MCDI category “Food and Drink” with similar Wordbank trajectories (i.e., at 24 months, 87% of children say “cheese” and 86% say “milk”; at 30 months, 97% say “cheese” and 97% say “milk”; Frank et al., 2016). All 19 participants' targets included nouns and verbs. Seventeen participants (89%) also had an adjective as a target, and three (16%) had a preposition. The average distribution of word classes for participants was seven nouns, two verbs, and one adjective.
Baseline Sessions. In the 2 weeks prior to each child's first treatment session, the child participated in at least three baseline sessions over at least three different days to ensure that they did not produce any of the 10 potential target and control pairs. If a child did not say a word across three baseline sessions, the word was used in the study. During the baseline sessions, the examiner used a tablet to show the child a picture representing each word. The examiner pointed to the picture and used verbal prompts (e.g., “What is this?”) to elicit a response. If the child produced a potential target or control, we replaced that word. The child received the same opportunities to produce the new word. Accordingly, some children participated in more than three baseline sessions. Baseline sessions were held in person at the University of Arizona clinic or via video call (in order to minimize travel-related barriers for parents).
Treatment. Treatment was provided one-on-one at the university clinic. Treatment parameters are described in Table 3.
Table 3.
Treatment parameters identified and defined in the current study.
| Treatment parameter | Definition | Specification for current VAULT protocol |
|---|---|---|
| Dose | Action in treatment that produces change for a given target | Clinician's verbal model of the target word in a nontelegraphic, grammatical utterance with 4 words or fewer (brief condition) or 5 words or more (extended condition) |
| Dose number | Number of doses per target per individual session | 67 or 68 doses (270 doses/4 words) |
| Total number of doses | Combined number of doses across all targets per individual session | 270 doses per session for all target words combined |
| Dose rate | Total number of doses that occur within a given unit of time | Nine doses per minute (270 doses/30 min) |
| Dose form | Procedure used to administer the dose | Input-based, focused stimulation procedures using varied linguistic contexts and a variety of different activities |
| Treatment context | Setting/context in which the treatment is provided | Child-friendly clinic room with child, clinician, scorekeeper, reliability tracker, and child's family member(s) |
| Number of targets | Number of different targets addressed in an individual session | Four target words |
| Session frequency | Daily, weekly, or monthly schedule of treatment sessions | Two times per weekᵃ |
| Session duration | Duration of an individual session | 30 min |
| Total intervention duration | Total number of days, weeks, or months the intervention is provided | 8 weeks of treatment with an average of 2 sessions per week (16 sessions total) |
| Cumulative treatment intensity | Total Number of Doses × Session Frequency × Total Intervention Duration | 270 × 2 times a week × 8 weeks = 4,320 |
Note. This table is similar to Table 2 in the study of Alt et al. (2020), a previous iteration of VAULT. Both tables were inspired by the work of Warren et al. (2007). Some terms and definitions above are identical to those used by Warren et al. (2007) and/or Alt et al. (2020). However, some terms and definitions have been modified. VAULT = Vocabulary Acquisition and Usage for Late Talkers.
ᵃ Due to factors like holidays, university closure, and illness, participants occasionally deviated from this schedule.
In our initial efficacy study (Alt et al., 2020), we targeted either three or six different words per session, depending on the assigned treatment condition. We found that both conditions resulted in expressive vocabulary gains. We received parent and research assistant feedback that three targets per session sometimes felt slow and made it difficult to maintain the toddler's interest, whereas six targets per session could be overwhelming for both clinician and toddler. Therefore, we opted for four targets per session, which both fell within our range of parameters proven to be effective and incorporated parent and researcher feedback.
We used the formula from Alt et al. (2020) to determine how many times to use each target in a session. For this study, we maintained 270 total doses per session and divided that by four target words, resulting in 67.5 doses per target word per session. Because it is impossible to give a half dose of a word, each session we alternated between 67 and 68 doses per target word.
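The dose arithmetic described in the table and notes above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, and the rule that odd-numbered sessions use 67 doses and even-numbered sessions use 68 is an assumption, since the text states only that the two values alternated.

```python
# Values taken from the text above.
TOTAL_DOSES_PER_SESSION = 270
TARGETS_PER_SESSION = 4
SESSIONS_PER_WEEK = 2
TREATMENT_WEEKS = 8


def doses_per_target(session_number: int) -> int:
    """Return the planned doses per target word for a 1-indexed session.

    270 / 4 = 67.5, so whole-number doses alternate between 67 and 68.
    ASSUMPTION: odd sessions get 67 and even sessions get 68.
    """
    lower = TOTAL_DOSES_PER_SESSION // TARGETS_PER_SESSION  # 67
    return lower if session_number % 2 == 1 else lower + 1


def cumulative_treatment_intensity() -> int:
    """Doses per session x sessions per week x weeks of treatment."""
    return TOTAL_DOSES_PER_SESSION * SESSIONS_PER_WEEK * TREATMENT_WEEKS


if __name__ == "__main__":
    for session in (1, 2):
        print(session, doses_per_target(session))
    print(cumulative_treatment_intensity())
```

Under this assumed alternation, a single session delivers 268 or 272 target doses rather than exactly 270, but the planned total over 16 sessions still equals the cumulative treatment intensity in the table.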
Clinicians delivered targets within grammatical utterances that matched the length of the child's treatment condition: four words or fewer for brief and five words or more for extended. The boundary of four versus five words was used for practical reasons. First, clinicians were specifically instructed to avoid ungrammatical or telegraphic (incomplete) utterances (e.g., “he sits on chair”). A maximum below four words in the brief condition would have been highly limiting because utterances also needed to be grammatical. Second, a minimum above five words in the extended condition might have made the input unnatural. Our boundary allowed for a wide range of utterances in both conditions. If a clinician's utterance did not contain a target, the utterance could be of any length. The clinician could use more than one target per utterance, as long as they used the correct utterance length. Requiring utterances in both conditions to be grammatical allowed us to tease apart input quantity from quality. That is, brief utterances were of the same grammatical quality as extended utterances, leaving the conditions to differ only by length. Furthermore, although extended utterances can be more grammatically complex than brief utterances (e.g., due to the addition of a relative clause), they are not necessarily more complex. That is, extended utterances can also be formed by adding adjectives or adverbs to a simple subject–verb–object sentence, which serves to increase an utterance's number of words without increasing its complexity.
During each session, the clinician used each of the child's four current targets in 67 or 68 grammatical utterances matching their assigned condition length during a variety of play activities. For example, if the target was “eat,” the clinician might set up a picnic scenario with play food and figurines. In the brief condition, the clinician might say, “What should they eat?” or “Let's eat!” or “Let's see what they're doing. [pause] Eating!” As in the elliptical clause of the last example, utterances in the brief condition could consist of only one word if they made sense grammatically and pragmatically. In the extended condition, the clinician might say, “Here's some food for our friends to eat” or “We can eat bread and fruit” (see Appendix D for more information on acceptable forms of targets and utterance length). The clinician also incorporated the child's four current controls into the session by talking about each word without producing it. For example, if the control was “arm,” the clinician might pretend that a doll's arm was hurt and point to the arm while saying, “this part got hurt.”
At least once per session, the clinician gave the child an opportunity to produce each of the session's eight targets and controls. This was done by showing the child the object, action, or attribute and then using a prompt such as “What is this?” or “What am I doing?” followed by an expectant pause of approximately 5 s. During this pause, the clinician attempted to maintain eye contact and ceased playing with the child to show that a response was expected. If the child did not produce a response during the pause, the clinician resumed play. When a child produced a target in three out of five consecutive sessions, either following an expectant pause or spontaneously, that target and its paired control were replaced with the next word pair on the child's list. Children did not typically exhaust all 10 target and control pairs; however, if this happened, the clinician returned to the top of a child's word pair list and continued down the list as before.
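The replacement criterion above (a target produced in three out of five consecutive sessions) can be expressed as a sliding-window check. The sketch below is illustrative only: the function name and the list-of-booleans representation are hypothetical, and it assumes that "three out of five" means at least three productions within any run of up to five consecutive sessions.

```python
def meets_replacement_criterion(produced_by_session, window=5, required=3):
    """Return True if the target was produced in at least `required`
    sessions within any `window` consecutive sessions.

    produced_by_session: list of booleans, one per treatment session in
    order, True if the child produced the target in that session
    (after an expectant pause or spontaneously).
    """
    for start in range(len(produced_by_session)):
        if sum(produced_by_session[start:start + window]) >= required:
            return True
    return False


if __name__ == "__main__":
    # Produced in sessions 1, 3, and 5: criterion met.
    print(meets_replacement_criterion([True, False, True, False, True]))
    # Productions too spread out: criterion not met.
    print(meets_replacement_criterion([True, False, False, False, False, True, True]))
```

When the check returns True, the target and its paired control would be swapped for the next word pair on the child's list.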
Because variability supports implicit learning (Plante & Gómez, 2018), clinicians were instructed to incorporate variability into their utterances and activities. Regarding utterances, clinicians were instructed to deliver targets using different utterances and to vary targets' positions within utterances. Some words naturally occurred more in certain sentence positions. For example, the word bathroom tended to appear more at the end of sentences (e.g., “We wash in bathrooms” or “She's entering the bathroom”), so clinicians were reminded to also begin sentences with this word (e.g., “Bathrooms can be clean” or “Bathrooms have sinks”). Regarding activities, clinicians were instructed to use a variety of materials and activities. For example, with the target word “eat,” the clinician might present play food that characters could pretend to eat, read a book about people eating, and bring real snacks for the child to try. Clinicians were instructed not to use the same materials and activities 2 weeks in a row, but the research team did not formally monitor this.
Each child received treatment from two different clinicians on different days in order to control for clinician effects. Clinicians were undergraduate or graduate students in the Department of Speech, Language, and Hearing Sciences trained in the VAULT protocol by a certified SLP. Whenever possible, less experienced clinicians were paired with more experienced clinicians. Training included in-person meetings followed by practice sessions in which the SLP provided feedback and suggestions regarding activities, sentences, and strategies. An SLP or other senior member of the research team was present for all treatment sessions and was responsible for writing down child utterances, tracking expectant pauses and references to control words, and assisting with session management. After sessions, clinicians met with the SLP or senior member to receive feedback on the session, brainstorm for future sessions, and problem-solve situations.
Posttreatment. Within 1 week of a child's last treatment session, the parent was asked to fill out another MCDI (and MCDI-III, if applicable). Each child also participated in an immediate posttreatment probe session. As in the baseline sessions, the child was asked to name pictures of their 10 target and control pairs. Probe sessions occurred at the clinic or via video call.
After the immediate posttreatment probe session, one of the researchers conducted a phone interview with the parent to obtain feedback on parent and child perceptions of the study, parent perceptions of changes in the child, and so forth. Whenever possible, a researcher who had not worked closely with the family conducted the interview to make parents feel comfortable about giving honest feedback. These data will be discussed in a future manuscript.
Then, 3–6 weeks after the immediate posttreatment probe session (average of 4.87 weeks), the child participated in a follow-up probe session following the previously mentioned procedures. The parent was asked to fill out a final MCDI (and MCDI-III, if applicable). This marked the end of the family's participation in the study.
Fidelity and Reliability
To ensure treatment fidelity, a trained scorekeeper was present in all sessions and privy to the child's treatment condition. The scorekeeper recorded both the number of adult utterances containing target words and whether those utterances followed the correct utterance length for the given treatment condition. Parents were requested to limit their speech so that clinicians could control the dose condition. However, if parents did produce any of the target words, the scorekeeper counted those as doses. The scorekeeper also recorded when any adult utterance contained one of a child's control words. Scorekeepers supported clinicians by discreetly notifying them about utterance length violations, progress through each target (e.g., at halfway done and 10 doses remaining), use of control words by anyone other than the child, and excess doses.
Treatment fidelity for dose number was determined by comparing the number of doses planned per target word to the number of doses actually delivered per target word. Overall fidelity was high (average = 99%, brief = 99%, extended = 99%), with the lowest average participant fidelity for dose number at 98%. Treatment fidelity for utterance length was determined by calculating the percentage of doses that were delivered with the correct utterance length. The overall fidelity for utterance length was also high (average = 96%, brief = 96%, extended = 95%), with the lowest average participant fidelity at 94%. Fidelity data for dose number and utterance length were available for every session for each participant.
In order to reliably determine the within-treatment outcome measure of child productions, we used the following procedures. All sessions were recorded via video. A certified SLP or senior member of the research team served as a reliability tracker. They attended each session, wrote down child utterances, and provided clinician support to ensure treatment fidelity (which necessitated that they be privy to condition assignment). Immediately after any child production, the clinician and reliability tracker verbally conferred, typically by one person repeating the child's production and the other agreeing, disagreeing, or indicating that they were not sure. In cases of disagreement or uncertainty, the reliability tracker noted the camera's time code for a future video check. Both clinicians and reliability trackers wrote down all child productions, including words and word approximations. If the clinician was unable to record a child production in the moment, she glossed it so the reliability tracker could record it for her, noting agreement (e.g., with a check) or disagreement (e.g., with a question mark and a note such as “[clinician's name] heard”).
If neither the clinician nor the reliability tracker was able to record a child production in the moment, the reliability tracker and clinician both reviewed the video to record the missed production(s). In cases of questionable productions, the team asked parents if they had understood what the child had said; however, parent report was never used as the sole determiner of a child's production. Following each session, the clinician and reliability tracker reviewed their respective lists of child productions. Reliability was calculated by dividing the number of agreed-upon unique child productions by the number of total unique child productions. For the purposes of this calculation, all instances of a given utterance (e.g., 10 child productions of “yeah”) were only counted once; multiword utterances were treated as units (e.g., if the clinician heard “my plate” and the reliability tracker heard “me play,” this only counted as one disagreement), and any productions requiring a video check were not included in the calculation. For video checks, a senior member of the research team reviewed a video of the session. This person either agreed with one party, marked the production as unintelligible, or (if they heard something different) obtained another video check from another senior member of the research team. Reliability was calculated for almost all sessions. Overall reliability was high (average = 98%, brief = 98%, extended = 99%), with the lowest average participant reliability at 95%. One limitation to our reliability process was that one party's interpretation of the child's production may have been influenced by the other party's interpretation.
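The reliability calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the tuple layout (clinician gloss, tracker gloss, video-check flag) are hypothetical and are not taken from the study's records. Each production is stored as an aligned pair so that a mismatch such as "my plate" versus "me play" counts as a single disagreement, as in the text.

```python
def session_reliability(productions):
    """Proportion of agreed-upon unique child productions in one session.

    productions: iterable of (clinician_gloss, tracker_gloss, needs_video_check).
    Repeated instances of the same pair count once, multiword utterances
    are treated as single units, and productions flagged for a video
    check are excluded from the calculation.
    """
    unique_pairs = {
        (clinician, tracker)
        for clinician, tracker, needs_video_check in productions
        if not needs_video_check
    }
    if not unique_pairs:
        raise ValueError("no scorable productions in this session")
    agreed = sum(1 for clinician, tracker in unique_pairs if clinician == tracker)
    return agreed / len(unique_pairs)


if __name__ == "__main__":
    session = (
        [("yeah", "yeah", False)] * 10        # ten instances count once
        + [("my plate", "me play", False)]    # one disagreement
        + [("ball", "ball", False)]
        + [("uh", "dog", True)]               # excluded: needs video check
    )
    print(round(session_reliability(session), 3))
```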
Results
Not every child who was allocated to a treatment condition received the full 16 treatment sessions (see Appendix C). Participants who completed a minimum of nine treatment sessions (i.e., more than half) were included in the analysis because they had sufficient data. Three children with fewer than nine treatment sessions were excluded due to scheduling issues. These children came from both treatment conditions, and their demographics matched those of the sample retained for analysis. An additional four children's treatment was interrupted by the COVID-19 outbreak in March of 2020. At that time, three of the children had received more than nine treatment sessions, and their progress to date was included in the data analyses. These families were offered continued treatment via telepractice, but the constraints of telepractice altered the treatment to such a degree that we did not include data from the telepractice sessions in our data analysis. In the end, our analysis included data from 19 participants. This number of participants was slightly smaller than our planned sample size of 10 participants per group, which was based on conservative estimates from studies such as that of Perry et al. (2010), which manipulated object variability when teaching toddlers words. However, COVID-19 made additional comparable data collection impossible.
Effect of Utterance Length
First, we examined whether the brief or extended utterance treatment condition yielded larger effect sizes. Group comparison revealed anecdotal evidence (BFINC = 0.41) in support of the null hypothesis that there were no group differences in the age at which participants began treatment (brief: M = 28.10, SD = 3.31; extended: M = 27.77, SD = 3.73). Therefore, age was not included in our model. We ran a Bayesian repeated-measures analysis of variance (ANOVA) that included the within-subject factor of word type (target vs. control) and the between-subjects factor of condition (brief vs. extended). The dependent variable was the treatment effect size (d), calculated as in the previous VAULT study (Alt et al., 2020), as adapted from Beeson and Robey's (2006) single-subject treatment work. That is, we subtracted the mean of the baseline sessions (in our case, it was always zero) from the mean of each participant's last three treatment sessions and then divided the difference by the standard deviation of the last three treatment sessions. If there was no variance, we used 0.577 (the smallest possible standard deviation). We used Bayesian statistics because they are well suited to smaller sample sizes and are interpretable in terms of how likely an outcome is, as well as whether it supports the null hypothesis (i.e., there is no difference between conditions) or alternative hypothesis (i.e., outcomes are better in one condition vs. another; Kruschke, 2013). Interpretations of Bayesian effect sizes (e.g., anecdotal, strong) are taken directly from Wagenmakers et al. (2018).
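The effect size calculation described above can be sketched in a few lines. This is a minimal illustration rather than the authors' analysis code: the function name is hypothetical, and the use of the sample standard deviation is an assumption, chosen because it gives 0.577 for the smallest nonzero spread across three whole-number session scores (e.g., 0, 0, 1).

```python
from statistics import mean, stdev

# Fallback standard deviation stated in the text for cases with no variance.
MIN_SD = 0.577


def treatment_effect_size(baseline_scores, last_three_scores):
    """Effect size d: (mean of last three treatment sessions minus
    baseline mean) divided by the SD of the last three treatment sessions.

    ASSUMPTION: the sample standard deviation (n - 1 denominator) is used.
    """
    if len(last_three_scores) != 3:
        raise ValueError("expected exactly three treatment sessions")
    difference = mean(last_three_scores) - mean(baseline_scores)
    sd = stdev(last_three_scores)
    if sd == 0:
        sd = MIN_SD  # no variance: substitute the smallest possible SD
    return difference / sd


if __name__ == "__main__":
    # Baseline was always zero in this study.
    print(round(treatment_effect_size([0, 0, 0], [2, 2, 2]), 3))
    print(round(treatment_effect_size([0, 0, 0], [0, 0, 1]), 3))
```

A child with no productions in the last three sessions receives d = 0 under this sketch, which corresponds to the nonresponder criterion (effect size not greater than 0).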
There was anecdotal evidence (BFINC = 0.940) for no difference between the brief and extended conditions. Similarly, there was anecdotal evidence (BFINC = 0.484) for no interaction between word type and condition. Average effect sizes for target and control words by condition are described in Table 4. Seven out of 10 children in the brief condition were considered responders (effect size > 0), while only three out of nine children in the extended condition were considered responders. Given this, we ran a Fisher's exact test of independence to see if there were significantly more responders in the brief condition compared to the extended condition, but the two-tailed p value was not significant (p = .17).
Table 4.
Comparison of target and control words on seven different metrics.
| Variable | Target, M (SD) | Control, M (SD) | Evidence | Interpretation |
|---|---|---|---|---|
| Probes | ||||
| Immediate posttreatment | 1.52 (2.06) | 1.35 (1.86) | BF10 = 0.521 | Anecdotal evidence in favor of null |
| Posttreatment follow-up | 1.62 (2.18) | 1.53 (1.89) | BF10 = 0.322 | Moderate evidence in favor of null |
| Specific words on MCDI | ||||
| Immediate posttreatment | 2.79 (2.76) | 1.94 (2.14) | BF10 = 5.38 | Moderate evidence in favor of alternative |
| Posttreatment follow-up | 3.76 (3.63) | 3.35 (3.39) | BF10 = 1.08 | Anecdotal evidence in favor of alternative |
| During treatment | ||||
| No. of times words said | 45.63 (141.54) | 3.73 (7.85) | BF10 = 0.94 | Anecdotal evidence in favor of null |
| No. of words said at least 3 times | 1.73 (2.64) | 0.15 (0.50) | BF10 = 12.93 | Strong evidence in favor of alternative |
| First treatment session in which words were said | 3 (2.64) | 4.66 (4.35) | BF10 = 1.91 | Anecdotal evidence in favor of alternative |
Note. MCDI = MacArthur–Bates Communicative Development Inventories: Words and Sentences.
Overall VAULT Treatment Effect
Given that there was no statistical difference between the conditions, we collapsed them to determine whether the current iteration of VAULT was efficacious overall. First, we compared the effect sizes for target versus control words. Using a Bayesian repeated-measures ANOVA, we found moderate evidence (BF10 = 4.915) for a larger target word effect size (M = 0.90, SD = 1.24) than control word effect size (M = 0.48, SD = 1.07). Of the 19 participants, 10 showed a target word treatment effect size greater than zero. Graphs of individual performance on target versus control words are presented in Supplemental Material S1 for participants identified as responders and Supplemental Material S2 for participants identified as nonresponders.
Next, we used the MCDI to compare participants' word learning rate during a pretreatment delay period versus during treatment. Fifteen of the 19 participants had a delay period. Using a Bayesian paired-samples t test, we found anecdotal evidence (BFINC = 2.704) for a higher word learning rate during treatment (M = 5.11, SD = 6.58) compared to the pretreatment delay period (M = 2.37, SD = 3.16). Eight of the 15 toddlers with a delay period had higher rates of learning during treatment (see Figure 1). All eight were already characterized as responders based on our criterion of a treatment effect size greater than 0.
Figure 1.
Participants' rates of vocabulary change (words per week) as measured by the MacArthur–Bates Communicative Development Inventories: Words and Sentences (MCDI). The rate of vocabulary change for each participant is depicted for three periods of time: the pretreatment delay period, during treatment, and the posttreatment period (from immediate posttreatment to follow-up). Participants have been grouped by responder profiles. ᵃ Participant gained more words per week during treatment than during the pretreatment delay period. ᵇ Participant gained more words per week during the posttreatment period than during the pretreatment delay period. ᶜ Participant demonstrated a rate of zero words per week during one or more periods of measurement (pretreatment delay period, during treatment, posttreatment period). ᵈ Rate for posttreatment period could not be calculated for this participant because one or more MCDI forms were not returned.
We also compared MCDI word learning rates during the pretreatment delay period versus during the posttreatment period. Posttreatment follow-up data were available for 13 of the 15 toddlers who had a delay period. We found anecdotal evidence (BFINC = 1.268) for a higher word learning rate during the posttreatment period (M = 7.63, SD = 12.90) compared to the pretreatment delay period (M = 2.52, SD = 3.37). Eight of the 13 toddlers with posttreatment follow-up data had a higher rate of word learning during the posttreatment period (see Figure 1).
Comparing VAULT Treatment Protocols
Neither our current (hereafter referred to as Study 2) nor the previous (Alt et al., 2020; hereafter referred to as Study 1) VAULT experiments revealed differences in specific treatment manipulations, and both were efficacious in general. Accordingly, we wanted to compare the overall protocols to determine if one version had an advantage over the other. To do this, we used the Tau statistic, which compares the nonoverlap between baseline and treatment conditions for single-subject designs and generates a z score that can be compared across studies (Parker et al., 2011). For each word type (target vs. control), each data point represented the cumulative number of the 10 unique words in the child's set that they had said. Only the first production of each unique word was included in this calculation, meaning that the word was captured by the statistic even if it was not produced in subsequent sessions. This makes Tau valuable to use with our data because not all words were targeted during all sessions. See Figure 2 for an example generated from these data. We calculated these z scores using an online calculator (Single Case Research, n.d.) and compared the z scores using a Bayesian repeated-measures ANOVA with VAULT study (1 vs. 2) as the between-subjects measure and word type (target vs. control) as the within-subject variable.
Figure 2.
An example of a participant's cumulative unique target and control word productions across baseline and treatment sessions used in the Tau analysis.
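The Tau comparison described above can be illustrated with a simplified sketch. It is not the online calculator's implementation: the function names are hypothetical, the version shown is the basic baseline-versus-treatment nonoverlap Tau without any baseline trend correction, and the z score uses the standard normal approximation without a correction for ties, which is an assumption.

```python
from math import sqrt


def cumulative_unique_words(words_by_session):
    """Cumulative count of unique words produced, one value per session.

    Only the first production of each word adds to the count, so a word
    stays counted even if it is not produced in later sessions.
    """
    seen = set()
    counts = []
    for words in words_by_session:
        seen.update(words)
        counts.append(len(seen))
    return counts


def tau_nonoverlap(baseline, treatment):
    """Return (tau, z) from all pairwise baseline-treatment comparisons.

    ASSUMPTION: the variance of S is n_a * n_b * (n_a + n_b + 1) / 3,
    the usual approximation with no tie correction.
    """
    n_a, n_b = len(baseline), len(treatment)
    improving = sum(1 for a in baseline for b in treatment if b > a)
    worsening = sum(1 for a in baseline for b in treatment if b < a)
    s = improving - worsening
    tau = s / (n_a * n_b)
    z = s / sqrt(n_a * n_b * (n_a + n_b + 1) / 3)
    return tau, z


if __name__ == "__main__":
    # Hypothetical child: no productions at baseline, gradual gains in treatment.
    treatment_counts = cumulative_unique_words(
        [[], ["eat"], ["eat"], ["arm"], ["go"]]
    )
    tau, z = tau_nonoverlap([0, 0, 0], treatment_counts)
    print(treatment_counts, round(tau, 2), round(z, 3))
```

Because the data points are cumulative counts, they can never decrease, so in this sketch no treatment point falls below an earlier treatment point and worsening pairs arise only relative to baseline.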
This combined analysis provided anecdotal evidence for no difference between VAULT Studies 1 and 2 (BFINC = 0.389) and anecdotal evidence against an interaction between VAULT study and word type (BFINC = 0.685). The mean z scores for target words were 1.65 (SD = 1.05) for Study 1 and 1.58 (SD = 1.16) for Study 2. Mean z scores for control words were below 1.0 for both VAULT studies (Study 1: z = 0.60, SD = 0.84; Study 2: z = 0.97, SD = 1.15).
Given no difference between Studies 1 and 2, the combined analysis reinforced the finding that VAULT is efficacious: There was extreme evidence for a main effect of word type, with higher z scores for target words than for control words (BFINC = 8341.507).
Identification of Responder Profiles
Despite the clear evidence that the VAULT treatment is efficacious, it is unsatisfying that the treatment only appears to work for a subset of participants (Study 1, 15 of 24 participants; Study 2, 10 of 19 participants). It would be ideal to predict for which participants the treatment is likely to be effective. In VAULT Study 1, we were unable to determine any patterns. By combining the data from Studies 1 and 2, we were able to find some statistically significant differences between responders and nonresponders on certain measures. We were then able to use Classification and Regression Tree (CART) analysis to make a clinical decision tree. The demographics of the children used in this analysis can be found in Table 5.
Table 5.
Demographic information for all combined VAULT participants.
| Characteristic | Combined VAULT participants |
|---|---|
| n (male, female) | 43 (25, 18) |
| Age in months, M (SD) | 29 (4) |
| Standard score on the Bayleyᵃ, M (SD) | 95 (10) |
| Number of words produced on the initial MCDI, M (SD) | 71 (96) |
| Number of words understood on the initial MCDI, M (SD) | 369 (150) |
| Race | 34 White, 7 more than one race, 1 Black/African American, 1 no response |
| Ethnicity | 15 Hispanic, 28 Non-Hispanic |
| Maternal education | |
| High school | 2 |
| Associate degree or some college | 13 |
| Bachelor's degree | 15 |
| Graduate degree | 13 |
Note. VAULT = Vocabulary Acquisition and Usage for Late Talkers; Bayley = Bayley Scales of Infant and Toddler Development–Third Edition (Bayley, 2006); MCDI = MacArthur–Bates Communicative Development Inventories: Words and Sentences (Fenson et al., 2007).
ᵃ Administration of the Bayley was discontinued before a ceiling was reached for eight participants, so these scores are likely underestimates of the children's abilities.
CART analysis is a statistical technique used for decision making that takes multiple factors (both continuous and binary) into account and determines which factors result in the best classification (Morgan, 2014). Given that we had a binary decision to make (i.e., responders vs. nonresponders), we created a classification tree using the R package rpart in RStudio (RStudio Team, 2020). The candidate predictors entered into the model included both binary and continuous factors. The binary factors were sex, presence versus absence of family risk factors, and whether or not a target word was produced in the first two treatment sessions. The continuous factors were age at treatment onset in months, number of words reported on the MCDI (one measure each for receptive and expressive) at the start of treatment, number of words on the MCDI adjusted by age (one measure each for receptive and expressive), and socioeconomic status as measured by years of maternal education.
Across VAULT Studies 1 and 2, we had 25 responders and 18 nonresponders as defined by a positive treatment effect size. We used the rpart function with the predictor variables described above, the method “class” for classification, and the complexity parameter set to 0.001. Doing so meant that a split was attempted only if it improved the overall fit of the tree by a factor of at least 0.001. The best-fitting model had two nodes (see Figure 3). The first node was whether or not the child produced a target word within the first two treatment sessions. The second node was the number of words on the MCDI (expressive) at the start of treatment. If a child did not produce a target word within the first two sessions and had fewer than 60 words on the MCDI at the start of treatment, the model classified them as a nonresponder. Alternatively, if a child either (a) did produce a target word in the first two sessions or (b) did not produce a target word in the first two sessions but had at least 60 words on the MCDI, they were considered by the model to be a responder. This decision tree resulted in the correct classification of 15/18 nonresponders (83.33%) and 21/25 responders (84.00%).
Figure 3.
A decision tree to inform clinicians' decisions on whether to maintain the Vocabulary Acquisition and Usage for Late Talkers (VAULT) treatment protocol or make adjustments to the treatment plan. MCDI = MacArthur–Bates Communicative Development Inventories: Words and Sentences.
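The two-node decision rule reported above can be written out directly. The sketch below simply restates the published rule as a function; it does not refit the tree, and the function and parameter names are hypothetical.

```python
def classify_vault_response(produced_target_in_first_two_sessions,
                            mcdi_expressive_words):
    """Apply the two-node rule described in the text.

    Node 1: did the child produce a target word in the first two
            treatment sessions?
    Node 2: if not, did the child have at least 60 expressive words on
            the MCDI at the start of treatment?
    """
    if produced_target_in_first_two_sessions:
        return "responder"
    if mcdi_expressive_words >= 60:
        return "responder"
    return "nonresponder"


def percent_correct(correct, total):
    """Classification accuracy as a percentage, rounded to two decimals."""
    return round(100 * correct / total, 2)


if __name__ == "__main__":
    print(classify_vault_response(False, 45))   # below both thresholds
    print(classify_vault_response(False, 60))   # rescued by vocabulary size
    print(classify_vault_response(True, 10))    # produced a target early
    print(percent_correct(15, 18), percent_correct(21, 25))
```

As the Discussion notes, the output is a guideline that a clinician can override based on clinical judgment.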
Discussion
Efficacy
We demonstrated the efficacy of the current version of the VAULT protocol and confirmed that there was no statistically significant difference between the brief and extended conditions. It may not be surprising that late talkers can benefit from the brief condition (which was designed to compensate for putative working memory limitations). It is also helpful to know that some late talkers were able to benefit from the extended condition, implying that they were able to incorporate the increased linguistic variability and cues present in the extended utterances. From a practical standpoint, we were pleased not to find a difference between the conditions because it was taxing for the clinicians in both conditions to obtain such a high degree of fidelity. Thus, the freedom to use grammatical utterances of mixed lengths will likely allow for better fidelity to other aspects of the treatment protocol as we move to the effectiveness stage of treatment research.
Who Responds to Treatment
We were pleased with the clear evidence from this study and Alt et al. (2020) that both VAULT versions to date have been efficacious. However, we recognize the difficulty that comes with knowing that not all children responded to the treatment. On the one hand, if one removes the nonresponders from the analysis, the effect size for treatment jumps from 0.90 to 1.71. In other words, when VAULT works, it really works. On the other hand, it is disheartening that every second or third participant does not experience the planned treatment outcomes. Discovering who is likely to respond to treatment has the potential to save clinicians' and families' precious resources during intervention by avoiding a protocol that is unlikely to yield success.
To our knowledge, this is one of the first language treatment studies to include a clinical decision-making tree. When selecting variables with the potential to affect treatment outcomes, we considered differences between responders and nonresponders in our descriptive data, as well as variables suggested by the literature on long-term outcomes of late talkers (e.g., expressive vocabulary). Our two-node decision tree achieves at least 80% classification accuracy—the threshold for acceptable sensitivity and specificity (Plante & Vance, 1994; Spaulding et al., 2006)—for both responders and nonresponders. In addition to determining the number of words a child uses from the MCDI, clinicians considering VAULT need to complete two VAULT sessions. This is a relatively low cost: 60 min of a treatment for which there is no evidence of negative outcomes. After only two sessions of treatment, if a child is unlikely to respond to VAULT based on the decision tree, the clinician could modify the protocol or switch to a different treatment altogether. Clearly, a clinician can always use their clinical judgment to override the decision tree's suggestion if there are other relevant factors (e.g., child's lack of attention to task or child showing improvement in nonlinguistic communicative attempts).
Some might wonder what types of modification could work for a potential nonresponder. There is likely no one-size-fits-all answer, but there is some evidence for two children with some similar characteristics. Navarro et al. (2020) followed up with two VAULT nonresponders using a modified protocol after completion of VAULT. Although the follow-up happened before we ran the CART analyses, both of these children would have been classified as nonresponders by our decision tree. These children had similar profiles: strong receptive vocabulary as reported by parents, excellent nonverbal communication, poor speech skills, and a continued use of protowords. Motivated by their poor speech skills and apparent overreliance on protowords, we introduced an augmentative and alternative communication (AAC) option into our modified protocol and required a production attempt via speech or AAC. Both children responded positively in the few experimental sessions we provided, increasing their expressive vocabulary output by over 600%. It is unclear how these participants would have responded if we had identified them as nonresponders to our standard protocol and introduced the AAC modifications in the second week of treatment. The point is simply to illustrate that there are likely alternative approaches that will better serve some of the children classified as nonresponders.
Limitations
While these results give us some insight into which treatment parameters are most likely to lead to positive outcomes, questions remain. Some might wonder if our conditions (i.e., brief vs. extended) were truly different, as there was no buffer between lengths. However, we would point out that the boundary between four and five words per utterance was simply that—a boundary. Brief utterances could range from one to four words, and extended utterances could range from five words on up. In practice, most of the extended utterances were relatively long, because clinicians were wary of missing their fidelity targets. So, while the conditions may seem similar, in practice they differed.
Due to the fact that parents were necessarily present during treatment, they may have been influenced by the treatment and changed their behaviors at home, influencing the outcomes. This opportunity to change behaviors was equally distributed across the conditions. Based on posttreatment interviews with parents, roughly half of the parents reported changing their behaviors. Most parents did not report using the study-related words often at home but did report using study-related techniques (e.g., repetition, narration). Without measures of precisely what parents did at home, we are limited in interpreting the effect of parental behavior on treatment outcomes. However, this area is ripe for further investigation.
It was not ideal that three of our active participants did not receive the full 16 sessions of treatment due to COVID-19 (and that our overall planned N had to be reduced). However, when we reran the analyses excluding these three participants, our primary findings did not differ. Thus, this perturbation in the protocol likely did not negatively affect the study outcomes.
It would have been ideal to be able to have a valid measure of phonological working memory for each participant to see if the children who responded (or not) in the extended condition did so based on their phonological working memory skills. However, we were unaware of any measure of phonological working memory that would have given us a valid measure of late-talking toddlers' skills. Finding an appropriate measure was particularly challenging given the limited phonological production skills of many of our participants.
Finally, although our sample was not homogenous, it was not as fully diverse as our community, and children from homes with lower levels of maternal education were underrepresented. Consequently, if an effect of maternal education might be driven by the lower end of the range, we would not be able to identify this with our sample. Our recruiting efforts did extend to a wide range of families, and we had practices in place to help with issues related to transportation, but in future efforts, we will need to be more creative to obtain a more representative sample.
Future Directions
Because our VAULT research is still at the efficacy stage, we needed to tightly control our treatment parameters. That means, for example, we do not yet have data about the effectiveness of VAULT for bilingual late talkers or for when families are more involved in the intervention. ASHA's guiding principles for early intervention (i.e., for ages 0–3 years) include that treatment should be family centered, culturally and linguistically appropriate, and conducted in children's natural environments (ASHA, 2008). Although we have included families by obtaining their input about meaningful words to target, in the future, we hope to see if caregivers can be trained as effective intervention agents. This would allow greater incorporation of the learning principles into the child's natural environment. We have established evidence for efficacy and hope to establish evidence for effectiveness.
Moving into the effectiveness arena will likely require some changes to the protocol. While we have demonstrated positive treatment outcomes with flexibility in terms of dose number per target word within a range (Alt et al., 2020) and dose relative to utterance length, we do not yet have detailed answers to every clinical question relative to flexibility of the parameters. For example, we designed this treatment to follow principles of statistical learning, including the Regularity Principle, which takes into account frequency of occurrence (Plante & Gómez, 2018). Frequency needs to be high, but the details of how high are not yet clear. What we can say is that our cumulative treatment intensity works for the majority of our participants. We have specified ranges within which a clinician may have some flexibility, but more work is needed to determine the amount of flexibility. At this point, using our parameters is one way to provide some security that the treatment should work, provided that one's client fits the responder profile via our decision tree.
Conclusions
The VAULT protocol was efficacious for the majority of the late-talking toddlers we trained. When it works, the effect sizes are large. Clinicians have flexibility with some parameters (e.g., dose number per target word and utterance length), within ranges. However, not all children respond to the VAULT protocol. Using a two-node decision tree that includes the child's number of words produced on the MCDI and performance in the first two VAULT sessions gives clinicians an adequate way to determine whether continuing with VAULT is likely to lead to good outcomes for an individual child.
Author Contributions
Mary Alt: Conceptualization (Lead), Data curation (Equal), Formal analysis (Lead), Funding acquisition (Lead), Investigation (Supporting), Methodology (Equal), Supervision (Supporting), Writing – original draft (Equal), Writing – review & editing (Supporting). Cecilia R. Figueroa: Conceptualization (Supporting), Data curation (Equal), Investigation (Equal), Project administration (Lead), Supervision (Equal), Writing – original draft (Supporting), Writing – review & editing (Supporting). Heidi M. Mettler: Formal analysis (Supporting), Investigation (Supporting), Project administration (Supporting), Visualization (Equal), Writing – original draft (Equal), Writing – review & editing (Equal). Nora Evans-Reitz: Data curation (Supporting), Methodology (Supporting), Project administration (Supporting), Writing – original draft (Supporting), Writing – review & editing (Lead). Jessie A. Erikson: Formal analysis (Supporting), Visualization (Lead), Writing – original draft (Supporting), Writing – review & editing (Supporting).
Acknowledgments
This work was supported by funding from the National Institute on Deafness and Other Communication Disorders Grant 1R01 DC015642-01, awarded to Mary Alt and Elena Plante, for which we are very grateful. We are also thankful for the families who partnered with us on this project. Their dedication to their children's language development was inspirational. Finally, we thank all of the members of the L4 Lab. You all make our research community fun and functional. We appreciate you.
Appendix A
CONSORT-SPI 2018 Checklist
| Section | Item no. | CONSORT-SPI 2010 | CONSORT-SPI 2018 | Reported on page no. |
|---|---|---|---|---|
| Title and Abstract | | | | |
| | 1a | Identification as a randomized trial in the title | | 1235, 1238 |
| | 1b | Structured summary of trial design, methods, results, and conclusions (for specific guidance see CONSORT for Abstracts) | Refer to CONSORT extension for social and psychological intervention trial abstracts | 1235 |
| Introduction | | | | |
| Background and objectives | 2a | Scientific background and explanation of rationale | | 1235-1237, Appendix B |
| | 2b | Specific objectives or hypotheses | If pre-specified, how the intervention was hypothesized to work | 1238 |
| Methods | | | | |
| Trial design | 3a | Description of trial design (such as parallel, factorial), including allocation ratio | If the unit of random assignment is not the individual, please refer to CONSORT for Cluster Randomized Trials | 1238 |
| | 3b | Important changes to methods after trial commencement (such as eligibility criteria), with reasons | | 1242-1243, Appendix B |
| Participants | 4a | Eligibility criteria for participants | When applicable, eligibility criteria for settings and those delivering the interventions | Participants: 1238-1239, 1243-1245, 1246; Clinicians: 1241, Appendix B |
| | 4b | Settings and locations where the data were collected | | 1240, Appendix B |
| Interventions | 5 | The interventions for each group with sufficient details to allow replication, including how and when they are actually administered | | 1240, Table 3, Appendix B |
| | 5a | Extent to which interventions were actually delivered by providers and taken up by participants as planned | | 1242-1243, Appendix B |
| | 5b | Where other informational materials about delivering the intervention can be accessed | | 1238-1242, Appendix B |
| | 5c | When applicable, how intervention providers were assigned to each group | | 1241 |
| Outcomes | 6a | Completely defined pre-specified outcomes, including how and when they were assessed | | 1242-1243 |
| | 6b | Any changes to trial outcomes after the trial commenced, with reasons | | n/a |
| Sample size | 7a | How sample size was determined | | 1243 |
| | 7b | When applicable, explanation of any interim analyses and stopping guidelines | | n/a |
| Randomization | | | | |
| Sequence generation | 8a | Method used to generate the random allocation sequence | | 1238 |
| | 8b | Type of randomization; detail of any restriction (such as blocking and block size) | | 1238 |
| Allocation concealment mechanism | 9 | Mechanism used to implement the random allocation sequence, describing any steps taken to conceal the sequence until interventions were assigned | | 1238 |
| Implementation | 10 | Who generated the random allocation sequence, who enrolled participants, and who assigned participants to interventions | | 1238 |
| Awareness of assignment | 11a | Who was aware of intervention assignment after allocation (for example, participants, providers, those assessing outcomes), and how any masking was done | | 1238 |
| | 11b | If relevant, description of the similarity of interventions | | 1240 |
| Analytical methods | 12a | Statistical methods used to compare group outcomes | How missing data were handled, with details of any imputation method | 1243 |
| | 12b | Methods for additional analyses, such as subgroup analyses, adjusted analyses, and process evaluations | | 1243-1245 |
| Results | | | | |
| Participant flow (a diagram is strongly recommended) | 13a | For each group, the numbers randomly assigned, receiving the intended intervention, and analyzed for the outcomes | Where possible, the number approached, screened, and eligible prior to random assignment, with reasons for nonenrolment | Appendix C |
| | 13b | For each group, losses and exclusions after randomization, together with reasons | | Appendix C |
| Recruitment | 14a | Dates defining the periods of recruitment and follow-up | | 1238 |
| | 14b | Why the trial ended or was stopped | | 1243 |
| Baseline data | 15 | A table showing baseline characteristics for each group | Include socioeconomic variables where applicable | 1239, 1246, Tables 1 and 5 |
| Numbers analyzed | 16 | For each group, number included in each analysis and whether the analysis was by original assigned groups | | Appendix C |
| Outcomes and estimation | 17a | For each outcome, results for each group, and the estimated effect size and its precision (such as 95% confidence interval) | Indicate availability of trial data | 1243 |
| | 17b | For binary outcomes, the presentation of both absolute and relative effect sizes is recommended | | n/a |
| Ancillary analyses | 18 | Results of any other analyses performed, including subgroup analyses, adjusted analyses, and process evaluations, distinguishing pre-specified from exploratory | | 1243-1246 |
| Harms | 19 | All important harms or unintended effects in each group (for specific guidance see CONSORT for Harms) | | 1246 |
| Discussion | | | | |
| Limitations | 20 | Summarize the main results (including an overview of concepts, themes, and types of evidence available), link to the review questions and objectives, and consider the relevance to key groups. | Trial limitations, addressing sources of potential bias, imprecision, and, if relevant, multiplicity of analyses | 1246-1248 |
| Generalizability | 21 | Discuss the limitations of the scoping review process. | Generalizability (external validity, applicability) of the trial findings | 1248 |
| Interpretation | 22 | Provide a general interpretation of the results with respect to the review questions and objectives, as well as potential implications and/or next steps. | Interpretation consistent with results, balancing benefits and harms, and considering other relevant evidence | 1248 |
| Important Information | | | | |
| Registration | 23 | Registration number and name of trial registry | | n/a |
| Protocol | 24 | Where the full trial protocol can be accessed, if available | | 1238-1242 |
| Declaration of interests | 25 | Sources of funding and other support; role of funders | Declaration of any other potential interests | 1235 |
| Stakeholder investments | 26a | Any involvement of the intervention developer in the design, conduct, analysis, or reporting of the trial | | 1235 |
| | 26b | Other stakeholder involvement in trial design, conduct, or analyses | | 1240 |
| | 26c | Incentives offered as part of the trial | | n/a |
Note. Items marked “n/a” were not included in the manuscript either to maintain readability or because they were not applicable to this study at its current efficacy stage. Adapted from Grant et al. (2018).
Appendix B
The 12 Steps of the TIDieR Protocol for Intervention Description and Replication
| TIDieR step | Item description | See manuscript page(s) |
|---|---|---|
| 1) Brief name | Provide the name or a phrase that describes the intervention. | 1235 |
| 2) Why | Describe any rationale, theory, or goal of the elements essential to the intervention. | 1235-1237 |
| 3) What (materials) | Describe any physical or informational materials used in the intervention, including those provided to participants or used in intervention delivery or in training of intervention providers. Provide information on where the materials can be accessed (for example, online appendix, URL). | 1238-1242, Appendix B |
| 4) What (procedures) | Describe each of the procedures, activities, and/or processes used in the intervention, including any enabling or support activities. | 1238-1242 |
| 5) Who provided | For each category of intervention provider (for example, psychologist, nursing assistant), describe their expertise, background, and any specific training given. | 1241 |
| 5a) Who receivedᵃ | Describe the intended participants of the intervention. | 1239-1240 |
| 6) How | Describe the modes of delivery (such as face-to-face or by some other mechanism, such as Internet or telephone) of the intervention and whether it was provided individually or in a group. | 1239, 1242 |
| 7) Where | Describe the type(s) of location(s) where the intervention occurred, including any necessary infrastructure or relevant features. | 1240 |
| 8) When and how much | Describe the number of times the intervention was delivered and over what period of time including the number of sessions, their schedule, and their duration, intensity, or dose. | 1241 |
| 9) Tailoring | If the intervention was planned to be personalized, titrated, or adapted, then describe what, why, when, and how. | 1240 |
| 10) Modifications | If the intervention was modified during the course of the study, describe the changes (what, why, when, and how). | 1243 |
| 11) How well (planned) | If intervention adherence or fidelity was assessed, describe how and by whom, and if any strategies were used to maintain or improve fidelity, describe them. | 1242, Appendix B |
| 12) How well (actual) | If intervention adherence or fidelity was assessed, describe the extent to which the intervention was delivered as planned. | 1242 |
Note. Adapted from Hoffmann et al. (2014, Appendix C).
ᵃItem 5a (“Who received”) is not included in the original TIDieR checklist.
Appendix C
Modified CONSORT Flow Diagram of VAULT Participants.

Note. The template for CONSORT flow diagram was obtained from http://www.consort-statement.org/consort-statement/flow-diagram.
Appendix D
Rules for Acceptable Forms of Target Words and Utterance Length
Acceptable Forms of Target Words
Most target word usage was straightforward: Clinicians used a given target in an utterance according to the assigned condition length. Target words could be conjugated (e.g., “walk,” “walks,” and “walking” all counted for target “walk”). However, exceptional cases arose, so we developed the following rules to train scorekeepers in fidelity tracking. Note that clinicians were trained to deliver the target words without the following changes.
Irregular forms of a target (e.g., irregular past tense verbs or irregular plurals) that changed the target's root did not count (e.g., “caught” did not count for target “catch” and “feet” did not count for target “foot”).
Modifications that changed word class (e.g., the noun “walker” for target verb “walk” or the verb “block” for the target noun “block”) counted as doses but were discouraged.
Targets occurring within compound words counted if the entire target was said (e.g., “bathroom” counted for target “bath,” but “bath” did not count for target “bathroom”). However, compounds were discouraged because they referred to different concepts.
If the clinician said most of the target but stopped (e.g., “cli–” for “climb”), it counted as a dose.
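The rules above can be approximated in code. The following is a minimal sketch, not the scoring procedure used in the study: the function name `counts_as_dose` and the `min_fraction` threshold for "most of the target" are illustrative assumptions, and scorekeepers in the study applied these rules by ear rather than by string matching. Simple substring matching will also overcount unrelated words that happen to contain the target and will miss inflections that alter spelling (e.g., "making" for "make").

```python
def counts_as_dose(spoken: str, target: str, min_fraction: float = 0.5) -> bool:
    """Rough check of whether a spoken form counts as a dose of the target.

    min_fraction is an assumed stand-in for "most of the target";
    the source does not define a numeric cutoff.
    """
    spoken = spoken.lower().strip()
    target = target.lower().strip()

    # A trailing hyphen or en dash marks a production that was cut short ("cli-").
    truncated = spoken.rstrip("-\u2013")
    if truncated != spoken:
        return (
            target.startswith(truncated)
            and len(truncated) > min_fraction * len(target)
        )

    # Regular inflections ("walks"), word-class changes ("walker"), and
    # compounds ("bathroom" for "bath") all contain the whole target.
    # Irregular forms that change the root ("caught" for "catch") do not.
    return target in spoken


print(counts_as_dose("walking", "walk"))   # True
print(counts_as_dose("caught", "catch"))   # False
print(counts_as_dose("bath", "bathroom"))  # False
```

The sketch returns only whether a form counts. It does not flag the forms that counted but were discouraged (word-class changes and compounds).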
Utterance Length
- In general, we tallied words using what we called the “spacebar rule.” That is, if typing would require a space, it counted as two words. If not, it counted as one word.
  - For example, concatenatives (e.g., “gonna,” “hafta,” “wanna,” and “gotta”) counted as one word, but clearly enunciating “want to” counted as two words.
  - Similarly, contractions (e.g., “she'll,” “can't,” or “he's”) counted as one word.
  - However, holophrases (e.g., “hot dog,” “ice cream,” “Mr. Potato-Head,” or “Mickey Mouse”) counted as one word because they act as a unit. That is, a child does not necessarily associate “ice” with “ice cream” or “dog” with “hot dog.”
- If the clinician paused significantly between words that could otherwise go together, we counted two separate utterances. Pause significance was based on scorekeeper impressions of pause duration. Examples with target “pizza”:
  - In “I can't wait for pizza,” “pizza” occurs in a five-word (extended) utterance.
  - In “I can't wait…for pizza,” “pizza” occurs in a two-word (brief) utterance.
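The counting rules above can be sketched in code as follows. This is an illustration under stated assumptions rather than the study's fidelity-tracking tool: the function names, the `HOLOPHRASES` tuple (limited to the four examples given above), and the use of an ellipsis character to mark a significant pause in a transcript are all hypothetical. In the study, pauses were judged by scorekeeper impression.

```python
# Holophrases named in the text; a real list would need to be longer (assumption).
HOLOPHRASES = ("hot dog", "ice cream", "mr. potato-head", "mickey mouse")
BRIEF_MAX_WORDS = 4  # brief = four words or fewer; extended = five or more


def count_words(utterance, holophrases=HOLOPHRASES):
    """Apply the "spacebar rule": one word per space-separated token,
    after joining each holophrase into a single token."""
    text = utterance.lower()
    for phrase in holophrases:
        text = text.replace(phrase, phrase.replace(" ", "_"))
    return len(text.split())


def split_on_pauses(transcript, pause_marker="\u2026"):
    """Split a transcript into utterances at each marked significant pause."""
    return [part.strip() for part in transcript.split(pause_marker) if part.strip()]


def classify_target_utterance(transcript, target):
    """Return (word_count, condition) for the utterance containing the target,
    or None if the target does not occur."""
    for utterance in split_on_pauses(transcript):
        if target.lower() in utterance.lower():
            n_words = count_words(utterance)
            condition = "brief" if n_words <= BRIEF_MAX_WORDS else "extended"
            return (n_words, condition)
    return None


print(classify_target_utterance("I can't wait for pizza", "pizza"))
# (5, 'extended')
print(classify_target_utterance("I can't wait\u2026for pizza", "pizza"))
# (2, 'brief')
```

Contractions and concatenatives need no special handling here because they contain no space, which is the point of the spacebar rule.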
Footnotes
The grant that funded this work was received prior to the current clinical trials mechanism. As such, it was not preregistered.
One participant received outside speech therapy during the delay period, before the start of treatment.
Although the MCDI only has norms through 30 months of age, we obtained MCDIs for participants of all ages in order to track words each child produced and widen the range of possible target and control words.
Our goal was to establish a delay period for all participants. However, we were occasionally unable to do so due to a family's scheduling requirements or our need to match subjects by sex and age.
We ran the analysis without the three participants who completed between 9 and 13 treatment sessions, but the results did not change. We still found moderate evidence (BFINC = 3.063) in favor of a larger target word effect size (M = 0.977, SD = 1.31) than a control word effect size (M = 0.539, SD = 1.16).
Posttreatment follow-up data for the other two participants with a delay period were not comparable: Due to COVID-19, both participants received at least two telepractice sessions with a modified protocol between the last date of the in-person treatment and the follow-up.
References
- Adams, E. J., Nguyen, A. T., & Cowan, N. (2018). Theories of working memory: Differences in definition, degree of modularity, role of attention, and purpose. Language, Speech, and Hearing Services in Schools, 49(3), 340–355. https://doi.org/10.1044/2018_LSHSS-17-0114
- Alt, M. (2011). Phonological working memory impairments in children with specific language impairment: Where does the problem lie? Journal of Communication Disorders, 44(2), 173–185. https://doi.org/10.1016/j.jcomdis.2010.09.003
- Alt, M., Mettler, H. M., Erikson, J. A., Figueroa, C. R., Etters-Thomas, S. E., Arizmendi, G. D., & Oglivie, T. (2020). Exploring input parameters in an expressive vocabulary treatment with late talkers. Journal of Speech, Language, and Hearing Research, 63(1), 216–233. https://doi.org/10.1044/2019_JSLHR-19-00219
- Alt, M., Meyers, C., Oglivie, T., Nicholas, K., & Arizmendi, G. (2014). Cross-situational statistically based word learning intervention for late-talking toddlers. Journal of Communication Disorders, 52, 207–220. https://doi.org/10.1016/j.jcomdis.2014.07.002
- Alt, M., & Suddarth, R. (2012). Learning novel words: Detail and vulnerability of initial representations for children with specific language impairment and typically developing peers. Journal of Communication Disorders, 45(2), 84–97. https://doi.org/10.1016/j.jcomdis.2011.12.003
- American Speech-Language-Hearing Association. (n.d.). Late language emergence. Retrieved July 1, 2020, from https://www.asha.org/Practice-Portal/Clinical-Topics/Late-Language-Emergence/
- American Speech-Language-Hearing Association. (2008). Roles and responsibilities of speech-language pathologists in early intervention: Technical report [Technical report]. https://doi.org/10.1044/policy.TR2008-00290
- Arnon, I., & Clark, E. V. (2011). Why brush your teeth is better than teeth—Children's word production is facilitated in familiar sentence-frames. Language Learning and Development, 7(2), 107–129. https://doi.org/10.1080/15475441.2010.505489
- Baddeley, A., Gathercole, S., & Papagno, C. (1998). The phonological loop as a language learning device. Psychological Review, 105(1), 158–173. https://doi.org/10.1037/0033-295X.105.1.158
- Baker, C. E., Vernon-Feagans, L., & The Family Life Project Investigators. (2015). Fathers' language input during shared book activities: Links to children's kindergarten achievement. Journal of Applied Developmental Psychology, 36, 53–59. https://doi.org/10.1016/j.appdev.2014.11.009
- Bayley, N. (2006). Bayley Scales of Infant and Toddler Development–Third Edition (Bayley-III). The Psychological Corporation.
- Beeson, P. M., & Robey, R. R. (2006). Evaluating single-subject treatment research: Lessons learned from the aphasia literature. Neuropsychology Review, 16(4), 161–169. https://doi.org/10.1007/s11065-006-9013-7
- Bloom, P., & Kelemen, D. (1995). Syntactic cues in the acquisition of collective nouns. Cognition, 56(1), 1–30. https://doi.org/10.1016/0010-0277(94)00648-5
- Brady, K. W., & Goodman, J. C. (2014). The type, but not the amount, of information available influences toddlers' fast mapping and retention of new words. American Journal of Speech-Language Pathology, 23(2), 120–133. https://doi.org/10.1044/2013_AJSLP-13-0013
- Brent, M. R., & Siskind, J. M. (2001). The role of exposure to isolated words in early vocabulary development. Cognition, 81(2), B33–B44. https://doi.org/10.1016/S0010-0277(01)00122-6
- Cable, A. L., & Domsch, C. (2011). Systematic review of the literature on the treatment of children with late language emergence. International Journal of Language & Communication Disorders, 46(2), 138–154. https://doi.org/10.3109/13682822.2010.487883
- Capone Singleton, N. (2018). Late talkers: Why the wait-and-see approach is outdated. Pediatric Clinics of North America, 65(1), 13–29. https://doi.org/10.1016/j.pcl.2017.08.018
- Dale, P. (2007). MacArthur–Bates Communicative Development Inventory-III. Brookes.
- Fenson, L., Marchman, V. A., Thal, D. J., Dale, P. S., Reznick, J. S., & Bates, E. (2007). MacArthur–Bates Communicative Development Inventories–Second Edition. Brookes.
- Ferguson, B., Graf, E., & Waxman, S. R. (2018). When veps cry: Two-year-olds efficiently learn novel words from linguistic contexts alone. Language Learning and Development, 14(1), 1–12. https://doi.org/10.1080/15475441.2017.1311260
- Fisher, E. L. (2017). A systematic review and meta-analysis of predictors of expressive-language outcomes among late talkers. Journal of Speech, Language, and Hearing Research, 60(10), 2935–2948. https://doi.org/10.1044/2017_JSLHR-L-16-0310
- Frank, M. C., Braginsky, M., Yurovsky, D., & Marchman, V. A. (2016). Wordbank: An open repository for developmental vocabulary data. Journal of Child Language, 44(3), 677–694. https://doi.org/10.1017/S0305000916000209
- Gathercole, S. E. (2006). Nonword repetition and word learning: The nature of the relationship. Applied Psycholinguistics, 27(4), 513–543. https://doi.org/10.1017/S0142716406060383
- Gelman, S. A., & Raman, L. (2003). Preschool children use linguistic form class and pragmatic cues to interpret generics. Child Development, 74(1), 308–325. https://doi.org/10.1111/1467-8624.00537
- Graf Estes, K., Evans, J. L., & Else-Quest, N. M. (2007). Differences in the nonword repetition performance of children with and without specific language impairment: A meta-analysis. Journal of Speech, Language, and Hearing Research, 50(1), 177–195. https://doi.org/10.1044/1092-4388(2007/015)
- Grant, S., Mayo-Wilson, E., Montgomery, P., Macdonald, G., Michie, S., Hopewell, S., & Moher, D. (2018). CONSORT-SPI 2018 explanation and elaboration: Guidance for reporting social and psychological intervention trials. Trials, 19, Article 406. https://doi.org/10.1186/s13063-018-2735-z
- Hammer, C. S., Morgan, P., Farkas, G., Hillemeier, M., Bitetti, D., & Maczuga, S. (2017). Late talkers: A population-based study of risk factors and school readiness consequences. Journal of Speech, Language, and Hearing Research, 60(3), 607–626. https://doi.org/10.1044/2016_JSLHR-L-15-0417
- Head Zauche, L., Thul, T. A., Darcy Mahoney, A. E., & Stapel-Wax, J. L. (2016). Influence of language nutrition on children's language and cognitive development: An integrated review. Early Childhood Research Quarterly, 36, 318–333. https://doi.org/10.1016/j.ecresq.2016.01.015
- Hoff, E. (2003). The specificity of environmental influence: Socioeconomic status affects early vocabulary development via maternal speech. Child Development, 74(5), 1368–1378. https://doi.org/10.1111/1467-8624.00612
- Hoff, E., & Naigles, L. (2002). How children use input to acquire a lexicon. Child Development, 73(2), 418–433. https://doi.org/10.1111/1467-8624.00415
- Hoffmann, T. C., Glasziou, P. P., Boutron, I., Milne, R., Perera, R., Moher, D., Altman, D. G., Barbour, V., Macdonald, H., Johnston, M., Lamb, S. E., Dixon-Woods, M., McCulloch, P., Wyatt, J. C., Chan, A.-W., & Michie, S. (2014). Better reporting of interventions: Template for Intervention Description and Replication (TIDieR) checklist and guide. BMJ, 348, Article g1687. https://doi.org/10.1136/bmj.g1687
- Kan, P. F., & Windsor, J. (2010). Word learning in children with primary language impairment: A meta-analysis. Journal of Speech, Language, and Hearing Research, 53(3), 739–756. https://doi.org/10.1044/1092-4388(2009/08-0248)
- Kapa, L. L., & Erikson, J. A. (2019). Variability of executive function performance in preschoolers with developmental language disorder. Seminars in Speech and Language, 40(4), 243–255. https://doi.org/10.1055/s-0039-1692723
- Kouider, S., Halberda, J., Wood, J., & Carey, S. (2006). Acquisition of English number marking: The singular–plural distinction. Language Learning and Development, 2(1), 1–25. https://doi.org/10.1207/s15473341lld0201_1
- Kruschke, J. K. (2013). Bayesian estimation supersedes the t test. Journal of Experimental Psychology: General, 142(2), 573–603. https://doi.org/10.1037/a0029146
- Manning, B. L., Roberts, M. Y., Estabrook, R., Petitclerc, A., Burns, J. L., Briggs-Gowan, M., Wakschlag, L. S., & Norton, E. S. (2019). Relations between toddler expressive language and temper tantrums in a community sample. Journal of Applied Developmental Psychology, 65, Article 101070. https://doi.org/10.1016/j.appdev.2019.101070
- Marini, A. , Ruffino, M. , Sali, M. E. , & Massimo, M. (2017). The role of phonological working memory and environmental factors in lexical development in Italian-speaking late talkers: A one-year follow-up study. Journal of Speech, Language, and Hearing Research, 60(12), 3462–3473. https://doi.org/10.1044/2017_JSLHR-L-15-0415 [DOI] [PubMed] [Google Scholar]
- Montgomery, J. W. , Magimairaj, B. M. , & Finney, M. C. (2010). Working memory and specific language impairment: An update on the relation and perspectives on assessment and treatment. American Journal of Speech-Language Pathology, 19(1), 78–94. https://doi.org/10.1044/1058-0360(2009/09-0028) [DOI] [PubMed] [Google Scholar]
- Morgan, J. (2014). Classification and Regression Tree analysis (Technical Report No. 1) . Boston University School of Public Health. https://www.bu.edu/sph/files/2014/05/MorganCART.pdf
- Naigles, L. (1990). Children use syntax to learn verb meanings. Journal of Child Language, 17(2), 357–374. https://doi.org/10.1017/S0305000900013817 [DOI] [PubMed] [Google Scholar]
- Navarro, I. I. , Cretcher, S. R. , McCarron, A. R. , Figueroa, C. , & Alt, M. (2020). Using AAC to unlock communicative potential in late-talking toddlers. Journal of Communication Disorders, 87, Article 106025. https://doi.org/10.1016/j.jcomdis.2020.106025 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Newbury, J. , Klee, T. , Stokes, S. F. , & Moran, C. (2015). Exploring expressive vocabulary variability in two-year-olds: The role of working memory. Journal of Speech, Language, and Hearing Research, 58(6), 1761–1772. https://doi.org/10.1044/2015_JSLHR-L-15-0018 [DOI] [PubMed] [Google Scholar]
- Paquette-Smith, M. , & Johnson, E. K. (2016). Toddlers' use of grammatical and social cues to learn novel words. Language Learning and Development, 12(3), 328–337. https://doi.org/10.1080/15475441.2015.1112801 [Google Scholar]
- Parker, R. I. , Vannest, K. J. , Davis, J. L. , & Sauber, S. B. (2011). Combining nonoverlap and trend for single-case research: Tau-U. Behavior Therapy, 42(2), 284–299. https://doi.org/10.1016/j.beth.2010.08.006 [DOI] [PubMed] [Google Scholar]
- Perry, L. K. , Samuelson, L. K. , Malloy, L. M. , & Schiffer, R. N. (2010). Learn locally, think globally: Exemplar variability supports higher-order generalization and word learning. Psychological Science, 21(12), 1894–1902. https://doi.org/10.1177/0956797610389189 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Peters-Sanders, L. A. , Kelley, E. S. , Biel, C. H. , Madsen, K. , Soto, X. , Seven, Y. , Hull, K. , & Goldstein, H. (2020). Moving forward four words at a time: Effects of a supplemental preschool vocabulary intervention. Language, Speech, and Hearing Services in Schools, 51(1), 165–175. https://doi.org/10.1044/2019_LSHSS-19-00029 [DOI] [PubMed] [Google Scholar]
- Plante, E. , & Gómez, R. L. (2018). Learning without trying: The clinical relevance of statistical learning. Language, Speech, and Hearing Services in Schools, 49(3S), 710–722. https://doi.org/10.1044/2018_LSHSS-STLT1-17-0131 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Plante, E. , Ogilvie, T. , Vance, R. , Aguilar, J. M. , Dailey, N. S. , Meyers, C. , Lieser, A. M. , & Burton, R. (2014). Variability in the language input to children enhances learning in a treatment context. American Journal of Speech-Language Pathology, 23(4), 530–545. https://doi.org/10.1044/2014_AJSLP-13-0038 [DOI] [PubMed] [Google Scholar]
- Plante, E. , & Vance, R. (1994). Selection of preschool language tests: A data-based approach. Language, Speech, and Hearing Services in Schools, 25(1), 15–24. https://doi.org/10.1044/0161-1461.2501.15 [Google Scholar]
- Random.org. (n.d.). https://www.random.org
- Rescorla, L. (1989). The Language Development Survey: A screening tool for delayed language in toddlers. Journal of Speech and Hearing Disorders, 54(4), 587–599. https://doi.org/10.1044/jshd.5404.587 [DOI] [PubMed] [Google Scholar]
- Rescorla, L. (2005). Age 13 language and reading outcomes in late-talking toddlers. Journal of Speech, Language, and Hearing Research, 48(2), 459–472. https://doi.org/10.1044/1092-4388(2005/031) [DOI] [PubMed] [Google Scholar]
- Rescorla, L. (2009). Age 17 language and reading outcomes in late-talking toddlers: Support for a dimensional perspective on language delay. Journal of Speech, Language, and Hearing Research, 52(1), 16–30. https://doi.org/10.1044/1092-4388(2008/07-0171)
- Rescorla, L. (2011). Late talkers: Do good predictors of outcome exist? Developmental Disabilities Research Reviews, 17(2), 141–150. https://doi.org/10.1002/ddrr.1108
- Rice, M. L., Cleave, P. L., & Oetting, J. B. (2000). The use of syntactic cues in lexical acquisition by children with SLI. Journal of Speech, Language, and Hearing Research, 43(3), 582–594. https://doi.org/10.1044/jslhr.4303.582
- RStudio Team. (2020). RStudio: Integrated development for R [Computer software]. RStudio.
- Schachinger-Lorentzon, U., Kadesjö, B., Gillberg, C., & Miniscalco, C. (2018). Children screening positive for language delay at 2.5 years: Language disorder and developmental profiles. Neuropsychiatric Disease and Treatment, 14, 3267–3277. https://doi.org/10.2147/NDT.S179055
- Single Case Research. (n.d.). Tau-U calculator. Retrieved July 1, 2020, from http://www.singlecaseresearch.org/calculators/tau-u
- Spaulding, T. J., Plante, E., & Farinella, K. A. (2006). Eligibility criteria for language impairment: Is the low end of normal always appropriate? Language, Speech, and Hearing Services in Schools, 37(1), 61–72. https://doi.org/10.1044/0161-1461(2006/007)
- Stokes, S. F., & Klee, T. (2009a). The diagnostic accuracy of a new Test of Early Nonword Repetition differentiating late talking and typically developing children. Journal of Speech, Language, and Hearing Research, 52(4), 872–882. https://doi.org/10.1044/1092-4388(2009/08-0030)
- Stokes, S. F., & Klee, T. (2009b). Factors that influence vocabulary development in two-year-old children. The Journal of Child Psychology and Psychiatry, 50(4), 498–505. https://doi.org/10.1111/j.1469-7610.2008.01991.x
- Stokes, S. F., Klee, T., Kornisch, M., & Furlong, L. (2017). Visuospatial and verbal short-term memory correlates of vocabulary ability in preschool children. Journal of Speech, Language, and Hearing Research, 60(8), 2249–2258. https://doi.org/10.1044/2017_JSLHR-L-16-0285
- Torkildsen, J. V. K., Dailey, N. S., Aguilar, J. M., Gómez, R., & Plante, E. (2013). Exemplar variability facilitates rapid learning of an otherwise unlearnable grammar by individuals with language-based learning disability. Journal of Speech, Language, and Hearing Research, 56(2), 618–629. https://doi.org/10.1044/1092-4388(2012/11-0125)
- Wagenmakers, E.-J., Love, J., Marsman, M., Jamil, T., Ly, A., Verhagen, J., Selker, R., Gronau, Q. F., Dropmann, D., Boutin, B., Meerhoff, F., Knight, P., Raj, A., van Kesteren, E.-J., van Doorn, J., Šmíra, M., Epskamp, S., Etz, A., Matzke, D., … Morey, R. D. (2018). Bayesian inference for psychology. Part II: Example applications with JASP. Psychonomic Bulletin & Review, 25, 58–76. https://doi.org/10.3758/s13423-017-1323-7
- Warren, S. F., Fey, M. E., & Yoder, P. J. (2007). Differential treatment intensity research: A missing link to creating optimally effective communication interventions. Mental Retardation and Developmental Disabilities Research Reviews, 13(1), 70–77. https://doi.org/10.1002/mrdd.20139
- Wolfe, D. L., & Heilmann, J. (2010). Simplified and expanded input in a focused stimulation program for a child with expressive language delay (ELD). Child Language Teaching and Therapy, 26(3), 335–346. https://doi.org/10.1177/0265659010369286