Journal of Speech, Language, and Hearing Research: JSLHR

Letter. 2022 Feb 22;65(3):1183–1185. doi: 10.1044/2022_JSLHR-22-00019

Dynamic Norming and Open Science

Brian MacWhinney a, Nan Bernstein Ratner b
PMCID: PMC9150751  PMID: 35192372

Abstract

In a recent issue of JSLHR, Tucci et al. (2022) presented a method for assigning standard error of the mean (SEM) scores to a language sample. However, this method is based on data that are not publicly available and uses a commercial analysis program that is not open source. The TalkBank system and the Child Language Data Exchange System database provide free analysis software based on openly accessible data, thereby adhering to Open Science standards, which represent an important next step for the fields of speech and hearing.


Tucci et al. (2022) show how calculation of the standard error of the mean (SEM) can allow clinicians to align language sample measures derived from a target client with those from a larger age-matched comparison set. We agree that the ability to perform dynamic score analysis provides an important component in the clinician's arsenal of tools for assessment of children's language development. We also agree that SEM values provide one good way of making these assessments. Despite our agreement regarding the value of this approach, we are concerned about several aspects of the report, data, and analysis in Tucci et al.

JSLHR has recently announced a call for papers on “promoting reproducibility for the speech, language, and hearing sciences.” However, this laudable effort is not commensurate with the journal's recent publication of a set of analyses by Tucci et al. (2022) based on a privately held dataset that is tightly linked to a specific commercial product. Open Science practices such as the FAIR (Findability, Accessibility, Interoperability, and Reusability) standards (Wilkinson et al., 2016) require that data be publicly available for further analysis and replication. This means that the software for conducting replicable analyses must also be open source and freely available. Unfortunately, the data on which Tucci et al. base their analysis are not openly available. Moreover, to compare a new transcript with this unshared comparison database, one must purchase a proprietary piece of software that is not open source.

Despite statements in Tucci et al. (2022) to the contrary, there is an alternative method for conducting dynamic norming that is in full accord with Open Science standards. This method uses the open data in the Child Language Data Exchange System (CHILDES) and the freely available open-source Computerized Language ANalysis (CLAN) programs developed in the context of the TalkBank project (MacWhinney, 2000, 2019). In 2015, TalkBank configured the CHILDES database to permit dynamic score assessment of a target language sample through the KIDEVAL program (Bernstein Ratner & MacWhinney, 2016; Overton et al., 2021). Use of this program is documented in Section 8.8 of the CLAN manual, which is freely downloadable from https://talkbank.org/manuals/CLAN.pdf. For English, the program allows users to compare a single target transcript or a collection of targets with over 2,000 comparison files from the larger English CHILDES database. The comparison is based on precompiled values for transcripts in 6-month groupings with far more than 35 samples in each of the twelve 6-month groups. The comparison can be further filtered for sample size in terms of utterances, gender (male, female), activity type (narrative, interview, free play), design (cross-sectional, longitudinal), clinical status (typically developing, atypical), and comparison with an alternative age group. Because CHILDES files include automatically computed morphosyntactic analyses, the program can also automatically compute all the mean length of utterance measures used by Systematic Analysis of Language Transcripts (SALT), along with the Developmental Sentence Score (DSS; Lee, 1974), the Index of Productive Syntax (IPSyn; Scarborough, 1990; see also MacWhinney et al., 2020), values on the 14 grammatical morphemes studied by Brown (1973), and several measures of lexical diversity.
In all, KIDEVAL produces outcomes on 41 variables that are output to a .csv file for possible further analysis by Excel and statistical programs. For each of these 41 variables, KIDEVAL includes the standard deviation score with significance levels for the target transcript in relation to the comparison group.
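The dynamic norming described above amounts, at its core, to a standard-score computation: a target child's value on a given measure is compared against the mean and standard deviation of an age-matched comparison group. The sketch below illustrates that underlying statistic only; it is not KIDEVAL's actual implementation, and the function name and sample values are hypothetical.

```python
import statistics

def standard_score(target_value, comparison_values):
    """Compare a target child's measure (e.g., mean length of
    utterance) against an age-matched comparison group.

    Returns the z-score of the target relative to the group and
    the standard error of the mean (SEM) of the comparison group.
    """
    mean = statistics.mean(comparison_values)
    sd = statistics.stdev(comparison_values)  # sample standard deviation
    sem = sd / len(comparison_values) ** 0.5
    z = (target_value - mean) / sd
    return z, sem

# Hypothetical MLU values from a 6-month comparison grouping
group_mlu = [3.1, 2.8, 3.4, 3.0, 2.6, 3.3, 2.9, 3.2]

# A target transcript with MLU well below the group mean yields a
# strongly negative z-score, flagging it for clinical attention.
z, sem = standard_score(2.1, group_mlu)
```

In KIDEVAL the comparison means and standard deviations are precompiled from the CHILDES transcripts in each 6-month grouping, so the clinician supplies only the target transcript.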

From the viewpoint of clinical evaluation, there are limitations inherent in the measures tracked in Tucci et al. (2022), as well as similar measures produced by Sampling Utterances and Grammatical Analysis Revised (SUGAR; Pavelko & Owens, 2017). These measures, such as length of utterance and words per minute, evaluate children's language primarily in terms of quantity or volubility. While potentially useful in identifying or diagnosing less talkative children who may have expressive language limitations, such measures are less than ideal in providing clinicians with concrete strategies for furthering children's syntactic or grammatical growth. They provide clinicians with little guidance in constructing language goals other than to “say more” or “make utterances longer.” This point is made in articles by Guo et al. (2018), Pezold et al. (2019), Finestack et al. (2020), and Yang et al. (2022), all of which use CLAN assessment and are published in ASHA Journals.

In contrast to the quantity measures in SALT and SUGAR, assessment through KIDEVAL produces a profile across 41 variables including details on lexicon, morphology, and syntax, which can be further supplemented through automatic running of DSS and IPSyn (Yang et al., 2022). These Open Science tools provide the clinician with information on the specific aspects of language along which the target child diverges significantly from comparison group norms, rather than just measures of output quantity. Moreover, language samples created with CLAN can easily be linked directly to the audio recording at the utterance level, permitting additional analysis for fluency and interactional features, and they can be analyzed for phonological development and disorders by using the fully compatible and freely available Phon program (Rose & MacWhinney, 2014).

Given that KIDEVAL has been openly available since 2015 (Bernstein Ratner & MacWhinney, 2016; Garbarino et al., 2020), it is surprising to find Tucci et al. (2022) describing SALT as the first available utility to perform dynamic norming of children's expressive language skills. KIDEVAL has been openly available to perform this function at no cost for the past 6 years, in contrast to this very recent development for dynamic score assessment based on nonshared data linked to a commercial product. Moreover, TalkBank provides two other programs with structures like KIDEVAL but targeted to other areas relevant to speech and language science. These are the EVAL program (Forbes et al., 2012) for analysis of language in aphasia and the FLUCALC program (Bernstein Ratner & MacWhinney, 2018) for analysis of developmental stuttering.

It is important for the field to be able to evaluate and compare alternative methods for computer-assisted language sample analysis. However, it is equally important that this process be accompanied by open access to data and full sharing of analysis tools and methods, and it is incumbent on our journals to begin to implement this policy.

Author Contributions

Brian MacWhinney: Conceptualization (Equal), Methodology (Equal), Writing – review & editing (Equal). Nan Bernstein Ratner: Conceptualization (Equal), Methodology (Equal), Writing – review & editing (Equal).

Acknowledgments

Development of the CHILDES system and the CLAN program is supported by grant HD082736 from Eunice Kennedy Shriver National Institute of Child Health and Human Development, awarded to Brian MacWhinney.


References

  1. Bernstein Ratner, N., & MacWhinney, B. (2016). Your laptop to the rescue: Using the Child Language Data Exchange System archive and CLAN utilities to improve child language sample analysis. Seminars in Speech and Language, 37(2), 74–84. https://psyling.talkbank.org/years/2016/ssl/nan.pdf
  2. Bernstein Ratner, N., & MacWhinney, B. (2018). Fluency Bank: A new resource for fluency research and practice. Journal of Fluency Disorders, 56, 69–80. https://doi.org/10.1016/j.jfludis.2018.03.002
  3. Brown, R. (1973). A first language: The early stages. Harvard.
  4. Finestack, L. H., Rohwer, B., Hilliard, L., & Abbeduto, L. (2020). Using computerized language analysis to evaluate grammatical skills. Language, Speech, and Hearing Services in Schools, 51(2), 184–204. https://doi.org/10.1044/2019_LSHSS-19-00032
  5. Forbes, M., Fromm, D., & MacWhinney, B. (2012). AphasiaBank: A resource for clinicians. Seminars in Speech and Language, 33(3), 217–222. https://doi.org/10.1055/s-0032-1320041
  6. Garbarino, J., Bernstein Ratner, N., & MacWhinney, B. (2020). Use of computerized language analysis to assess child language. Language, Speech, and Hearing Services in Schools, 51(2), 504–506. https://doi.org/10.1044/2020_LSHSS-19-00118
  7. Guo, L.-Y., Eisenberg, S., Bernstein Ratner, N., & MacWhinney, B. (2018). Is putting SUGAR (Sampling Utterances of Grammatical Analysis Revised) into language sample analysis a good thing? A response to Pavelko and Owens (2017). Language, Speech, and Hearing Services in Schools, 49(3), 622–627. https://doi.org/10.1044/2018_LSHSS-17-0084
  8. Lee, L. (1974). Developmental sentence analysis. Northwestern University Press.
  9. MacWhinney, B. (2000). The CHILDES project: Tools for analyzing talk (3rd ed.). Lawrence Erlbaum Associates.
  10. MacWhinney, B. (2019). Understanding spoken language through TalkBank. Behavior Research Methods, 51(4), 1919–1927. https://doi.org/10.3758/s13428-018-1174-9
  11. MacWhinney, B., Roberts, J. A., Altenberg, E. P., & Hunter, M. (2020). Improving automatic IPSyn coding. Language, Speech, and Hearing Services in Schools, 51(4), 1187–1189. https://doi.org/10.1044/2020_LSHSS-20-00090
  12. Overton, C., Baron, T., Pearson, B. Z., & Ratner, N. B. (2021). Using free computer-assisted language sample analysis to evaluate and set treatment goals for children who speak African American English. Language, Speech, and Hearing Services in Schools, 52(1), 31–50. https://doi.org/10.1044/2020_LSHSS-19-00107
  13. Pavelko, S., & Owens, R. (2017). Sampling Utterances and Grammatical Analysis Revised (SUGAR): New normative values for language sample analysis measures. Language, Speech, and Hearing Services in Schools, 48(3), 197–215. https://doi.org/10.1044/2017_LSHSS-17-0022
  14. Pezold, M. J., Imgrund, C. M., & Storkel, H. L. (2019). Using computer programs for language sample analysis. Language, Speech, and Hearing Services in Schools, 51(1), 103–114. https://doi.org/10.1044/2019_LSHSS-18-0148
  15. Rose, Y., & MacWhinney, B. (2014). The PhonBank Project: Data and software-assisted methods for the study of phonology and phonological development. In Durand J., Gut U., & Kristoffersen G. (Eds.), The Oxford handbook of corpus phonology (pp. 380–401). Oxford University Press. https://doi.org/10.1093/oxfordhb/9780199571932.013.023
  16. Scarborough, H. (1990). Index of Productive Syntax. Applied Psycholinguistics, 11(1), 1–22. https://doi.org/10.1017/S0142716400008262
  17. Tucci, A., Plante, E., Heilmann, J. J., & Miller, J. F. (2022). Dynamic norming for Systematic Analysis of Language Transcripts. Journal of Speech, Language, and Hearing Research, 65(1), 320–333. https://doi.org/10.1044/2021_JSLHR-21-00227
  18. Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J.-W., da Silva Santos, L. B., & Bourne, P. E. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3(1), 1–9.
  19. Yang, J. S., MacWhinney, B., & Ratner, N. B. (2022). The Index of Productive Syntax: Psychometric properties and suggested modifications. American Journal of Speech-Language Pathology, 31(1), 239–256. https://doi.org/10.1044/2021_AJSLP-21-00084

Articles from Journal of Speech, Language, and Hearing Research : JSLHR are provided here courtesy of American Speech-Language-Hearing Association
