PLOS One. 2024 Sep 27;19(9):e0311209. doi: 10.1371/journal.pone.0311209

Creating a diagnostic assessment model for autism spectrum disorder by differentiating lexicogrammatical choices through machine learning

Sumi Kato 1,2,*, Kazuaki Hanawa 3, Manabu Saito 4, Kazuhiko Nakamura 1
Editor: Laura Morett 5
PMCID: PMC11432897  PMID: 39331681

Abstract

This study explores the challenge of differentiating autism spectrum (AS) from non-AS conditions in adolescents and adults, particularly considering the heterogeneity of AS and the limitations of diagnostic tools like the ADOS-2. In response, we advocate a multidimensional approach and highlight lexicogrammatical analysis as a key component to improve diagnostic accuracy. From a corpus of spoken language we developed, interviews and story-recounting texts were extracted for 64 individuals diagnosed with AS and 71 non-AS individuals, all aged 14 and above. Utilizing machine learning techniques, we analyzed the lexicogrammatical choices in both interviews and story-recounting tasks. Our approach led to the formulation of two diagnostic models: the first based on annotated linguistic tags, and the second combining these tags with textual analysis. The combined model demonstrated high diagnostic effectiveness, achieving an accuracy of 80%, precision of 82%, sensitivity of 73%, and specificity of 87%. Notably, our analysis revealed that interview-based texts were more diagnostically effective than story-recounting texts. This underscores the altered social language use in individuals with AS, a crucial aspect in distinguishing AS from non-AS conditions. Our findings demonstrate that lexicogrammatical analysis is a promising addition to traditional AS diagnostic methods. This approach suggests the possibility of using natural language processing to detect distinctive linguistic patterns in AS, aiming to enhance diagnostic accuracy for differentiating AS from non-AS in adolescents and adults.
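The four figures reported above follow from a binary confusion matrix with AS as the positive class. The following is a minimal sketch of how they relate; the class and function names are illustrative, and the counts are hypothetical placeholders rather than the study's results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary AS vs. non-AS classifier (AS is the positive class)."""
    tp: int  # AS individuals classified as AS
    fn: int  # AS individuals classified as non-AS
    tn: int  # non-AS individuals classified as non-AS
    fp: int  # non-AS individuals classified as AS


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return accuracy, precision, sensitivity, and specificity."""
    total = c.tp + c.fn + c.tn + c.fp
    return {
        "accuracy": (c.tp + c.tn) / total,        # all correct / all cases
        "precision": c.tp / (c.tp + c.fp),        # correct among predicted AS
        "sensitivity": c.tp / (c.tp + c.fn),      # correct among true AS
        "specificity": c.tn / (c.tn + c.fp),      # correct among true non-AS
    }


# Hypothetical counts for illustration only; they are NOT the study's data.
example = ConfusionCounts(tp=40, fn=10, tn=45, fp=5)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity are computed within each true group, so they do not depend on the ratio of AS to non-AS participants, whereas accuracy and precision do.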

Introduction

Autism spectrum (AS) is a neurodevelopmental condition characterized by persistent difficulties in social communication and interactions across various situations. Alongside this, individuals with AS exhibit repetitive and restricted patterns of behavior, activities, or interests [1]. The primary symptom revolves around challenges in social communication, primarily manifesting as pragmatic impairment (PI) [2, 3]. PI is characterized by specific difficulties in language comprehension and expression, especially at the pragmatic level, which pertains to the effective use of language in social contexts. This includes challenges in adapting language formality based on the situation, interpreting non-literal language (such as idioms, metaphors, irony, and sarcasm), and understanding the nuances of language that affect interpersonal interactions. It refers to struggles with these pragmatic aspects of language, rather than with the basic structural or grammatical components.

There is a widespread consensus among researchers in the clinical field that PI should be examined comprehensively, incorporating multiple factors like language, nonverbal aspects, and cognition. Previous studies have provided insights into the potential factors contributing to PI, indicating that it may arise from neurological, cognitive, symbolic, and/or sensorimotor dysfunctions [4–7]. Perkins [4] outlines four key domains of pragmatics, namely: (1) Semiotic: Encompasses language aspects (phonology, prosody, morphology, syntax, semantics, and discourse) and nonverbal elements (gestures, gaze, facial expressions, and posture). (2) Cognitive: Involves processes like inference, theory of mind, executive function, memory, along with emotions and attitudes. (3) Motor: Concerns physical aspects of communication (use of the vocal tract, hands, arms, face, eyes, and body). (4) Sensory: Focuses on hearing and vision for understanding and conveying information. Perkins’ classification prioritizes factors contributing to PI, highlighting cognitive dysfunction as the primary cause, with linguistic and sensorimotor factors deemed secondary.

Clinicians have observed individuals with AS who possess reasonably good language skills but struggle with effective communication. This has led them to recognize the vital role that cognitive functions, such as inferential reasoning, executive function, and memory, play in interpersonal interactions. Consequently, the clinical field has argued for a close association between cognition and PI [4]. As a result, neurology-based research has become a major focus of studies of PI [5].

Previous studies of concrete linguistic phenomena in AS from a cognitive perspective have explored single grammatical areas such as modality [8–12], relative clauses [13, 14], and syntax [3, 15–19].

Investigating one such area, modality, Perkins [8, 9] and Nuyts and Roeck [10] conducted story-narrating experiments with AS children and reported limited understanding and use of epistemic modal expressions. Similarly, Kato [11] found that individuals with AS were less likely to utilize certain modal expressions, such as probability expressions (must, will, may, etc.) and evidentiality expressions (seems, looks like, likely, is said, according to, etc.). The study further revealed that the cognitive processes associated with probability and evidentiality are closely linked to the reasoning process. McDonald [20] argued that executive processes are primary among cognitive functions, with similarities between executive function and inference generation, noting that as impairment in the executive system increases, there is a corresponding increase in inferential reasoning difficulties. Autistic children struggle in situations where contextual information is not explicit and where they need to rely on general or social knowledge, as they excel more in deductive reasoning than inductive reasoning [21–23]. Such a reasoning pattern influences how AS individuals interpret and utilize the modal expression must [21]. These two grammatical categories, probability and evidentiality, also point to a broader difficulty in utilizing context to derive meaning.

This ability to infer is related not only to the Executive Function Theory [24–27] but also to other cognitive theories such as the Empathizing-Systemizing Theory [28, 29] and the Weak Central Coherence Theory [30–33]. For example, as previous studies on Central Coherence have shown [34–37], individuals with AS often face challenges not only in comprehending facial expressions and gaze direction in others but also in producing these non-verbal cues themselves. This difficulty arises from an inability to integrate information from various contexts and an impaired ability to prioritize social cues.

These investigations link the observed linguistic phenomena to explanations rooted in cognitive dysfunction. However, a limitation of these studies is that they concentrate solely on specific grammatical aspects, and as a result, the overall picture of the impairment remains uncharted. To gain a comprehensive understanding of PI as a whole, a systematic and comprehensive mapping is required, one that identifies and explores linguistic phenomena and instances of pragmatic disorder across various grammatical domains. However, such a comprehensive and systematic mapping of PI within the domain of linguistics has not been undertaken thus far.

Among the aids available to assist AS diagnosis, the Autism Diagnostic Observation Schedule Second Edition (ADOS-2) and the Autism Diagnostic Interview-Revised (ADI-R) are most commonly used. The ADOS-2 is a semi-structured AS diagnostic assessment aid that focuses on behavioral observations; the ADI-R is a standardized clinical caregiver interview that yields the developmental history and the current characteristics of patient functioning as perceived by the caregiver. The two tools are recommended to be used in combination; this approach has demonstrated the highest diagnostic validity [38].

The former, the ADOS-2, especially, is considered the gold standard diagnostic measurement [39, 40]. However, the results of some previous studies have led to questions regarding its versatility, particularly for adults [41, 42], for two main reasons. First, the ADOS-2 does not clearly differentiate AS from other neurodevelopmental conditions such as attention deficit hyperactivity disorder (ADHD), nor does it distinguish AS from psychiatric conditions like the negative symptoms of schizophrenia [43, 44] or psychiatric comorbidities (e.g., anxiety disorders, mood disorders, and avoidant personality disorders) [45–47]. Additionally, the inherent heterogeneity within AS itself further complicates differential diagnosis. The overlapping symptoms [48, 49] across these conditions make diagnosis challenging. Second, factors such as masking behavior, compensation strategies [50, 51], and learned camouflaging [52] can conceal critical information about impairment, potentially leading to misdiagnosis. Additionally, although not directly related to ADOS-2 measurements, diagnoses of adults are often difficult because developmental reports from parents or caregivers are commonly absent and patient self-insight is unreliable [30, 53]. Consequently, AS diagnosis requires input from multiple perspectives, and this study suggests that language analysis has the potential to serve as a supplementary diagnostic tool.

One effective approach to comprehensively map the PI of AS involves utilizing corpora of spoken language from individuals with AS. Despite limited research in this area, one notable corpus for English is Parish-Morris et al.’s [54], although it is not publicly accessible. Through this corpus, differences in speaking rate and inter-turn gaps between AS and non-AS individuals have been observed. Among the available open corpora, the Nadig AS English Corpus [55] contains transcripts of videotaped free play between AS children and their parents. This corpus provides a raw collection of simple linguistic data with no semantic information annotated. Similarly, the Asymmetries AS Corpus focuses on Dutch-speaking AS and Typically Developed individuals’ spoken language in its raw form [56].

In studies of Japanese-speaking individuals with AS, Sakishita et al. [57] and Kato et al. [58] are notable for utilizing corpora specifically developed for their respective research. Sakishita’s corpus contains 17 types of annotations based on the publicly available Chiba 3 Party [59], with a primary focus on phonetic usage. The analysis involves examining statistics derived from these annotations and morpheme information, as well as investigating their relationship with the ADOS scores. On the other hand, Kato et al. [58] developed a comprehensive annotation scheme for analyzing syntax and lexicogrammar in spoken Japanese by individuals with and without AS. This scheme encompasses 159 annotation items derived from transcripts obtained during ADOS-2 administrations. The corpus in Kato et al. [58] was developed based on the theoretical framework of Systemic Functional Linguistics (SFL), which outlines interconnected layers of language activities as depicted in Fig 1.

Fig 1. Position of cognition in language activities defined by SFL.


(Adapted from Kato et al. [11]) This figure illustrates the SFL hierarchy enhanced by Kato’s cognition layer addition [11]. It demonstrates how culture and situation shape lexicogrammatical choices via Field, Tenor, and Mode, thereby impacting communication. The diagram underscores the importance of cognition in selecting suitable lexicogrammar for successful social interactions.

The second stratum in Fig 1 encompasses culture, representing the collective social values and ideologies of a society associated with a specific language. The third stratum is defined by situation, encompassing register components that influence subordinate strata like lexicogrammar by shaping the lexicogrammatical decisions individuals make for meaningful communication. Register is composed of three elements: Field, Tenor, and Mode. Field addresses the questions "what is happening?" and "what is the topic?", covering ideational meanings. Tenor deals with the social and contextual roles of participants, representing interpersonal meanings, while Mode pertains to the communication channels during interactions, capturing textual meanings [60]; these elements are found in the fourth stratum. Interpersonal interactions are rooted in meaning choices, which are restricted to two specific contexts: culture in the second stratum and situation in the third. Ranking below, discourse semantics is expressed through lexicogrammar, with lexicogrammar subsequently articulated by phonology/graphology. A comprehensive grasp of both cultural and situational contexts empowers speakers to opt for fitting lexicogrammatical selections; in the absence of such understanding, they might choose inappropriately, leading to PI [58]. Variances in lexicogrammatical choices do not always equate to PIs, with certain overlaps and disparities present. At the heart of PIs are socially unfitting lexicogrammatical selections.

SFL theory positions language purely as a social construct, sidestepping the cognitive aspect of meaning creation. Contrasting with SFL’s stance, Kato [11] posited that cognition should underpin social context, and consequently incorporated an additional layer dedicated to cognition at the outermost tier. A speaker’s proficiency in selecting socially suitable lexicogrammar hinges on their cognitive performance at this top layer. For instance, those with AS, stemming from cognitive anomalies like impaired executive function [61–64] or weak central coherence [32, 65–68], fail to accurately discern culture or situation. As a result, they resort to unsuitable lexicogrammatical selections [11].

In Kato et al.’s study, the corpus was specifically constructed to focus on the lexicogrammatical layer, the fifth layer. It is crucial within the framework of SFL to accord special attention to ’lexicogrammar’ due to its distinctive role in integrating vocabulary and grammar. Lexicogrammar is a term used in SFL to describe the interdependence of vocabulary and grammar. In SFL, grammar and vocabulary are not different strata, but are the two poles of a single continuum, properly called lexicogrammar. In other words, lexicogrammar is the grammar of the lexicon, where lexis (vocabulary) and grammar (syntax) combine into one. It is a level of linguistic structure where, in SFL, the grammar and the meaning of words are not separate systems, but are interdependent. Notably, there is no prior study that has presented a corpus with comprehensive annotation of the lexicogrammar of AS individuals’ spoken language. Kato et al.’s corpus [58] annotates syntax and lexicogrammar because PI in AS often manifests in skewed lexical choices, which are identified as lexical anomalies [69, 70]. Among the various challenges related to semantic choice in AS, lexical processing problems are the most frequently cited examples [69, 70].

Utilizing the aforementioned corpus, Kato et al. [58, 71] noted that Japanese individuals with AS showed a reduced tendency to use sentence-final particles (SFPs), specifically ne and yo, which facilitate calls-for-attention in interpersonal interactions. These particles, ne and yo, are seen as verbal indicators of joint attention. This observation points to potential issues with joint attention and weak central coherence in individuals with AS. Building on these observations, it is noteworthy that children with typical development frequently use SFPs such as ne and yo by the ages of 1.5–2 years [72–74]. In contrast, studies indicate a marked reduction in the use of these SFPs among Japanese children with AS [75, 76]. Watamaki [76] associates the SFP ne with the development of empathy, proposing that its limited use in AS children might reflect social impairments in their language, a viewpoint supported by subsequent studies [77–79]. Furthermore, individuals with AS less frequently utilized evaluative lexis, which indicates dysfunctional joint attention and weak central coherence [80]. Thus, the results of previous studies suggested that language use could be diagnostic. However, no study has explored whether such an application is possible with respect to AS.

The objective of our research project is to develop a diagnostic tool for AS that utilizes natural language processing (NLP) technologies. This study marks the initial phase in proving the feasibility of such an instrument for evaluating lexicogrammatical choices. The hypothesis underlying this research is that the neurocognitive abnormalities associated with AS could be mirrored in language production, thereby producing AS-specific lexicogrammatical patterns that can be used to distinguish AS individuals from non-AS individuals. Such patterns may be observed as variations from what is often considered typical speech in society, potentially suggesting linguistic behaviors that could be associated with neurocognitive differences in individuals with AS. Consequently, we propose that these abnormalities in neurocognitive function would manifest in language use; the lexicogrammatical choices made by those with AS would display distinctive trends that allow for diagnostic differentiation. These linguistic variances form the foundation for our differentiation algorithm.

To facilitate this, we utilized machine learning (ML) to analyze annotated corpus data. Previous studies have employed various machine learning techniques to enhance the diagnostic assessment of Autism Spectrum Disorder (ASD). For example, Schulte-Ruther et al. [81] used random forest models trained on ADOS item scores to predict ASD diagnoses amidst overlapping conditions, achieving high sensitivity and specificity. Similarly, Abbas et al. [82] combined questionnaire data with behavioral tagging from home videos, employing feature selection and engineering to improve early autism detection with increased accuracy. Levy et al. [83] explored sparse models using supervised learning methods on ADOS scores, achieving high ROC curve areas with minimal features, thereby offering a more interpretable and robust diagnostic approach. Duda et al. [84] validated a mobile autism risk assessment tool that demonstrated high sensitivity and specificity in clinical settings. Bone et al. [85] utilized support vector machines to enhance the effectiveness and efficiency of widely used ASD screening tools through ML-based algorithm fusion.

Despite these advancements, there remains a gap in the literature regarding the integration of lexicogrammar analysis within the ML framework for ASD diagnosis. None of the existing studies specifically incorporate the linguistic patterns of individuals with ASD as a criterion for classification. This study aims to bridge that gap by applying ML to the texts and annotations within the corpus to develop models capable of discriminating between ASD and non-ASD individuals, particularly through the analysis of lexicogrammar features.

Methods

Corpus as training database

Choice of corpus individuals

The database subjected to ML was the corpus of spoken language of AS individuals and non-AS individuals developed by Kato et al. [58]. We selected AS (N = 64, M = 18, SD = 3.48) and non-AS individuals (N = 71, M = 19, SD = 2.77) aged 14+ years, primarily between 14 and 20 years, past the critical period for language acquisition. This decision to select participants after the critical period is grounded in Lenneberg’s [86] and Newport’s [87] Critical Period Hypothesis (CPH), particularly referencing the Lenneberg hypothesis regarding the critical period for language acquisition and Newport’s hypothesis on biological constraints. The Lenneberg hypothesis posits that language learning intensifies over a distinct period during childhood and then diminishes with maturity, suggesting that language acquisition mechanisms align functionally with this critical period, typically concluding around puberty (ages 11 to 12). Newport’s ’less capacity leads to more learning’ hypothesis further elaborates that the primary learning mechanism decreases as cognitive processing capacities grow, leading to a competition that can limit further language acquisition.

The CPH remains a topic of ongoing debate, with supporting [88–95], opposing [96–98], and neutral [99–101] views on its validity. The consensus on the CPH does not overwhelmingly favor any single perspective. Acceptance of the hypothesis varies significantly depending on the specific language function being studied (e.g., syntax, phonology, morphology) and the context of language learning, whether it is first-language or second-language acquisition. These differing perspectives are broadly summarized as follows:

  1. Pro-critical period perspective:

    Phonological and syntactic development: Substantial evidence indicates age-related declines in the ability to achieve native-like pronunciation and syntactic proficiency, particularly noted in second-language acquisition.

    First-language acquisition: Research on individuals deprived of early language exposure shows profound deficits, supporting a critical period for language acquisition.

  2. Con-critical period perspective:

    Proficiency in late learners: Some studies demonstrate that late learners can achieve high levels of linguistic competence, challenging the concept of a rigid critical period.

    Neuroplasticity: Neuroscientific studies highlight continued brain plasticity into adulthood, suggesting that language learning remains viable, though perhaps more challenging, beyond early years.

  3. Neutral/Mixed perspective:

    Variable acquisition timelines: Research suggests that optimal periods for language learning exist but are not strict cutoffs; instead, they represent phases where the ease of acquisition decreases.

    Role of external factors: Motivation, exposure, and educational methods significantly impact language learning, often mitigating the disadvantages of starting later.

Given these perspectives, we adopted a pro-CPH stance to minimize variability from ongoing language development. By selecting participants aged 14 and above, we align with studies suggesting that post-puberty, language acquisition stabilizes, providing a consistent baseline for analyzing AS-specific linguistic features. This approach allows us to focus on understanding the consistent underutilization of lexico-grammatical resources by individuals with AS, rather than the variability in general language development.

We recognize that language learning and expression continue to evolve throughout life. However, our study targets specific lexico-grammatical resources potentially underutilized by individuals with AS, which are likely consistent across ages and less influenced by the natural aging process. This approach allows us to focus on analyzing linguistic characteristics associated with neurodevelopmental differences, providing a clearer understanding of AS-specific language use.

Individuals with AS underwent clinical diagnosis using the DSM-5 criteria, carried out by experienced clinicians who specialized in the diagnosis and treatment of neurodevelopmental disorders. The primary assessment tool for confirming the AS diagnosis was ADOS-2. Clinicians used several assessments alongside ADOS-2 for comprehensive evaluation (Table 1), including: (1) SRS-2 (social behavior and competency); (2) Intelligence Tests—WISC-IV for under 16 years, WAIS-III or IV for 16+ (cognitive functioning); (3) Vineland-II (adaptive behavior and functioning); (4) AQ (AS-associated characteristics); (5) PARS-TR (parent interviews for AS behaviors).

Table 1. Overview of demographic and clinical metrics in AS vs. non-AS populations.
Characteristic | AS Group (N = 64) | Non-AS Group (N = 71) | P-Value
Age (years) | 18 ± 3.48 | 19 ± 2.77 | 0.00
Sex (M/F) | 24/40 | 39/32 | 0.06
Education | N/A | College GPA range: 2.4–2.8 | N/A
ADOS-2 Module 3 | 6.93 ± 1.38 | 2.75 ± 2.01 | < 0.01
ADOS-2 Module 4 | 11.42 ± 3.55 | 4.22 ± 2.17 | < 0.01
SRS-2 Total Score | 85.53 ± 9.00 | N/A | N/A
WISC-IV-IQ Full Scale IQ | 81.22 ± 14.42 | N/A | N/A
WAIS-III Full Scale IQ | 91.33 ± 20.12 | N/A | N/A
Vineland-II Composite | 64.83 ± 22.53 | N/A | N/A
AQ | 36.64 ± 8.04 | N/A | N/A
PARS-TR Preschoolers | 13.54 ± 6.27 | N/A | N/A
PARS-TR Adolescents & adults | 24.30 ± 11.49 | N/A | N/A

Measurements of ADOS-2 are based on both observation and interaction; an individual with suspected AS is assessed in terms of reciprocal social interaction, communication, and imagination in a semi-structured setting. Coding of observed behaviors using scoring algorithms yields a diagnostic measure of autism symptoms. The scores are compared with AS cut-off scores. If an individual meets or exceeds the cut-offs for reciprocal social interaction, communication, and restricted and repetitive behaviors, that individual meets the criteria for a diagnosis of AS. ADOS-2 administrations were conducted by an administrator who had established research reliability through Western Psychological Services, with the experience required to use ADOS-2 results in a research setting.

In our study, the AS cohort exhibited some comorbid conditions (Fig 2). It is crucial to note that while AS is the primary diagnosis, the comorbidities are secondary to the core AS condition. The focus of this study is not to distinguish AS without co-occurring conditions from non-AS cases but to explore the identification of AS individuals, regardless of any comorbidities. We acknowledge the extensive research indicating that a substantial proportion of individuals with AS present with comorbidities [102–106]. Given that AS without co-occurring conditions may constitute only a small fraction of the spectrum, any diagnostic algorithm focusing solely on this subgroup would have limited clinical utility if it does not account for the broader AS population, which typically includes various comorbid conditions.

Fig 2. Distribution of AS and co-occurring conditions.


The non-AS group consisted of two different types of participants: (1) Participants who did not meet the criteria for a clinical diagnosis of any psychiatric disorder (N = 17). These participants were not included in the clinical group based on the final diagnosis. The determination of their primary diagnosis was conducted through a comprehensive diagnostic process, consistent with the methodology employed for the AS subject groups. ADOS-2 scores indicated no AS signs: Module 3 average was 3.17 and Module 4 was 4.00 for communication and social interaction, confirming non-spectrum status according to ADOS-2 criteria. For ADOS-2, scores of 6 or below in Module 3 and a combined score of 7 or below in communication and social interaction for Module 4 are considered non-spectrum. This population did not include comorbidities of neurodevelopmental disorders. (2) Participants, primarily college students (N = 54), who were recruited and assessed as non-AS, that is, as not exhibiting AS characteristics according to an ADOS-2 assessment conducted by an administrator with established research reliability. Their ADOS-2 scores confirmed the absence of AS signs: a mean of 2.01 in Module 3 and 4.27 in Module 4 for communication and social interaction total, indicative of non-spectrum status as per the established scoring criteria of the ADOS-2. While IQ testing was not part of our methodology, we selected individuals with a GPA range of 2.4 to 2.8, a range statistically representative of the average for Japanese college students [107–109], who are also recognized for their proficiency in social activities both within and beyond the academic setting. This selection approach, focused on social adaptability and functionality, was key to our study’s aim of examining social capabilities, particularly in contrast to the challenges frequently associated with AS. Additionally, this group included some high school students and adults who were also administered the ADOS-2 and are recognized as proficient in social activities.
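The non-spectrum thresholds quoted above can be written as a simple rule. The sketch below encodes only the two cut-offs stated in this section (6 or below for Module 3; 7 or below for the Module 4 communication and social interaction total). The function and constant names are hypothetical, and this is not the full ADOS-2 scoring algorithm.

```python
# Non-spectrum upper bounds as stated in the text; names are hypothetical.
NON_SPECTRUM_MAX = {
    3: 6,  # Module 3: scores of 6 or below
    4: 7,  # Module 4: communication + social interaction total of 7 or below
}


def is_non_spectrum(module: int, score: float) -> bool:
    """Return True if the score falls in the non-spectrum range for the module."""
    if module not in NON_SPECTRUM_MAX:
        raise ValueError(f"Only Modules 3 and 4 are covered here, got {module}")
    return score <= NON_SPECTRUM_MAX[module]


# Group means reported in this section, used purely as example inputs.
for module, mean_score in [(3, 3.17), (4, 4.00), (3, 2.01), (4, 4.27)]:
    print(module, mean_score, is_non_spectrum(module, mean_score))
```

The rule is meant for individual scores; the group means are passed in here only to show that the reported averages sit below the stated thresholds.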

Ethics statement

This research was conducted from September 2, 2013, to October 5, 2020, in strict adherence to the ethical standards outlined in the Declaration of Helsinki. The study protocol was approved by the Hirosaki University Committee on Medical Ethics under IRB number 2013–142, with subsequent updates leading to the current approval under 2018–168, Previous Number: 2015–055. To safeguard personal data, we followed the committee’s information security guidelines closely. Participants aged 20 and above provided written consent, and for those 19 and under, we obtained written consent from both the participants and their parents or guardians. We used alphanumeric characters to anonymize participants and removed any identifiable utterances from the transcripts to protect their privacy. The recruitment and the retrospective analysis of diagnostic data occurred simultaneously, with the period spanning from September 2, 2013, to October 5, 2020. This dual approach involved both prospective recruitment of participants and the retrospective examination of their diagnostic data, treated with the same ethical rigor and adherence to privacy standards as outlined above.

The texts

The ADOS-2 uses five modules arranged according to language level and participant age. Modules 3 and 4 are used to elicit interview responses and story recounting, primarily assessing adolescents and adults with fluent speech. These modules are designed for verbally fluent individuals, where verbal fluency is defined as language development at or above the level of a typical 4-year-old child’s expressive skills. This includes the ability to produce various sentence types and grammatical forms, describe events beyond the immediate context, and use logical connectors like but or though, although occasional grammatical errors may occur [110]. Module 3 is typically suited for verbally fluent children and adolescents, incorporating tasks like playing with action figure-type toys, seen as age-appropriate for this group. Conversely, Module 4 is tailored for older adolescents and adults and does not include the action figure play task, although the other tasks remain largely the same.

Participants suspected of having AS were audiotaped while performing six to eight tasks in Modules 3 and 4. These audiotapes were then manually transcribed to ensure high accuracy. Transcripts of these tasks were annotated and subsequently stored in our existing corpus. From this corpus, texts corresponding to targeted age groups were selected. We chose two different types of texts from the corpus: interview texts and participant-narrated narrative texts from ’Tuesday’ by Wiesner [111], a wordless picture book:

  1. Interview texts: The interview questions in both Modules are designed to assess participant insights into personal difficulties, sense of responsibility, sense of social situations, and understanding of relationships (e.g., friendship, marriage, and family ties) (S1 File). Some questions explore imaginary-world creations, an objective description of self, and a description of one’s own emotions. According to the protocol, the examiner takes a conversational tone, avoiding a question-and-answer approach, and tries to further develop interaction by commenting on what the participant says (i.e., by showing interest). We used all specified questions in the ADOS-2 manual to ensure consistency in the assessment process.

  2. Story-recounting texts: The story-recounting task assesses the participant’s ability to recount a wordless picture book; the participant also spontaneously describes the supposed emotional states of characters, such as how they are feeling.

    Interviews and story-recounting tasks tap into different cognitive and linguistic processes. Interviews, being dialogic, require immediate interactive communication that deeply engages social and pragmatic skills. In contrast, story-recounting, though monologic, also involves considering the listener’s perspective, making the narrative comprehensible and engaging in a more indirect way. Both types serve to explore social cognition and pragmatic abilities in AS, with interviews demanding direct social interaction and story-recounting engaging social cognition through narrative construction and anticipation of the listener’s needs.

In compiling our corpus, we included both the spoken texts of participants and the interviewer’s questions. However, our analytical focus was exclusively on the participants’ responses. The study examines the lexicogrammatical choices of individuals with AS in reciprocal social interactions, concentrating on their language use rather than the dynamics of interaction. Our approach evaluates patterns of lexicogrammar selection from participants’ responses, comprehensively analyzing their language use throughout the task. We assessed their entire spoken output during the evaluation, without setting specific requirements for text length, speaking duration, or word count.

Annotation scheme of the corpus

When developing the annotation scheme, Kato et al. [58] constructed four system networks (lexicogrammatical option systems from which speakers make choices) using the theoretical framework of SFL. Language choice is a central organizing concept of SFL. Individuals utilize different expressions based on various factors, such as the person they are addressing, the social setting, and other contextual elements. Consequently, when constructing a clause to convey a speaker’s intended meaning, there are multiple options available. The speaker, at the time of utterance, instantaneously makes choices through resource-selection mapping for each component of the clause. SFL defines this resource-selection mapping as the system network, encompassing all potential lexicogrammars that a speaker can select during linguistic interaction. Language, in this framework, is viewed as a meaning-making system where speakers draw upon resources from the system network as they engage in social activities [112]. In essence, the system network represents the vast array of linguistic choices available to speakers, and they actively choose from this network to express their intentions effectively and contextually during communication.

To better illustrate how individuals with AS make lexicogrammatical choices from a range of options within the system network, we examined their responses to the interview question ’How about feeling relaxed or content? What kinds of things make you feel that way?’ This involved analyzing three distinct lexicogrammatical choices, highlighting their decision-making process in language use.

Example 1 (Declarative): I find solace in nature, which helps me relax.

Example 2 (Interrogative with modalization:ability, can): Can spending time in nature help you feel relaxed?

Example 3 (Declarative with modulation:obligation, must): You must spend some time in nature to unwind and find relaxation.

These examples exhibit how participants express similar ideas using different lexicogrammatical choices. To analyze these sentences, we refer to the mood selection network shown in Fig 3, which is an enlargement of the red-circled part of the MOOD system (S1 Fig 1 in S1 File). Delicacy increases from left to right on the mood selection network.

Fig 3. Indicative sentence type of the system network.

This figure presents a close-up of the segment highlighted by the red circle in the MOOD system (S1 Fig 1 in S1 File for context). It illustrates the progression of delicacy in mood selection choices, moving from left to right across the network.

Indicative has three choices, EXPLANATIVE TYPE, INDICATIVE TYPE, and MODAL DEIXIS. Example 1 takes a declarative form from the INDICATIVE TYPE without taking EXPLANATIVE nor MODAL DEIXIS. The participant articulates their feelings through a definitive declarative statement, presenting a clear and unequivocal assertion. This sentence structure leaves little room for interpretation or doubt regarding their experience. The use of straightforward, declarative language conveys a strong, assertive claim about what the speaker intends to say.

On the other hand, in Example 2, the participant navigates the mood selection network from a general interrogative form towards a more specific yes/no question, which seeks to determine the likelihood of relaxation in nature. This progression is illustrated by the use of MODALITY, specifically modalization of ability (can), indicating a move from left to right on the network towards increased delicacy in lexicogrammatical choices. The intention could be seen as seeking information or confirmation from the listener. The participant appears to inquire whether spending time in nature has the potential to induce relaxation, aiming for a direct response. Additionally, this question might also serve a rhetorical or reflective function, potentially implying a broader, commonly held belief that nature inherently provides relaxation, rather than solely seeking the listener’s personal viewpoint.

In Example 3, the speaker employs the MODALITY TYPE alongside a declarative mood, specifically using modulation in the form of obligation (must). This choice underlines the necessity of engaging with nature for relaxation. By integrating modulation, obligation, the speaker asserts the essentiality of this action and aims to convince the listener of its importance for achieving relaxation. This linguistic strategy indicates a deliberate move to higher delicacy in the system network, reflecting the speaker’s persuasive intent.

These examples emphasize the diverse language choices made by participants in response to similar prompts, highlighting the central role of linguistic choice in their communication.

The system network is divided into several parts based on the SFL lexicogrammatical classification. The Japanese system networks that Kato et al. [58] constructed are (i) MOOD (S1 Fig 1 in S1 File), (ii) APPRAISAL (S1 Fig 2 in S1 File), (iii) TRANSITIVITY (S1 Fig 3 in S1 File), and (iv) LOGICAL (S1 Fig 4 in S1 File). Kato et al. [58] charted all possible lexicogrammatical resources within these four systems, creating a network of interconnected options. The annotation scheme was formulated based on these networks. However, annotation does not encompass all the lexicogrammatical resources within the network, but only includes resources located in the green sections as shown in S1 Figs 1–4 in S1 File.

In light of the neurocognitive characteristics often observed in individuals with AS, our objective was to identify lexicogrammatical selections characterized by their differential usage—both those less and more frequently employed by this group compared to non-AS individuals, drawing upon the studies mentioned previously. Each identified lexicogrammar is thought to necessitate certain cognitive abilities for its effective utilization, similar to how joint attention may be required for the use of SFPs as discussed earlier.

Table 2 shows the tag set scheme and the lexicogrammatical functions used by Kato et al. [58] to annotate the scheme constructs. Each of the 15 headings has distinct subcategories; there are 147 different tag types (Table 2).

Table 2. Tag types and linguistic functions.
Lexicogrammar heading | Linguistic function | Tag types | No. of tag types
Ideational metafunction
1. Process type | The mental image of reality is constructed by the TRANSITIVITY (clause component) of a clause. All individuals create a representation of reality. Experiential worlds are defined using 10 types of process verbs, yielding information about how, when speaking, an individual creates a representation of reality. | 1.Material-doing 2.Material-happen 3.Mental-cognition 4.Mental-affect 5.Mental-perception 6.Relational-attribute 7.Relational-identity 8.Behavioral 9.Verbal 10.Existential | 10
2. Ergativity | This measures causation or instigation. In an ergative analysis, the participant that causes an event is the agent. Ergativity reveals whether a speaker interprets events and reality from the causal viewpoint of agency (effective) or becoming (i.e., a perspective lacking agency; a middle). | 1.effective 2.middle | 2
3. Transitivity | A property that yields clues regarding the perspective (active or passive) from which the speaker interprets events and reality. | voice (1.passive/active 2.causative) | 2
4. Clause complexes | The Japanese sentence type (of 22 types) chosen. This reveals syntactic ability and any cognitive tendency or deficiency. | 1.Parallel clause 2.Te-form/Conjunctive clause-parallel/contrast 3.Te-form/Conjunctive clause-forerunner 4.Te-form/Conjunctive clause-sequence of actions 5.Te-form/Conjunctive clause-cause/reason 6.Te-form/Conjunctive clause-adversative connective 7.Te-form/Conjunctive clause-resultative condition 8.Te-form/Conjunctive clause-attendant circumstance 9.Conditional clause-resultative condition 10.Conditional clause-converse condition-converse condition 11.Conditional clause-converse condition-adversative connective 12.Conditional clause-cause/reason 13.Purpose clause 14.Time clause-temporal anteroposterior relation 15.Time clause-simultaneous actions 16.Time clause-others 17.Manner clause 18.Reported clause 19.Interrogative clause 20.Noun clause 21.Adnominal clause 22.Coordinate clause | 22
5. Logico-semantic relation | Logical clause linkages revealing syntactic ability, discourse strategy, and any cognitive tendency or deficiency. | 1.Expansion-elaboration-expository 2.Expansion-elaboration-exemplifying 3.Expansion-elaboration-clarifying 4.Expansion-extension-additive 5.Expansion-extension-alternative 6.Expansion-enhancement-temporal 7.Expansion-enhancement-spatial 8.Expansion-enhancement-manner 9.Expansion-enhancement-cause-conditional 10.Projection-quote 11.Projection-report 12.Projection-idea 13.Projection-embedding | 13
6. Auxiliary verbs | Stative: Verbs describing the state of a subject rather than an action, reflecting the perspective of a speaker on an ongoing phenomenon. Compound: Verbs created by adding one verb to the stem of another; use of these verbs reflects the morphological skill of a speaker. | stative: (19 categories) compound: (1 category) | 20
Interpersonal metafunction
7. Modality | In SFL, modality refers to an area of meaning that lies between yes and no; this constitutes the intermediate space between positive and negative polarity, categorized as either modalization (epistemic modality) or modulation. | 1.Ability 2.Probability 3.Usuality 4.Necessity 5.Obligation 6.Permission 7.Expectation 8.Inclination | 8
8. Appraisal-attitude | The semantic resource used to negotiate emotional reactions, judge behavior, and value things. Attitude is divided into three domains: affect, judgment, and appreciation. Affect is used to interpret emotional responses (including fear, loathing, sadness, and happiness); judgment is used for moral evaluation of behavior (including ethical, brave, and deceptive); and appreciation is used to interpret the esthetic qualities of semiotic phrases/processes and natural phenomena (including remarkable, desirable, elegant, harmonious, and innovative). This lexicogrammar reveals the speaker’s value system. | 1.AFFECT-inclination 2.AFFECT-emotion 3.AFFECT-security 4.AFFECT-satisfaction 5.JUDGEMENT-capacity 6.JUDGEMENT-reliability 7.JUDGEMENT-veracity 8.JUDGEMENT-propriety 9.JUDGEMENT-propensity 10.APPRECIATION-reaction 11.APPRECIATION-composition 12.APPRECIATION-phase-time 13.APPRECIATION-phase-extent 14.APPRECIATION-phase-degree 15.APPRECIATION-phase-space 16.APPRECIATION-phase-distance 17.APPRECIATION-phase-mass 18.APPRECIATION-social evaluation | 18
9. Appraisal-graduation | This is one of the three categories that make up Appraisal, along with Appraisal-attitude; it focuses on gradability (i.e., adjustment of the extent of evaluation). | 1.FORCE-intensification 2.FORCE-quantification 3.FOCUS-sharpening 4.FOCUS-softening | 4
10. Negotiating particle | A lexis that adds various negotiatory values to a clause, implying the attitudinal stance of a speaker toward a proposition or proposal; this lexis is associated with a call for attention and indicates the territory of the information involved. | sentence-ending particles; 1.kana 2.kane 3.sa 4.ne 5.yo 6.yona 7.yone: Particle- 8.kane 9.sa 10.ne 11.yo- at places other than the end of the sentence: other- 12.ne | 12
11. Explanative mood | An optional lexicogrammar often added to other mood types such as declarative and interrogative, implying a variety of meanings. It constitutes a cause, reason, motivation, source, and/or grounds for judgment that suggest a causal relationship between the explained and the explainer. | 1.Explanative mood 2.Explanative mood-ka 3.Explanative mood-kana 4.Explanative mood-kane 5.Explanative mood-kedo 6.Explanative mood-other 7.Explanative mood-na 8.Explanative mood-ne 9.Explanative mood-yo 10.Explanative mood-yone 11.Explanative mood-yona 12.Explanative mood-monoda | 12
12. Evidentiality | This lexicogrammar describes how a speaker judges the validity of a proposition. Three types of evidence are used. Appearance refers to how the information is likely to appear or eventually occur; hearsay refers to how it will be known whether the event occurs; and reasoning refers to the reason the judgment is made or how the event is known to happen. | 1.appearance 2.hearsay 3.reasoning | 3
13. Optative mood | A desire or urge to do something that the speaker considers desirable. | lexis to express desire to do something | 1
14. Auxiliary verbs, benefactive | Verbs used when two parties converse; one party is doing something that benefits the other, and the other party is the recipient of that benefit. Such verbs indicate whether the speaker positions the other party inside or outside. | Benefactive: (10 categories) | 10
15. Onomatopoeia | Imitative and mimetic words used to express manner, quality, or an exclamation. | 1.imitative word 2.imitative mimetic word | 2
16. Filler | A time-filler: a meaningless sound, word, or phrase used in social settings when an individual is aware that a listener is present. | Filler words-1.maa 2.nanka 3.ano 4.unto 5.eeto 6.sono 7.kono 8.kou | 8
Total | | | 147

Diagnostic differentiation by ML

Overview and rationale for machine learning approaches

In this study, we chose to explore both a linear model (logistic regression) and deep neural network (DNN) models to differentiate between AS and non-AS individuals, with a focus on the trade-off between interpretability and performance. Logistic regression was selected due to its simplicity and clarity, allowing us to examine the relationship between specific linguistic tags and the likelihood of AS. This high interpretability is crucial when the goal is to understand the significance of each linguistic feature in the classification process.

Although logistic regression was the sole linear model used, it was chosen deliberately for its well-established effectiveness and simplicity in binary classification tasks, which aligns with our objective of exploring interpretable models. Other linear models, such as linear support vector machines (SVMs), were considered but not included in this study due to their more complex implementation and the specific focus on the interpretability of the relationship between features and the outcome. The choice of logistic regression ensures that our findings remain directly interpretable, which is vital for the analysis of linguistic features.

To complement this, we also employed DNN models, which, while less interpretable, offer the potential for higher accuracy by capturing more complex patterns within the data. We proposed four models: a linear model using only tags, a DNN model using only tags, a DNN model using only text, and a DNN model that incorporates both tags and text.

It is important to clarify that the primary aim of this study was not to determine the best possible model or to establish the upper bounds of classification accuracy. Instead, the chosen models were intended to serve as tools to explore specific research questions related to linguistic feature analysis in the context of AS classification. The focus was on providing insights into the relationships between features and outcomes rather than exhaustively comparing model performance.

Input

Each text uttered during the interview and story-recounting phases served as the input. The raw input carries no annotations and is treated by the machine as a simple sequence of words. Each input was therefore defined as $x = (w_1, w_2, \ldots, w_L)$, where $w_i$ denotes the $i$-th word and $L$ the number of words. The tag-based models require annotated text; manually annotated text is generally preferable owing to its greater accuracy. However, manual annotation is time-intensive, costly, and requires expertise, rendering it impractical in clinical environments. Consequently, we employed automatic annotation. The obtained F1 score, precision, and recall were 0.88, 0.89, and 0.87, respectively. We considered these results reliable enough for distinguishing between groups.
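
As a consistency check on the reported annotation scores, and assuming that the F1 score is the harmonic mean of the reported precision $P$ and recall $R$ (the averaging scheme across tag types is not stated here), the three values agree:

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.89 \times 0.87}{0.89 + 0.87}
    = \frac{1.5486}{1.76}
    \approx 0.88
```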

Output

The output, denoted as y, is classified as either AS or non-AS. This classifier is expressed as y = f(x), and our objective was to identify an f that would predict y with the highest possible accuracy.

Experimental procedure

The data used were sets of triplets $(x, T^{\text{manual}}, y)$, where $T^{\text{manual}}$ denotes the manually annotated tags. The dataset is therefore represented as $D = \{(x_i, T_i^{\text{manual}}, y_i)\}_{i=1}^{N}$. Considering the small size of our dataset, we performed leave-one-out cross validation (LOOCV), in which each sample serves as the test sample once while the remaining samples are used for training. The dataset comprised 64 samples in the AS group and 71 in the non-AS group. For each iteration, the model was trained on $N-1$ samples and tested on the remaining sample, as illustrated in Fig 4. First, we designated the $n$-th sample as the test data $(x_n, T_n^{\text{manual}}, y_n)$ and the remaining data as the training data $D_{\text{train}}$. We then trained the tag annotation model using $D_{\text{train}}$. The trained tag annotation model was used to automatically annotate $x_n$, yielding $T_n^{\text{auto}}$. This step stands in for manual annotation in real cases; we also present the results obtained with the manual annotation $T_n^{\text{manual}}$ for comparison. Next, we trained the classification model using $D_{\text{train}}$. We then classified $(x_n, T_n^{\text{auto}}, y_n)$ and $(x_n, T_n^{\text{manual}}, y_n)$ using the trained model. Lastly, we consolidated the results of all tests ($n = 1, 2, \ldots, N$) and computed the accuracy, precision, recall, and specificity.
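
The procedure above can be sketched as a short loop. This is a minimal illustration and not the study's implementation: `train_tagger` and `train_classifier` are hypothetical callables standing in for the tag annotation model and the classification model, and the toy stand-ins at the bottom exist only so that the sketch runs end to end.

```python
from typing import Callable, List, Sequence, Tuple

# One corpus entry: (words, manual_tags, label). Labels: 1 = AS, 0 = non-AS.
Sample = Tuple[List[str], List[List[str]], int]


def loocv(
    data: Sequence[Sample],
    train_tagger: Callable,      # hypothetical: returns tagger(words) -> tags
    train_classifier: Callable,  # hypothetical: returns clf(words, tags) -> 0/1
) -> List[Tuple[int, int, int]]:
    """Return (gold, pred_with_auto_tags, pred_with_manual_tags) per held-out sample."""
    results = []
    for n in range(len(data)):
        words, manual_tags, label = data[n]
        train = [s for i, s in enumerate(data) if i != n]  # training data without sample n
        tagger = train_tagger(train)       # tag annotation model, training data only
        clf = train_classifier(train)      # classification model, training data only
        auto_tags = tagger(words)          # automatic tags for the held-out text
        results.append((label, clf(words, auto_tags), clf(words, manual_tags)))
    return results


# Toy stand-ins (assumptions for illustration, not the study's models).
def toy_train_tagger(train):
    return lambda words: [["Filler"] if w == "ano" else [] for w in words]


def toy_train_classifier(train):
    # Always predicts the majority label of the training split.
    majority = int(sum(y for _, _, y in train) * 2 >= len(train))
    return lambda words, tags: majority


toy_data = [
    (["ano", "hai"], [["Filler"], []], 1),
    (["hai"], [[]], 0),
    (["ano"], [["Filler"]], 1),
]
folds = loocv(toy_data, toy_train_tagger, toy_train_classifier)
```

Retraining both models inside the loop keeps the held-out sample out of every training step, which is the point of the design: the automatic tags for a test text never come from a tagger that has seen that text.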

Fig 4. Procedure for automatic AS differentiation experiments using leave-one-out cross validation (LOOCV).

AS differentiation model

The study evaluates four models for differentiating AS: a linear model using only tags, a DNN model also relying solely on tags, a DNN model using only text, and a DNN model that incorporates both tags and text.

The study sets out to make two key comparisons. The first comparison evaluates a linear model that utilizes only linguistic tags against a DNN model that also relies solely on tags. This aims to explore the trade-off between interpretability and performance by juxtaposing the linear model’s high interpretability but lower efficacy with the DNN model’s superior performance but reduced clarity. The second comparison assesses the effectiveness of DNN models when inputs are limited to tags versus when both text and tags are incorporated. Given the potential of DNN models to extract comprehensive information from their inputs, this analysis seeks to assess the extent to which tags alone can encapsulate the informative essence of the original sentences for AS classification.

  • (1) AS Differentiation Using Tags

Given that the frequency of annotated tags differs between AS and non-AS individuals (S2 Tables 1 and 2 in S2 File), we hypothesized that tags can support effective AS differentiation. As each clinical input is a sequence of words devoid of tags, automatic annotation is essential. We approached such annotation as a sequence labeling problem, which we solved using a Bidirectional Long Short-Term Memory (Bi-LSTM) network, a type of deep neural network (DNN).

In formal terms, we computed a tag sequence $T^{\text{auto}}$ from the input $x$, drawing on $C$ types of tags. Here, $T^{\text{auto}}$ is an $L \times C$ matrix in which each row represents a word and each column a tag type. Since all texts in the corpus have been manually annotated, we employed these texts for training the differentiation models to ensure accuracy. Although manual annotation is not feasible in clinical settings, we used manually annotated texts to compare the accuracies of two methods, one based on a linear model and the other on a DNN.
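
To make the tag representation concrete, the following minimal sketch builds a binary matrix with one row per word and one column per tag type, the orientation suggested by the stated matrix shape. The three-tag inventory and the example utterance are hypothetical, and the function name `tag_matrix` is introduced here purely for illustration.

```python
from typing import List, Sequence


def tag_matrix(word_tags: Sequence[Sequence[str]], tag_types: Sequence[str]) -> List[List[int]]:
    """Build an L x C binary matrix: entry [i][j] is 1 if word i carries tag j."""
    index = {tag: j for j, tag in enumerate(tag_types)}
    matrix = [[0] * len(tag_types) for _ in word_tags]
    for i, tags in enumerate(word_tags):
        for tag in tags:
            matrix[i][index[tag]] = 1
    return matrix


# Hypothetical three-word utterance with a three-tag inventory (C = 3).
tag_types = ["Middle", "Usuality", "Filler"]
word_tags = [["Middle"], [], ["Usuality", "Filler"]]
T = tag_matrix(word_tags, tag_types)
```

A word may carry several tags or none, which is why each row is a set indicator rather than a single label.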

For the linear model, we employed logistic regression. This method is transparent and hence interpretable; it allowed us to identify the tags that affected the outputs and to quantify these effects. Differentiation by a linear model is generally less accurate than by a DNN-based model [113, 114]. Nevertheless, in the medical context, interpretability is crucial because we are dealing with human lives. It is therefore not possible to judge unconditionally which model is better. In the linear model, only tag frequencies were used as inputs, disregarding the order of tag occurrences. Thus, the input was $x_{\text{tag}} = (t_1, t_2, \ldots, t_C)$, where $t_i$ is the frequency of the $i$-th tag divided by the total number of words.
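
The tag-frequency input and the logistic form can be illustrated as follows. This is a sketch under stated assumptions: the tag inventory, the utterance, and the coefficients are hypothetical, AS is assumed to be the positive class, and fitting the coefficients is omitted.

```python
import math
from collections import Counter
from typing import List, Sequence


def tag_frequencies(word_tags: Sequence[Sequence[str]], tag_types: Sequence[str]) -> List[float]:
    """Tag-frequency vector: count of each tag type divided by the number of words."""
    counts = Counter(tag for tags in word_tags for tag in tags)
    n_words = len(word_tags)
    return [counts[t] / n_words for t in tag_types]


def predict_proba(x_tag: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Logistic regression: sigmoid of a weighted sum of tag frequencies."""
    z = bias + sum(w * x for w, x in zip(weights, x_tag))
    return 1.0 / (1.0 + math.exp(-z))


tag_types = ["Middle", "Usuality", "Filler"]
word_tags = [["Middle"], [], ["Usuality", "Filler"], ["Filler"]]
x_tag = tag_frequencies(word_tags, tag_types)  # [0.25, 0.25, 0.5]

# Hypothetical coefficients; each is the change in log-odds of the positive
# class per unit increase in that tag's frequency.
weights, bias = [2.0, -1.0, 0.0], 0.0
p_as = predict_proba(x_tag, weights, bias)
```

Because each coefficient attaches to exactly one tag, its sign and size can be read directly, which is the interpretability the linear model offers.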

Our DNN-based model employs a Bi-LSTM that takes into account the order of tag occurrences. Each input was a sequence of tag sets $T^{\text{manual}}$ or $T^{\text{auto}}$. Specifically, for each word, the sum of the embedding vectors corresponding to its tags was calculated to obtain the tag embeddings $E_{\text{tag}} \in \mathbb{R}^{L \times d}$, where $d$ is the number of dimensions. These embeddings were then input into the Bi-LSTM. Differentiation was accomplished by inputting the last state of the Bi-LSTM into a fully connected layer. We trained the model for 50 epochs with a batch size of 32, using the Adam optimizer and a learning rate of 0.001. The dimensions of the input word vectors and the hidden layer were 300. These hyperparameters were adopted from commonly used values and were not tuned, as preliminary experiments indicated that the impact of a hyperparameter search was minimal. The architecture of the model is depicted in Fig 5.
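
The per-word summation of tag embeddings can be sketched without a deep learning framework. In the sketch below the embedding table is randomly initialised (in a trained model it would be learned), the dimension is set to 4 rather than 300 for readability, and a word without tags is assumed to map to a zero vector; the Bi-LSTM and the fully connected layer are omitted.

```python
import random
from typing import Dict, List, Sequence


def make_embeddings(tag_types: Sequence[str], d: int, seed: int = 0) -> Dict[str, List[float]]:
    """Randomly initialised tag embeddings (learned during training in practice)."""
    rng = random.Random(seed)
    return {t: [rng.uniform(-0.1, 0.1) for _ in range(d)] for t in tag_types}


def embed_tags(
    word_tags: Sequence[Sequence[str]],
    table: Dict[str, List[float]],
    d: int,
) -> List[List[float]]:
    """Tag embeddings (L x d): for each word, the sum of the embeddings of its tags."""
    rows = []
    for tags in word_tags:
        row = [0.0] * d  # assumption: untagged words get a zero vector
        for tag in tags:
            row = [a + b for a, b in zip(row, table[tag])]
        rows.append(row)
    return rows


tag_types = ["Middle", "Usuality", "Filler"]
word_tags = [["Middle"], [], ["Usuality", "Filler"]]
table = make_embeddings(tag_types, d=4)
E_tag = embed_tags(word_tags, table, d=4)
```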

Fig 5. Bi-LSTM-based classification model using tags.

The tag Middle marks an event construed from a perspective lacking agency, while Usuality indicates how often an event tends to occur. Additional explanation is provided in Table 2. Both of these selectable resources are embedded in the system network.

  • (2) Differentiation Using Text

The model is almost identical to the DNN-based tag model. The only difference is that the input to the Bi-LSTM is changed to word embeddings $E_{\text{word}} \in \mathbb{R}^{L \times d}$ instead of tag embeddings $E_{\text{tag}}$.

  • (3) Differentiation Using Tag-and-Text Combinations

In tag-based differentiation, the sequence $T^{\text{manual}}$ or $T^{\text{auto}}$ of tag sets was derived from the input $x$, leading to a certain degree of information loss. For example, the words sad and angry both received the same Attitude-Affect-Emotion tag (Table 2). By retaining text information, we could differentiate between these words, thereby enhancing differentiation accuracy. To preserve text information, we developed a model that combines text and tags, taking into account complex word-tag relationships while retaining all the related information. For instance, the model can take into account specific situations, such as when a particular word with a specific part of speech appears before or after a certain tag, potentially indicating an AS or non-AS characteristic.

In this model, we employed a Bi-LSTM similar to the one used in the tag-based model. Each input was $E_{\text{concat}} \in \mathbb{R}^{L \times 2d}$, a concatenation of the tag embeddings $E_{\text{tag}}$ and the word embeddings $E_{\text{word}}$ at each time step. The methods for predicting $T^{\text{auto}}$ and the hyperparameters were identical to those used in the tag-based model. Fig 6 shows the architecture of the model.
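
The concatenation step can be illustrated in a few lines. The vectors below are toy values with two dimensions per embedding, chosen only to show how each time step doubles in width; the function name `concat_embeddings` is introduced for illustration.

```python
from typing import List


def concat_embeddings(e_tag: List[List[float]], e_word: List[List[float]]) -> List[List[float]]:
    """Concatenated input (L x 2d): tag and word vectors joined at each time step."""
    if len(e_tag) != len(e_word):
        raise ValueError("tag and word sequences must have the same length L")
    return [t + w for t, w in zip(e_tag, e_word)]


e_tag = [[0.1, 0.2], [0.0, 0.0]]    # L = 2 words, d = 2 (toy values)
e_word = [[0.5, -0.5], [0.3, 0.7]]
e_concat = concat_embeddings(e_tag, e_word)
```

Concatenation, unlike summation, keeps the tag and word dimensions separate, so the downstream network can weight the two sources independently.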

Fig 6. Bi-LSTM-based classification model that utilizes tag-and-text.

An example sentence is Oko ttari nanka surukoto aru (There are occasions when I get mad). The assigned tags are the same as in Fig 5.

Results

Table 3 presents the accuracies, precisions, sensitivities, and specificities of the tag-linear, tag-DNN, text-DNN, and text+tag-DNN models, following both automatic and manual annotation. As previously mentioned, manual annotation tends to yield more accurate results, as reflected in the generally higher values compared to automatic annotation. The overall mean F value was 0.88. The complexity of machine annotation, however, has compromised accuracy. For instance, the te-clause, one of the annotation categories listed in Table 2, was most frequently associated with errors. This category is subdivided into eight different classifications: conjunctive clause-parallel, conjunctive clause-contrast, conjunctive clause-forerunner, conjunctive clause-sequence of actions, conjunctive clause-cause/reason, conjunctive clause-adversative connective, conjunctive clause-resultative condition, and conjunctive clause-attendant circumstance. Accurate annotation must distinguish these eight types based on morphemes (i.e., te and the surrounding words). To improve this, more precise definitions of the differences are necessary, or alternatively, expanding the training data for machine learning could improve the system’s ability to accurately handle the complexities of te-clauses.

Table 3. Statistical values of linear and DNN-based models.

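Input text | Model | Annotation | Accuracy | 95% confidence interval | Precision | Sensitivity | Specificity | AUC
Interview | tag linear model | manual | 0.78 | [0.71, 0.85] | 0.75 | 0.77 | 0.79 | 0.85
Interview | tag linear model | automatic | 0.78 | [0.71, 0.85] | 0.72 | 0.84 | 0.73 | 0.84
Interview | tag DNN model | manual | 0.80 | [0.73, 0.87] | 0.77 | 0.79 | 0.81 | 0.88
Interview | tag DNN model | automatic | 0.75 | [0.67, 0.82] | 0.66 | 0.93 | 0.60 | 0.85
Interview | text DNN model | not applicable | 0.80 | [0.73, 0.87] | 0.81 | 0.75 | 0.85 | 0.89
Interview | text + tag DNN model | manual | 0.80 | [0.73, 0.87] | 0.82 | 0.73 | 0.87 | 0.90
Interview | text + tag DNN model | automatic | 0.82 | [0.75, 0.89] | 0.80 | 0.80 | 0.84 | 0.90
Picture book recounting | tag linear model | manual | 0.64 | [0.55, 0.73] | 0.55 | 0.52 | 0.72 | 0.67
Picture book recounting | tag linear model | automatic | 0.60 | [0.51, 0.69] | 0.50 | 0.48 | 0.69 | 0.63
Picture book recounting | tag DNN model | manual | 0.75 | [0.67, 0.83] | 0.69 | 0.66 | 0.81 | 0.81
Picture book recounting | tag DNN model | automatic | 0.67 | [0.58, 0.75] | 0.58 | 0.57 | 0.73 | 0.72
Picture book recounting | text DNN model | not applicable | 0.76 | [0.68, 0.84] | 0.76 | 0.57 | 0.88 | 0.78
Picture book recounting | text + tag DNN model | manual | 0.76 | [0.68, 0.84] | 0.72 | 0.64 | 0.84 | 0.80
Picture book recounting | text + tag DNN model | automatic | 0.78 | [0.71, 0.86] | 0.78 | 0.64 | 0.88 | 0.80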
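
For readers who wish to recompute statistics of the kind reported in Table 3, the following sketch derives them from a confusion matrix, with AS assumed to be the positive class. The confusion counts are hypothetical and not taken from the study, and the normal-approximation interval is an assumption, since the method used for the confidence intervals is not stated here.

```python
import math
from typing import Dict, Tuple


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Accuracy, precision, sensitivity (recall), and specificity."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def wald_interval(p: float, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Normal-approximation 95% interval for a proportion (an assumed method)."""
    half = z * math.sqrt(p * (1.0 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)


# Hypothetical confusion counts, not taken from the study.
m = diagnostic_metrics(tp=40, fp=10, tn=45, fn=5)
low, high = wald_interval(m["accuracy"], n=100)
```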

We used the McNemar test to evaluate whether there are significant differences in accuracy between methods. Specifically, we compared the accuracy between tag-linear model and tag-DNN model, between tag-DNN model and text+tag-DNN model, and between text-DNN model and text+tag-DNN model. For each comparison, the null hypothesis was that there is no significant difference in accuracy between the methods, while the alternative hypothesis was that there is a significant difference in accuracy between the methods.
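
The McNemar test depends only on the samples on which two models disagree. The sketch below uses the chi-square form with a continuity correction; whether the study applied this correction is not stated, so it is an assumption, and the discordant counts are hypothetical.

```python
import math
from typing import Tuple


def mcnemar(b: int, c: int) -> Tuple[float, float]:
    """McNemar chi-square test with continuity correction.

    b: samples the first model classified correctly and the second incorrectly.
    c: samples the first model classified incorrectly and the second correctly.
    """
    if b + c == 0:
        return 0.0, 1.0  # the models never disagree
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical discordant counts, not taken from the study.
stat, p_value = mcnemar(b=19, c=6)
```

Samples that both models classify correctly, or both incorrectly, do not enter the statistic, which is why two models with similar overall accuracy can still differ significantly, or fail to, depending on how their errors overlap.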

The results generated by the tag-linear, tag-DNN, and text+tag-DNN models did not display significant differences (Table 4). Nevertheless, the tag-DNN model exhibited a marginal performance improvement over the tag-linear model, and the text+tag-DNN model was slightly superior to the tag-DNN model. The text-DNN and text+tag-DNN models performed almost the same, with the text+tag-DNN model being marginally higher. The absence of statistically significant differences in our McNemar test results does not definitively indicate the absence of a performance difference between models, particularly in the context of small sample sizes. This aligns with broader discussions on statistical power and interpreting non-significant results in research [115–117]. In general, the linear model was less precise than its DNN-based counterparts.

Table 4. McNemar test results for model performance comparisons.

 | Comparison | Manual: statistic | Manual: p-value | Automatic: statistic | Automatic: p-value
A | tag (linear model) vs tag (DNN model) | 0.04 | 0.84 | 0.30 | 0.58
A | tag (DNN model) vs text+tag (DNN model) | 0.00 | 1.00 | 2.21 | 0.14
A | text (DNN model) vs text+tag (DNN model) | 0.12 | 0.72 | 0.10 | 0.75
E | tag (linear model) vs tag (DNN model) | 5.50 | 0.02 | 1.33 | 0.25
E | tag (DNN model) vs text+tag (DNN model) | 0.00 | 1.00 | 5.76 | 0.02
E | text (DNN model) vs text+tag (DNN model) | 0.12 | 0.72 | 0.44 | 0.50

We acknowledge the limitations of our sample size. Due to constraints, achieving a larger dataset was not feasible. Consequently, our study should be viewed as exploratory, aimed at providing initial insights rather than definitive conclusions.

Regarding the comparison between interview texts and story-recounting texts, our findings suggest that the interview task may provide insights into the linguistic behaviors of individuals with AS that are more detailed than those provided by the story-recounting task. Given the inherently interactive and social nature of the interview task, it has the potential to highlight differences in lexicogrammatical use that relate to the neurocognitive characteristics of AS. Although the story-recounting task is also social, its monologic nature offers fewer opportunities for such distinctions to emerge. Therefore, in the context of our study, the interview task proved to be a more effective diagnostic tool.

Discussion

Implications of the tag-linear model

The working hypothesis of this study suggests that language output may be indicative of underlying cognitive processes. Therefore, we proposed that AS can be distinguished from non-AS conditions through lexicogrammatical choices. Using the text + tag DNN model and manual annotation, the test yielded 80% accuracy, 82% precision, 73% sensitivity, and 87% specificity for the interview texts. These findings indicate the potential of utilizing lexicogrammatical choices as a diagnostic tool, reinforcing our proposition that cognitive patterns influence language output. This notion aligns with the idea that cognitive processes guide lexicogrammatical choices during language formation, as outlined by the SFL stratification in Fig 1 [58].

When devising differentiation criteria, our attention centered on the lexicogrammar situated in the fifth stratification layer. To reach lexicogrammar, one must traverse the prior four layers: cognition, culture, situation, and discourse semantics. Our system for differentiation rests on the premise that there would be a discernible difference in the lexicogrammatical selections between AS and non-AS individuals within the lexicogrammar layer. The articulated lexicogrammar acts as a clear manifestation of the speaker’s chosen syntactic configurations and stands apart due to its sheer objectivity, void of the subjectivity often seen in semantic evaluations. This same principle of objectivity extends to the phonology/graphology layer, situated directly below lexicogrammar, ensuring the observational standards are purely objective.

Driven by the register (located in the third layer), we employ varied expressions. Several factors shape these choices: the context of the conversation (Field), the relationship with and societal role of the person we are conversing with (Tenor), and the mode of communication (Mode), which encompasses aspects like whether the language is spoken or written, the level of formality, and whether the communication is dialogic or monologic. In the act of constructing a clause encapsulating our intended meaning, specific linguistic decisions arise. During speech, speakers instantaneously sift through the system network, effectively engaging in a resource-selection mapping process. This network encapsulates all available lexicogrammatical options during a linguistic exchange. Language functions as a structured system where meanings are crafted by speakers drawing words from a reservoir within the system network, all while partaking in societal activities [112].

Advantages of the DNN model over the linear model

The advantage of the DNN model can be attributed to its ability to construct judgment criteria through autonomous learning from input data. Deep learning algorithms can automatically extract and assimilate the features most useful for accurate output. This is why the DNN model performs somewhat better than the linear model: the DNN incorporates learned judgment criteria alongside the tag information.

The learned criteria function as black boxes, and it is plausible that the DNN considered tag orders and combinations. For instance, the DNN may have identified patterns such as the presence of tag B following tag A indicating autism, the co-occurrence of tags A and B signifying autism, or the independent presence of tag A suggesting non-AS status. In contrast, the linear model’s accuracy is potentially lower than the DNN models’ accuracies due to its constrained input information. This constraint stems from the selection of arbitrary items and the omission of certain data, which restricts the comprehensiveness of the model.
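The point about tag order can be illustrated with a minimal sketch. It is not the authors' model: the tag names, sequences, and weights below are hypothetical, and whether the DNN actually used order information is unknown. The sketch only shows that a weighted sum over individual tag counts cannot tell "tag B following tag A" from the reverse, whereas an order-aware representation (here, adjacent tag pairs) can.

```python
from collections import Counter

# Hypothetical tag names; the study's 147 annotation categories are not reproduced here.
seq_1 = ["TAG_A", "TAG_B", "TAG_C"]  # tag B follows tag A
seq_2 = ["TAG_B", "TAG_A", "TAG_C"]  # tag A follows tag B


def tag_counts(tags):
    """Order-free representation: how often each tag occurs."""
    return Counter(tags)


def tag_bigrams(tags):
    """Order-aware representation: counts of adjacent tag pairs."""
    return Counter(zip(tags, tags[1:]))


def linear_score(counts, weights, bias=0.0):
    """A linear model sees only a weighted sum of individual tag counts."""
    return bias + sum(weights.get(tag, 0.0) * n for tag, n in counts.items())


# Arbitrary illustrative weights, not fitted to any data.
weights = {"TAG_A": 2.0, "TAG_B": -1.0, "TAG_C": 0.5}

same_counts = tag_counts(seq_1) == tag_counts(seq_2)
same_score = (
    linear_score(tag_counts(seq_1), weights)
    == linear_score(tag_counts(seq_2), weights)
)
a_then_b_1 = tag_bigrams(seq_1)[("TAG_A", "TAG_B")]
a_then_b_2 = tag_bigrams(seq_2)[("TAG_A", "TAG_B")]

print(same_counts, same_score, a_then_b_1, a_then_b_2)
```

Both sequences receive the same linear score because their tag counts are identical, while the pair feature separates them. A model with access to sequence or combination information could, in principle, exploit such differences.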

We acknowledge the potential benefits of a text-only model. However, our focus on the tag-only and tag+text models is based on three key reasons:

  1. Medical transparency: As stated previously, transparency is crucial in medical applications. The lexicogrammatical tags provide clear, interpretable insights, which are essential for effective and transparent use in clinical settings.

  2. Improving Diagnostic Accuracy: Our study, as a pilot, aimed to demonstrate the potential of these models. While the text+tag approach shows slightly better accuracy than the tag-only approach, we plan to increase the accuracy further by adding more annotation categories from the system network. Currently, we use 147 categories, and we assume that expanding this set will improve diagnostic accuracy. The text-DNN model appears to have reached its limit in terms of precision, so enhancing accuracy beyond this point will require expanding the system network categories. As mentioned previously, improving diagnostic accuracy is critical, especially for adults with comorbidities, where traditional tools struggle.

  3. Cognitive Insights: The tag-based approach allows us to pinpoint specific lexico-grammatical features linked to neurodevelopmental dysfunctions, aiding in understanding the underlying cognitive processes.

Although the annotation process might seem complex, the text+tag DNN model is efficient to use because of the automatic annotation system we developed. This system streamlines the process by providing classification results quickly once a transcript is uploaded and allows easy verification through downloadable annotated transcripts. The accuracy of the automatic annotation is currently strong, and we anticipate that it will improve further as the amount of training data increases. The primary problem is that transcription still requires a considerable amount of manual correction because of the current accuracy limitations of Automatic Speech Recognition (ASR) in Japanese. We acknowledge this as an area for improvement.

Text appropriate for diagnostic differentiation

We examined interview and story-recounting texts from Modules 3 and 4 of the ADOS-2 and found that the lexicogrammatical choices of individuals with AS differed more markedly from those of non-AS individuals in interviews than in story-recounting tasks (Table 3). This observation suggests that, in monological language use, the lexicogrammatical distinctions between AS and non-AS individuals are less marked than in interactive social language situations, highlighting the specific challenges faced by individuals with AS in reciprocal social communication. These results underscore the central issue of social impairment in AS, a neurodevelopmental disorder where difficulties in selecting suitable lexicogrammatical structures for effective interpersonal communication are prominent. Given that social components of language development start forming in early childhood [118], it is expected that children with AS, who have core deficits in social interaction and a limited interest in social engagement, would show significant language development impairments. These social deficits are often linked to cognitive, motor, and sensory challenges, including limited joint attention, weak central coherence, and impaired executive functions.

Versatility of annotation scheme for our differentiation system

The annotation scheme was based on a Japanese system network constructed specifically for this project using transfer comparison. A system network is a description of a language that highlights the special features of that language [119]. Describing a particular language without making assumptions based on other languages requires an inordinate amount of time; such a description entails many observations of discursive instances and extensive discourse analysis. Therefore, one practical heuristic method models the description of one language on the descriptions of others. This is transfer comparison [119, 120]. Fundamentally, transfer comparison highlights similarities between two languages [120]. We developed the system network of our current annotation scheme using transfer comparison; the descriptive assumptions were based on English because system networks for English are available [113, 121]. Each language is distinct in terms of its descriptors and system network. However, when comprehensive descriptions of some languages are available, typological generalizations across languages become possible. Transfer comparison enables such generalization. Thus, the annotation scheme of Kato et al. [58] is, in principle, applicable to any language via transfer comparison.

Limitations and future perspectives

Verification process and methodological enhancements

This research constitutes an initial phase in illustrating the feasibility of a diagnostic instrument based on the evaluation of lexicogrammatical choices. The subsequent phase entails a comprehensive verification of this tool. A key limitation of our study is the small sample size. To robustly validate the algorithm developed, expanding the participant pool will be crucial. This will require overcoming logistical challenges and ensuring a larger, more diverse sample to enhance the validity and generalizability of our findings.

Our text+tag DNN model demonstrates efficiency due to the implementation of our automatic annotation system. This system optimizes the process by rapidly providing classification results upon transcript upload and facilitates straightforward verification through downloadable annotated transcripts. We anticipate that increasing the volume of training data will significantly enhance the accuracy of the automatic annotation. Presently, the system exhibits strong accuracy, and expanding the training data set is likely to further refine its precision.

While the text+tag DNN model benefits from our efficient automatic annotation system, the primary challenge remains in the transcription phase. The current limitations of ASR for Japanese necessitate substantial manual corrections. Recognizing this, we have adopted manual transcription for our research to ensure the highest accuracy. However, manual transcription is time-consuming and not feasible for broader clinical applications. Thus, enhancing the ASR system is essential for converting raw voice data into text more efficiently, which is crucial for scaling clinical applications and streamlining the diagnostic process.

Analysis of false positives and false negatives

A notable limitation of our study is the sensitivity (73%) and specificity (87%) of the diagnostic tool, with an overall accuracy of 80%. This suggests an error rate of roughly 20% in AS classification, manifesting as false negatives or false positives. This limitation indicates that in some instances, cases cannot be accurately judged based solely on lexicogrammatical choices. The findings underscore the complexity of diagnosing AS based solely on linguistic patterns, given the broad spectrum and variability in language use within the AS population. This necessitates a more detailed analysis of lexicogrammatical choices and may require adjustments to the annotation scheme, incorporating additional resources from system networks.
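For readers less familiar with these measures, the sketch below shows how accuracy, precision, sensitivity, and specificity, along with the corresponding false-negative and false-positive rates, follow from a confusion matrix. The counts are hypothetical and chosen only for easy arithmetic; they are not the study's confusion matrix, which is not given in this section. The function name `diagnostic_metrics` is likewise an illustrative assumption.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard diagnostic measures from confusion-matrix counts.

    tp: AS cases classified as AS        fn: AS cases classified as non-AS
    fp: non-AS cases classified as AS    tn: non-AS cases classified as non-AS
    """
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        # Error rates are the complements of sensitivity and specificity.
        "false_negative_rate": fn / (tp + fn),
        "false_positive_rate": fp / (tn + fp),
    }


# Hypothetical counts for illustration only; NOT the study's results.
metrics = diagnostic_metrics(tp=40, fn=10, fp=5, tn=45)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the false-negative rate is one minus sensitivity and the false-positive rate is one minus specificity, the two kinds of error can differ considerably even when overall accuracy looks balanced.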

To elaborate, false negatives and false positives can be examined separately. False negatives may be particularly relevant for individuals with AS who have no language or cognitive delay and who may exhibit language patterns similar to those of non-AS individuals. Given that DSM-5 encompasses Asperger’s under the broader AS classification, our study included participants with such complex vocabularies and refined speech, which could lead to diagnostic challenges. Regarding false positives, it is possible that some non-AS individuals were misclassified as having AS because of their frequent use of certain lexicogrammatical choices commonly seen in AS.

Future research should focus on refining diagnostic criteria and tools to better accommodate the diversity in language use among individuals with AS. Exploring more comprehensive and nuanced methods for differentiating between AS and non-AS individuals, particularly those with atypical language profiles, will be crucial in reducing false diagnostic rates.

Investigation of influences of comorbid conditions on lexicogrammatical choices

Our methodology begins with creating a classifier that distinguishes AS from non-AS, a foundational step towards developing a comprehensive diagnostic tool for real-world clinical assessments. We have found discernible differences even without excluding comorbidities, underscoring the potential utility of our research as a diagnostic tool in these complex clinical scenarios. However, further investigation is needed into how comorbidities might affect the occurrence of false positives or negatives. To address this, our next step involves developing separate tools for each comorbid condition, including adjustment disorder/non-adjustment disorder, depression/non-depression, and ADHD/non-ADHD. This approach aligns with clinical realities and will be crucial in enhancing the accuracy and applicability of our diagnostic tools.

Conclusions

This study demonstrates the feasibility of using natural language processing (NLP) to develop a diagnostic tool for AS. The text+tag DNN model distinguishes AS from non-AS through lexicogrammatical analysis, indicating significant diagnostic potential. By examining lexicogrammatical choices, our approach shows promise in supporting the multidisciplinary diagnosis of AS. Leveraging NLP and machine learning, we aim to integrate language-based diagnostics with traditional methods, potentially enhancing early detection and support for individuals with AS.

Supporting information

S1 File. Supplementary materials.

(DOCX)

pone.0311209.s001.docx (571KB, docx)
S2 File. Supplementary tables.

(DOCX)

pone.0311209.s002.docx (37.6KB, docx)

Acknowledgments

The authors would like to express their appreciation to Dr. Kentaro Inui for his thoughtful comments and expertise on the current study.

Data Availability

All relevant data are within the paper.

Funding Statement

This study was supported by grants from the Japan Society for the Promotion of Science (JSPS) KAKENHI (Grants-in-Aid for Scientific Research, https://www.jsps.go.jp/j-grantsinaid/): JP26284060 (SK) and JP26590161 (SK).

References

  • 1.American Psychiatric Association. Diagnostic and statistical manual of mental disorders (DSM-5). 5th ed. Arlington, VA: American Psychiatric Association; 2013. ISBN: 0890425558 [Google Scholar]
  • 2.Paul R, Norbury C. Language disorders from infancy through adolescence: Listening, speaking, reading, writing, and communicating. St. Louis, MO: Elsevier Health Sciences; 2012. ISBN: 0323071848 [Google Scholar]
  • 3.Ambridge B, Bidgood A, Thomas K. Disentangling syntactic, semantic and pragmatic impairments in ASD: Elicited production of passives. Journal of Child Language. 2020:1–18. doi: 10.1017/S0305000920000069 [DOI] [PubMed] [Google Scholar]
  • 4.Perkins M. Pragmatic impairment. Cambridge University Press; 2010. ISBN: 0521153867 [Google Scholar]
  • 5.Martin I, McDonald S. Weak coherence, no theory of mind, or executive dysfunction? Solving the puzzle of pragmatic language disorders. Brain and language. 2003;85:451–466. doi: 10.1016/s0093-934x(03)00070-1 [DOI] [PubMed] [Google Scholar]
  • 6.Scobbie JM. The phonetics-phonology overlap. QMUC Speech Science Research Centre Working Paper. 2005;1:1–30. Available from: http://eresearch.qmu.ac.uk/138/
  • 7.Murdoch BE. Acquired speech and language disorders: A neuroanatomical and functional neurological approach. London: Chapman and Hall; 1990. ISBN: 9781489934581 [Google Scholar]
  • 8.Perkins M. Modal expressions in English. London: Frances Pinter; 1983. b. doi: 10.2307/414408 [DOI] [Google Scholar]
  • 9.Perkins MR, Firth C. Production and comprehension of modal expressions by children with a pragmatic disability. First Language. 1991;11(33):416–416. doi: 10.1177/014272379101103318 [DOI] [Google Scholar]
  • 10.Nuyts J, Roeck AD. Autism and meta-representation: The case of epistemic modality. European Journal of Disorders of Communication. 1997;32:113–17. doi: 10.1111/j.1460-6984.1997.tb01627.x [DOI] [PubMed] [Google Scholar]
  • 11.Kato S. Modality from the perspective of pragmatic impairment: A systemic analysis of modality in Japanese. Amsterdam: John Benjamins Publishing Company; 2021. a. doi: 10.1075/z.234.04kat [DOI] [Google Scholar]
  • 12.Tager-Flusberg H. Language acquisition and theory of mind: Contributions from the study of autism. In Adamson LB, Romski MA, editors. Communication and language acquisition: Discoveries from atypical development. Baltimore, MD: Paul Brookes Publishing; 1997. p. 135–160. ISBN: 1557662797 [Google Scholar]
  • 13.Durrleman S, Delage H. Autism spectrum disorder and specific language impairment: Overlaps in syntactic profiles. Language Acquisition. 2016. a;23(4):361–386. doi: 10.1080/10489223.2016.1179741 [DOI] [Google Scholar]
  • 14.Durrleman S, Marinis T, Franck J. Syntactic complexity in the comprehension of wh-questions and relative clauses in typical language development and autism. Applied Psycholinguistics. 2016. b;37(6):1501–1527. doi: 10.1017/S0142716416000059 [DOI] [Google Scholar]
  • 15.Park CJ, Yelland GW, Taffe JR, Gray KM. Morphological and syntactic skills in language samples of pre school aged children with autism: Atypical development? International Journal of Speech-Language Pathology. 2012;14(2):95–108. doi: 10.3109/17549507.2011.645555 [DOI] [PubMed] [Google Scholar]
  • 16.Durrleman S, Hippolyte L, Zufferey S, Iglesias K, Hadjikhani N. Complex syntax in autism spectrum disorders: a study of relative clauses. Int J Lang Commun Disord. 2015;50(2):260–267. doi: 10.1111/1460-6984.12130 [DOI] [PubMed] [Google Scholar]
  • 17.Terzi A, Marinis T, Francis K. The interface of syntax with pragmatics and prosody in children with autism spectrum disorders. J Autism Dev Disord. 2016;46: 2692–2706. doi: 10.1007/s10803-016-2811-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Martzoukou M, Papadopoulou D, Kosmidis MH. The comprehension of syntactic and affective prosody by adults with autism spectrum disorder without accompanying cognitive deficits. J Psycholinguist Res. 2017;46:1573–1595. doi: 10.1007/s10936-017-9500-4 [DOI] [PubMed] [Google Scholar]
  • 19.Eigsti IM, Bennetto L, Dadlani MB. Beyond pragmatics: morphosyntactic development in autism. J Autism Dev Disord. 2007;37(6):1007–1023. doi: 10.1007/s10803-006-0239-2 [DOI] [PubMed] [Google Scholar]
  • 20.McDonald S. Exploring the process of inference generation in sarcasm: A review of normal and clinical studies. Brain and Language. 1999;68: 486–506. doi: 10.1006/brln.1999.2124 [DOI] [PubMed] [Google Scholar]
  • 21.Perkins M. Production and comprehension of modal expressions by children with a pragmatic disability. First Language. 1991;11(33):416–416. doi: 10.1177/014272379101103318 [DOI] [Google Scholar]
  • 22.Norbury CF, Bishop DV. Inferential processing and story recall in children with communication problems: a comparison of specific language impairment, pragmatic language impairment and high-functioning autism. Int J Lang Commun Disord. 2002;37:227–251. doi: 10.1080/13682820210136269 [DOI] [PubMed] [Google Scholar]
  • 23.Grant CM, Riggs KJ, Boucher J. Counterfactual and mental state reasoning in children with autism. J Autism Dev Disord. 2004;34:177–188. doi: 10.1023/b:jadd.0000022608.57470.29 [DOI] [PubMed] [Google Scholar]
  • 24.Ozonoff S, Pennington BF, Rogers SJ. Executive function deficits in high-functioning autistic individuals: Relationship to theory of mind. Journal of Child Psychology and Psychiatry. 1991;32:1081–1105. doi: 10.1111/j.1469-7610.1991.tb00351.x [DOI] [PubMed] [Google Scholar]
  • 25.Hill EL. Evaluating the theory of executive dysfunction in autism. Developmental Review. 2004. a;24(2):189–233. doi: 10.1016/j.dr.2004.01.001 [DOI] [Google Scholar]
  • 26.Hill EL. Executive dysfunction in autism. Trends in Cognitive Sciences. 2004. b;8: 26–32. doi: 10.1016/j.tics.2003.11.003 [DOI] [PubMed] [Google Scholar]
  • 27.Ozonoff S, Jensen J. Brief report: specific executive function profiles in three neurodevelopmental disorders. Journal of Autism and Developmental Disorders. 1999;29:171–177. doi: 10.1023/a:1023052913110 [DOI] [PubMed] [Google Scholar]
  • 28.Baron-Cohen S. The essential difference: Male and female brains and the truth about autism. Basic Books, New York; 2004. [Google Scholar]
  • 29.Baron-Cohen S. Autism: The empathizing-systemizing (E-S) theory. Annals of the New York Academy of Sciences. 2009;1156: 68–80. doi: 10.1111/j.1749-6632.2009.04467.x [DOI] [PubMed] [Google Scholar]
  • 30.Frith U. Autism: explaining the enigma. 2nd ed. Oxford: Blackwell Publishing; 2003. [Google Scholar]
  • 31.Rajendran G, Mitchell P. Cognitive theories of autism. Developmental Review. 2007;27:224–260. doi: 10.1016/j.dr.2007.02.001 [DOI] [Google Scholar]
  • 32.Van der Hallen R, Evers K, Brewaeys K, Van den Noortgate W, Wagemans J. Global processing takes time: A meta-analysis on local-global visual processing in ASD. Psychol Bull. 2015;141(3):549–73. doi: 10.1037/bul0000004 [DOI] [PubMed] [Google Scholar]
  • 33.Damarla SR, Keller TA, Kana RK, Cherkassky VL, Williams DL, Minshew NJ, et al. Cortical underconnectivity coupled with preserved visuospatial cognition in autism: Evidence from an fMRI study of an embedded figures task. Autism Res. 2010;3(5): 273–279. doi: 10.1002/aur.153 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Senju A, Tojo Y, Dairoku H, Hasegawa T. Reflexive orienting in response to eye gaze and an arrow in children with and without autism. Journal of Child Psychology and Psychiatry. 2004;45:445–458. doi: 10.1111/j.1469-7610.2004.00236.x [DOI] [PubMed] [Google Scholar]
  • 35.Senju J. Kyoukan to jiheisho supekutoramusho [Empathy and autism spectrum symptoms]. Kyokan [empathy]. Tokyo: Iwanami Shoten, Publishers; 2014. [Google Scholar]
  • 36.Kikuchi Y, Senju A, Tojo Y, Osanai H, Hasegawa T. Faces do not capture special attention in children with autism spectrum disorder: A change blindness study. Child Development. 2009;80:1421–1433. doi: 10.1111/j.1467-8624.2009.01342.x [DOI] [PubMed] [Google Scholar]
  • 37.Sugiyama T. Komyunikeishon shougai toshite no jiheishou [Autism as communication disorder]. Advances in Research on Autism and Developmental Disorder, Tokyo: Seiwa Shoten Publishers. 2004;8: 3–23. [Google Scholar]
  • 38.Falkmer T, Anderson K, Falkmer M, Horlin CH. Diagnostic procedures in autism spectrum disorders: a systematic literature review. Eur Child Adolesc Psychiatry. 2013;22:329–340. doi: 10.1007/s00787-013-0375-0 [DOI] [PubMed] [Google Scholar]
  • 39.Molloy CA, Murray DS, Akers R, Mitchell T, Manning-Courtney P. Use of the autism diagnostic observation schedule (ADOS) in a clinical setting. Autism. 2011;15(2):143–62. doi: 10.1177/1362361310379241 [DOI] [PubMed] [Google Scholar]
  • 40.De Bildt A, Oosterling IJ, Van Lang NDJ, Sytema S, Minderaa RB, Van Engeland H, et al. Standardized ADOS scores: Measuring severity of autism spectrum disorders in a Dutch sample. J Autism Dev Disord. 2011;41(3):311–9. doi: 10.1007/s10803-010-1057-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Adamou M, Jones SL, Wetherhill S. Predicting diagnostic outcome in adult autism spectrum disorder using the autism diagnostic observation schedule, second edition. BMC Psychiatry. 2021. doi: 10.1186/s12888-020-03028-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Conner CM, Cramer RD, McGonigle JJ. Examining the diagnostic validity of autism measures among adults in an outpatient clinic sample. Autism in Adulthood. 2019;1. doi: 10.1089/aut.2018.0023 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Barlati S, Deste G, Gregor Elli M, Vita A. Autistic traits in a sample of adult patients with schizophrenia: prevalence and correlates. Psychol Med. 2019;49(1):140–8. doi: 10.1017/S0033291718000600 [DOI] [PubMed] [Google Scholar]
  • 44.De Crescenzo F, Postorino V, Siracusano M, Riccioni A, Armando M, Curatolo P, et al. Autistic symptoms in Schizophrenia spectrum disorders: a systematic review and meta-analysis. Front Psychiatry. 2019;10:78. doi: 10.3389/fpsyt.2019.00078 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Bresnahan M, Li G, Susser E. Hidden in plain sight. Int J Epidemiol. 2009;38(5):1172–1174. doi: 10.1093/ije/dyp293 [DOI] [PubMed] [Google Scholar]
  • 46.Luciano CC, Keller R, Politi P, Aguglia E, Magnano F, Burti L, et al. Misdiagnosis of high function autism spectrum disorders in adults: An Italian case series. Autism Open Access. 2014;4(131), Article 2. doi: 10.4172/2165-7890.1000131 [DOI] [Google Scholar]
  • 47.Leyfer OT, Folstein SE, Bacalman S, Davis NO, Dinh E, Morgan J, et al. Comorbid psychiatric disorders in children with autism: interview development and rates of disorders. J Autism Dev Disord. 2006;36(7):849–61. doi: 10.1007/s10803-006-0123-0 [DOI] [PubMed] [Google Scholar]
  • 48.Bastiaansen JA, Meffert H, Hein S, Huizinga P, Ketelaars C, Pijnenborg M, et al. Diagnosing autism spectrum disorders in adults: The use of autism diagnostic observation schedule (ADOS) module 4. J Autism Dev Disord. 2011;41(9):1256–66. doi: 10.1007/s10803-010-1157-x [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.De Bildt A, Sytema S, Meffert H, Bastiaansen J. The autism diagnostic observation schedule, module 4: Application of the revised algorithms in an independent, well-defined, Dutch sample (n = 93). J Autism Dev Disord. 2016;46(1):21–30. doi: 10.1007/s10803-015-2532-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Gould J, Ashton-Smith J. Missed diagnosis or misdiagnosis? Girls and women on the autism spectrum. Good Autism Practice (GAP). 2011;12. [Google Scholar]
  • 51.Hull L, Petrides KV, Allison C, Smith P, Baron-Cohen S, Lai MC, et al. ‘Putting on my best normal’: Social camouflaging in adults with autism spectrum conditions. J Autism Dev Disord. 2017;47(8):2519–34. doi: 10.1007/s10803-017-3166-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Lai MC, Baron-Cohen S. Identifying the lost generation of adults with autism spectrum conditions. Lancet Psychiatry. 2015;2(11):1013–1027. doi: 10.1016/S2215-0366(15)00277-1 [DOI] [PubMed] [Google Scholar]
  • 53.Berthoz S, Hill EL. The validity of using self-reports to assess emotion regulation abilities in adults with autism spectrum disorder. Eur Psychiatry. 2005;20(3):291–298. doi: 10.1016/j.eurpsy.2004.06.013 [DOI] [PubMed] [Google Scholar]
  • 54.Parish-Morris J, Cieri C, Liberman M, Bateman L, Ferguson E, Schultz RT. Building language resources for exploring autism spectrum disorders. International Conference on Language Resources and Evaluation. 2016. May:2100–2107. doi: 10.1001/journalofethics.2015.17.4.msoc1-1504 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Nadig A, Bang J. Nadig ASD English Corpus. 2015. doi: 10.21415/T54P4Q [DOI] [Google Scholar]
  • 56.Hendriks P, Koster C, Kuijper S. Asymmetries corpus. 2014. doi: 10.21415/T5SW2X [DOI] [Google Scholar]
  • 57.Sakishita M, Ogawa C, Tsuchiya JK, Iwabuchi T, Kishimoto T, Kano Y. Autism spectrum disorder’s severity prediction system using utterance features. Journal of JSAI. 2020;35(3):1–11. doi: 10.1527/tjsai.B-J45 [DOI] [Google Scholar]
  • 58.Kato S, Hanawa K, Linh VP, Saito M, Iimura R, Inui K, et al. Toward mapping pragmatic impairment of autism spectrum disorder individuals through the development of a corpus of spoken Japanese. PLOS ONE. 2022;17(2): e0264204. doi: 10.1371/journal.pone.0264204 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Den Y, Enomoto M. Chiba three-party conversation corpus (chiba3party). Speech Resources Consortium, National Institute of Informatics; 2014. (dataset). doi: 10.32130/src.Chiba3Party [Google Scholar]
  • 60.Muntigl P. Narrative counseling. Amsterdam: Benjamins Publishing Company; 2004. ISBN: 1588115348 [Google Scholar]
  • 61.Pellicano E. The development of executive function in autism. Autism Research and Treatment; 2012. doi: 10.1155/2012/146132 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Craig F, Margari F, Legrottaglie AR, Palumbi R, de Giambattista C, Margari L. A review of executive function deficits in autism spectrum disorder and attention-deficit/hyperactivity disorder. Neuropsychiatric Disease and Treatment. 2016;12: 1191. doi: 10.2147/NDT.S104620 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Demetriou EA, Lampit A, Quintana DS, Naismith SL, Song YJC, Pye JE, et al. Autism spectrum disorders: a meta-analysis of executive function. Molecular Psychiatry. 2018; 23(5):1198–1204. doi: 10.1038/mp.2017.75 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Panerai S, Tasca D, Palermo F, Zingale M. Executive functions and adaptive behaviour in autism spectrum disorders with and without intellectual disability. Psychiatry Research. 2019;274:247–253. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Happé F, Frith U. The weak coherence account: Detail-focused cognitive style in autism spectrum disorders. Journal of Autism and Developmental Disorders. 2006;36: 5–25. doi: 10.1007/s10803-005-0039-0 [DOI] [PubMed] [Google Scholar]
  • 66.Happé F, Booth R. The power of the positive: Revisiting weak coherence in autism spectrum disorders. Quarterly Journal of Experimental Psychology. 2008;61:50–63. doi: 10.1080/17470210701508731 [DOI] [PubMed] [Google Scholar]
  • 67.Pellicano E, Burr D. When the world becomes ’too real’: A Bayesian explanation of autistic perception. Trends in Cognitive Sciences. 2012;16(10):504–510. doi: 10.1016/j.tics.2012.08.009 [DOI] [PubMed] [Google Scholar]
  • 68.Booth R, Happé F. Evidence of reduced global processing in autism spectrum disorder. Journal of Autism and Developmental Disorders. 2018;48(4):1397–1408. doi: 10.1007/s10803-016-2724-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.Locke JL. A theory of neurolinguistic development. Brain and Language. 1997;58:265–326. doi: 10.1006/brln.1997.1791 [DOI] [PubMed] [Google Scholar]
  • 70.Perkins MR, Dobbinson S, Boucher J, Bol S, Bloom P. Lexical knowledge and lexical use in autism. Journal of Autism and Developmental Disorders. 2006;36:795–805. doi: 10.1007/s10803-006-0120-3 [DOI] [PubMed] [Google Scholar]
  • 71.Kato S. How neurodevelopment and joint attention affects the use of the negotiating particles, ne and yo. The Japanese Journal of Systemic Functional Linguistics. 2021. b;11: 11–30. [Google Scholar]
  • 72.Nagano K. The development of the speech of infants, especially on the learning of Zyosi (Postpositions). Study of Language. 1959;1:383–396. doi: 10.15084/00001725 [DOI] [Google Scholar]
  • 73.Terao Y. An experimental approach to the acquisition of pragmatic competence: When and how do children acquire ‘territorial’ ne? Language and culture. 2003;(6):45–58. [Google Scholar]
  • 74.Watamaki T. Dai-9-syoo Bunpoo-no hattatu [Chapter 9. Development of grammar]. In Ogura T, Watamaki T, Inaba T, editors. Nihongo MacArthur nyuuyoozi gengo hattatu situmonsi-no kaihatu-to kenkyuu [The development and the study of The Japanese MacArthur Communicative Development Inventory]. Kyoto: Nakanisiya syuppan; 2016. [Google Scholar]
  • 75.Satake S, Kobayashi S. A study of pragmatic communicative functions: Teaching "Shujoshi" sentence expressions to autistic children. The Japanese Journal of Special Education. 1987;25(3):19–30. doi: 10.6033/tokkyou.25.19_2 [DOI] [Google Scholar]
  • 76.Watamaki T. Lack the particle- ne in conversation by autistic children: A case study Institute for Developmental Research. Japanese journal on developmental disabilities. 1997;19(2):48–59. [Google Scholar]
  • 77.Endo Y. Non-standard questions in English, German, & Japanese. Linguistics Vanguard. 2022;8(s2):251–260. doi: 10.1515/lingvan-2020-0108 [DOI] [Google Scholar]
  • 78.Kiyama S, Verdonschot RG, Xiong K, Tamaoka K. Individual mentalizing ability boosts flexibility toward a linguistic marker of social distance: An ERP investigation. Journal of Neurolinguistics. 2018;47:1–15. doi: 10.1016/j.jneuroling.2018.01.005 [DOI] [Google Scholar]
  • 79.Miyagawa S. Syntax in the Treetops. Cambridge, MA, US: MIT Press; 2022. [Google Scholar]
  • 80.Kato S. Attitudinal evaluation of autism spectrum disorder individuals from the perspective of affordances and social cognition. Japanese Journal of Systemic Functional Linguistics. 2023;12: in press. [Google Scholar]
  • 81.Schulte-Ruther M, Kulvicius T, Stroth S, Wolff N, Roessner V, Marschik PB, et al. Using machine learning to improve diagnostic assessment of ASD in the light of specific differential and co-occurring diagnoses. Journal of Child Psychology and Psychiatry. 2022;64(1):16–26. doi: 10.1111/jcpp.13650 [DOI] [PubMed] [Google Scholar]
  • 82.Abbas H, Garberson F, Glover E, Wall DP. Machine learning approach for early detection of autism by combining questionnaire and home video screening. Journal of the American Medical Informatics Association. 2018;25: 1000–1007. doi: 10.1093/jamia/ocy039 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.Levy S, Duda M, Haber N, Wall DP. Sparsifying machine learning models identify stable subsets of predictive features for behavioral detection of autism. Molecular Autism. 2017;8:65. doi: 10.1186/s13229-017-0180-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84.Duda M, Daniels J, Wall DP. Clinical evaluation of a novel and mobile autism risk assessment. Journal of Autism and Developmental Disorders. 2016;46(6):1953–1961. doi: 10.1007/s10803-016-2718-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Bone D, Bishop SL, Black MP, Goodwin MS, Lord C, Narayanan SS. Use of machine learning to improve autism screening and diagnostic instruments: Effectiveness, efficiency, and multi-instrument fusion. Journal of Child Psychology and Psychiatry. 2016;57(8):927–937. doi: 10.1111/jcpp.12559 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86.Lenneberg EH. Biological foundations of language. New York: John Wiley and Sons; 1967. ISBN: 9780471526261 [Google Scholar]
  • 87.Newport EL. Maturational constraints on language learning. Cognitive Science. 1990;14:11–28. doi: 10.1016/0364-0213(90)90024-Q [DOI] [Google Scholar]
  • 88.DeKeyser R, Larson-Hall J. What does the critical period really mean? In Kroll JF, de Groot AMB, editors. Handbook of bilingualism: Psycholinguistic approaches. Oxford: Oxford University Press; 2005. p. 88–108. [Google Scholar]
  • 89.Kuhl PK. Brain mechanisms in early language acquisition. Neuron. 2010;67(5):713–727. doi: 10.1016/j.neuron.2010.08.038 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90.Mayberry RI. Early language acquisition and adult language ability: What sign language reveals about the critical period for language. In Marschark M, Spencer PE, editors. The Oxford Handbook of Deaf Studies, Language, and Education, Volume 2. Oxford: Oxford University Press; 2010. p. 281–291. doi: 10.1093/oxfordhb/9780195390032.013.0019 [DOI] [Google Scholar]
  • 91.Granena G, Long MH. Age of onset, length of residence, language aptitude, and ultimate L2 attainment in three linguistic domains. Second Language Research. 2013;29(3):311–343. [Google Scholar]
  • 92.Werker JF, Hensch TK. Critical periods in speech perception: new directions. Annual Review of Psychology. 2015;66:173–196. doi: 10.1146/annurev-psych-010814-015104 [DOI] [PubMed] [Google Scholar]
  • 93.Hartshorne JK, Tenenbaum JB, Pinker S. A critical period for second language acquisition: Evidence from 2/3 million English speakers. Cognition. 2018;177:263–277. doi: 10.1016/j.cognition.2018.04.007 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 94.Mayberry RI, Kluender R. Rethinking the critical period for language: New insights into an old question from American Sign Language. Bilingualism: Language and Cognition. 2018;21(5):938–944. doi: 10.1017/S1366728918000585 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 95.Saito K. Age effects in spoken second language vocabulary attainment beyond the critical period. Studies in Second Language Acquisition. 2022;46(1):3–27. doi: 10.1017/S0272263122000432 [DOI] [Google Scholar]
  • 96.Bialystok E. The structure of age: In search of barriers to second language acquisition. Second Language Research. 1997;13(2):116–137. doi: 10.2167/jmmd555.0 [DOI] [Google Scholar]
  • 97.DeKeyser R, Alfi-Shabtay I, Ravid D. Cross-linguistic evidence for the nature of age effects in second language acquisition. Applied Psycholinguistics. 2010;31(3):413–438. doi: 10.1017/S0142716410000056 [DOI] [Google Scholar]
  • 98.Birdsong D. Plasticity, variability and age in second language acquisition and bilingualism. Front Psychol. 2018;9:81. doi: 10.3389/fpsyg.2018.00081 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 99.Birdsong D, Molis M. On the evidence for maturational constraints in second-language acquisition. Journal of Memory and Language. 2001;44:235–249. doi: 10.1006/jmla.2000.2750 [DOI] [Google Scholar]
  • 100.Bylund E, Abrahamsson N, Hyltenstam K, Norrman G. Revisiting the bilingual lexical deficit: The impact of age of acquisition. Cognition. 2019;182:45–49. doi: 10.1016/j.cognition.2018.08.020 [DOI] [PubMed] [Google Scholar]
  • 101.Pfenninger SE, Singleton D. Starting age overshadowed: The primacy of differential environmental and family support effects on second language attainment in an instructional context. Language Learning. 2019;69(Suppl 1):207–234. doi: 10.1111/lang.12318 [DOI] [Google Scholar]
  • 102.Fulceri F, Morelli M, Santocchi E, Cena H, Del Bianco T, Narzisi A, et al. Gastrointestinal symptoms and behavioral problems in preschoolers with Autism Spectrum Disorder. Dig Liver Dis. 2016;48(3):248–254. doi: 10.1016/j.dld.2015.11.026 [DOI] [PubMed] [Google Scholar]
  • 103.Hirata I, Mohri I, Kato-Nishimura K, Tachibana M, Kuwada A, Kagitani-Shimono K, et al. Sleep problems are more frequent and associated with problematic behaviors in preschoolers with autism spectrum disorder. Res Dev Disabil. 2016;49–50:86–99. doi: 10.1016/j.ridd.2015.11.002 [DOI] [PubMed] [Google Scholar]
  • 104.Levy SE, Giarelli E, Li-Ching L, Schieve LA, Kirby RS, Cuniff C, et al. Autism spectrum disorder and co-occurring developmental, psychiatric, and medical conditions among children in multiple populations of the United States. Journal of Developmental and Behavioral Pediatrics. 2010;31(4):267–275. doi: 10.1097/DBP.0b013e3181d5d03b [DOI] [PubMed] [Google Scholar]
  • 105.Lundström S, Reichenberg A, Melke J, Råstam M, Kerekes N, Lichtenstein P, et al. Autism spectrum disorders and coexisting disorders in a nationwide Swedish twin study. J Child Psychol Psychiatry. 2015;56(6):702–710. doi: 10.1111/jcpp.12329 [DOI] [PubMed] [Google Scholar]
  • 106.Magnusdottir K, Saemundsen E, Einarsson BL, Magnusson P, Njardvik U. The impact of attention deficit/hyperactivity disorder on adaptive functioning in children diagnosed late with autism spectrum disorder: A comparative analysis. Research in Autism Spectrum Disorders. 2016;23:28–35. doi: 10.1016/j.rasd.2015.11.012 [DOI] [Google Scholar]
  • 107.Benesse Educational Research and Development Institute. Daigakusei-no gakushuu seikatsu jittai chousa houkokusho (Report on the Learning and Living Conditions of University Students); 2021.
  • 108.Guide Internship. What is the average GPA for university grades typically? Internship Guide; 2023. Available from: https://internshipguide.jp/columns/view/gpa-average [Google Scholar]
  • 109.Aya K. ‘nihon-no daigaku-ni-okeru GPA-seido-no dounyu-to unyou-ni miidasareru tokuchou-to mondaiten: webu-kennsaku-ni yoru kenkyuu-chousa’. (A Study on the Characteristics and Issues in the Introduction and Implementation of the GPA System in Japanese Universities—Research Based on Web Search). PC Conference Proceedings. 2017; 259–262.
  • 110.Corsello C, Spence S, Lord C. Autism diagnostic observation schedule, Second Edition (ADOS-2) Training videos guidebook (Part I): Modules 1–4. Torrance, CA: Western Psychological Services; 2012. [Google Scholar]
  • 111.Wiesner D. Tuesday. Houghton Mifflin Harcourt Publishing Company; 1991. [Google Scholar]
  • 112.Martin JR. English text: System and structure. Amsterdam: John Benjamins Publishing Company; 1992. ISBN: 9027220794 [Google Scholar]
  • 113.Ras G, van Gerven M, Haselager P. Explanation methods in deep learning: Users, values, concerns and challenges. In Explainable and interpretable models in computer vision and machine learning. Springer; 2018:19–36. [Google Scholar]
  • 114.Rudin C, Chen C, Chen Z, Huang H, Semenova L, Zhong C. Interpretable machine learning: Fundamental principles and 10 grand challenges. ArXiv. 2021; abs/2103.11251. [Google Scholar]
  • 115.Cohen J. Statistical power analysis. Current Directions in Psychological Science. 1992;1(3):98–101. doi: 10.1111/1467-8721.ep10768783 [DOI] [Google Scholar]
  • 116.Ellis PD. The essential guide to effect sizes: Statistical power, meta-analysis and the interpretation of research results. Cambridge: Cambridge University Press; 2010. [Google Scholar]
  • 117.Button KS, Ioannidis JPA, Mokrysz C, Nosek BA, Flint J, Robinson ESJ, et al. Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience. 2013;14:365–376. doi: 10.1038/nrn3475 [DOI] [PubMed] [Google Scholar]
  • 118.Bruner JS. From communication to language—a psychological perspective. Cognition. 1975;3:255–287. [Google Scholar]
  • 119.Caffarel A, Martin JR, Matthiessen CMIM, editors. Language typology: A functional perspective. Amsterdam/Philadelphia: John Benjamins Publishing Company; 2004. ISBN: 9781588115591 [Google Scholar]
  • 120.Halliday MAK. Typology and the exotic. In McIntosh A, Halliday MAK, editors. Patterns of language: papers in general, descriptive and applied linguistics. London: Longman; 1966. p. 165–182. ISBN: 9780582523968 [Google Scholar]
  • 121.Matthiessen D. Lexicogrammatical cartography. London: Tokyo: International Language Science Publisher; 1995. ISBN: 4877180028 [Google Scholar]

Associated Data


Supplementary Materials

S1 File. Supplementary materials.

(DOCX)

pone.0311209.s001.docx (571KB, docx)
S2 File. Supplementary tables.

(DOCX)

pone.0311209.s002.docx (37.6KB, docx)

Data Availability Statement

All relevant data are within the paper.


Articles from PLOS One are provided here courtesy of PLOS
