Author manuscript; available in PMC: 2021 Apr 26.
Published in final edited form as: IEEE J Sel Top Signal Process. 2019 Nov 7;14(2):282–298. doi: 10.1109/jstsp.2019.2952087

TABLE I:

Summary of work covered in this review, including textual and acoustic features to assess cognitive and thought disorders that are automatically extracted from speech and language samples using NLP and speech signal processing. Note that in the abbreviations in the table, “Sz” refers to schizophrenia and “Sza” refers to the related schizoaffective disorder, which are considered together for simplicity in this summary.

| Category | Subcategory | Features or Methods Used | Cognitive & Thought Disorder(s) Assessed |
|---|---|---|---|
| Text-based | Lexical features | Bag of words vocabulary analysis | Semantic dementia (SD) [4]; Alzheimer’s disease (AD) [5] |
| | | Linguistic Inquiry & Word Count (LIWC) [6] | Mild cognitive impairment (MCI) [7]; schizophrenia (Sz/Sza) [8]; AD [9] |
| | | Lexical Diversity (TTR, MATTR, BI, HS, etc.) | AD [5], [10], [11]; primary progressive aphasia (PPA) [12]; Sz/Sza [13]; bipolar disorder (BPD) [13] |
| | | Lexical Density (content density, idea density, P-density) | MCI [14]; AD [5], [11], [9]; PPA [12]; chronic traumatic encephalopathy (CTE) [15]; Sz/Sza [13]; BPD [13] |
| | | Part-of-speech (POS) tagging | MCI [14]; PPA [12]; AD [5], [9]; Sz/Sza [16], [17], [18] |
| | Syntactical features | Constituency-based parse tree scores (Yngve [19], Frazier [20]) | AD [5], [9]; MCI [14]; PPA [12] |
| | | Dependency-based parse tree scores | MCI [14] |
| | | Speech graphs and attributes | Sz/Sza [21], [22]; BPD [21], [22]; MCI [23] |
| | Semantic features | Word & sentence embeddings: LSA [24] | Sz/Sza [25], [16], [17] |
| | | Word & sentence embeddings: neural word embeddings (word2vec [26], GloVe [27], etc.) | Sz/Sza [28], [18], [13]; BPD [13] |
| | | Word & sentence embeddings: neural sentence embeddings (SIF [29], InferSent [30], etc.) | Sz/Sza [18], [13]; BPD [13] |
| | | Topic modeling: LDA [31] | Sz/Sza [28], [8] |
| | | Topic modeling: vector-space topic modeling with neural networks | AD [32], [33]; MCI [33] |
| | | Semantic role labeling [34] | Sz/Sza [28] |
| | Pragmatics | Sentiment analysis | Sz/Sza [28] |
| Acoustic | Prosodic features | Temporal (pause rate, phonation rate, voiced durations, etc.) | MCI [14], [35], [36], [37]; AD [35], [38]; Sz/Sza [39]; frontotemporal lobar degeneration (FTLD) [40] |
| | | Fundamental frequency (F0) and trajectory | AD [38]; BPD [41] |
| | | Loudness and energy | AD [38] |
| | | Emotional content | AD [38] |
| | Spectral features | Formant trajectories (F1, F2, F3, etc.) | PPA [42]; AD [5] |
| | | Spectral centroid [43] | AD [38] |
| | | MFCC statistics [44] | PPA [42]; AD [5] |
| | Vocal quality | Jitter, shimmer, harmonic-to-noise ratio (HNR) | PPA [42]; AD [5], [38] |
| | ASR-related | Phone-level detection of filled pauses & temporal features | MCI [36], [37] |
| | | Improving WER for clinical data | AD [45], [9]; neurodegenerative dementia (ND) [46], [47], [48] |
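
Two of the lexical diversity measures named in the table, the type-token ratio (TTR) and the moving-average type-token ratio (MATTR), are simple enough to illustrate directly. The following is a minimal sketch, not the implementation used in any of the cited studies: the function names, the lowercase whitespace tokenizer, the default window of 50 tokens, and the fallback to plain TTR for short samples are all assumptions made here for illustration.

```python
from typing import List


def tokenize(text: str) -> List[str]:
    """Hypothetical tokenizer: lowercase and split on whitespace only."""
    return text.lower().split()


def type_token_ratio(tokens: List[str]) -> float:
    """TTR: number of unique word types divided by the number of tokens."""
    if not tokens:
        raise ValueError("tokens must be non-empty")
    return len(set(tokens)) / len(tokens)


def moving_average_ttr(tokens: List[str], window: int = 50) -> float:
    """MATTR: mean TTR over every contiguous window of `window` tokens.

    Assumption of this sketch: if the sample is shorter than the window,
    fall back to the plain TTR of the whole sample.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(tokens) < window:
        return type_token_ratio(tokens)
    ratios = [
        type_token_ratio(tokens[i:i + window])
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    sample = tokenize("the cat saw the dog and the dog saw the cat")
    print("TTR:", type_token_ratio(sample))
    print("MATTR (window=4):", moving_average_ttr(sample, window=4))
```

Plain TTR tends to fall as a sample gets longer, because frequent words repeat; averaging over fixed-size windows makes MATTR less sensitive to sample length, which matters when comparing speech samples of different durations.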