Table 5. Features selected via JMIM for building the highest-performing machine learning algorithm (Model I), ranked by importance.
Component (Com): 1: Phonetic motor planning; 2: Semantic and syntactic levels of language organization; 3: Psycholinguistic cues
| Linguistic and acoustic features | Com | Linguistic and acoustic features | Com |
|---|---|---|---|
| Linregc2 of voice probability | 2 | Part-of-Speech rate | 3 |
| Common verbs | 4 | LinregerrQ of simple moving average of LSP frequency | 2 |
| Quartile 1 of MFCC | 2 | Functional words | 3 |
| Words not found in the LIWC dictionary | 4 | Textual Lexical Diversity | 3 |
| Article | 4 | Silence time for VERB within clauses | 3 |
| Std of LSP frequency | 2 | Content words | 3 |
| Quartile2-Quartile1 LSP frequency | 2 | Silence time for ADJ/ADV within clauses | 2 |
| Voiced Segments Per Second | 2 | Average of similarity score between clauses without stop word | 3 |
| Pause rate | 2 | Brunet’s Index | 3 |
| Total average silence duration in initial clauses | 2 | Indefinite articles | 3 |
| LinregerrQ of MFCC | 2 | Rate of negative adverbs | 2 |
| Words longer than six letters | 4 | Root type-token ratio | 3 |
| Definite articles | 2 | Interquartile range of the 3rd MFCC coefficient | 2 |
| Content Density | 3 | Analytical thinking (summary variables in LIWC that measure cognitive language style) | 4 |
| Lexical frequency | 3 | Corrected type-token ratio | 3 |
| Std of MFCC | 2 | Linregc2 of simple moving average of LSP frequency | 2 |
| Average Length of Unvoiced Segments | 2 | Linregc1 of perceived loudness | 4 |
| Proportion of clauses with a similarity score of zero with stop word | 3 | Honoré’s Statistic | 3 |
| Cognitive processes | 4 | Pitch | 4 |
| Skewness of LSP frequency | 2 | MaxPos of Simple Moving Average (sma) of LSP | 2 |
| Std of local Shimmer | 2 | Normalized standard deviation of simple moving average of F2 | 4 |
| Silence time for NOUNs within clauses | 3 | Normalized Std of simple moving average of the amplitude of F1 relative to F0 | 4 |
| Mean of Local Jitter | 2 | Determiners | 3 |
| Standard deviation of similarity score between clauses with stop word | 3 | 80th percentile of frequency (semitones from 27.5 Hz) | 2 |
| Relative pronouns rate | 3 | Ratio of standardized mean amplitude of F3 and F0 | 4 |
| Mean F0 Envelope | 2 | Std of Length of Unvoiced Segment | 2 |
| Total average silence duration per word within clauses | 3 | 80th Percentile of Loudness | 2 |
| Pronouns | 3 | Reference Rate to Reality | 3 |
| Std of rising slope of loudness | 2 | Average of similarity score between clauses with stop word | 3 |
| Std of local Jitter | 2 | Std of harmonic noise ratio | 4 |
| Mean ratio energy spectral harmonic | 2 | Unique word count | 3 |
| Speech rate | 2 | Proportion of clauses with a similarity score of zero without stop word | 3 |
| Nouns | 3 | Hypergeometric Distribution Diversity | 3 |
| Quartile2-Quartile3 of F0 | 2 | Word count | 4 |
| Long term average spectrum | 2 | Consecutive repeated clauses | 3 |
| Normalized standard deviation of the amplitude of F2 to F0 | 2 | Type-token ratio | 3 |
LSP: Line spectral pairs (LSPs) are used to represent linear prediction coefficients (LPCs) for transmission over a channel. LSPs have several properties, such as lower sensitivity to quantization noise, that make them superior to direct quantization of LPCs.
LinregerrQ: The quadratic error computed as the difference between the linear approximation and the actual contour.
Linregc1: The slope (m) of a linear approximation of the contour.
Linregc2: The offset (t) of a linear approximation of the contour.
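The linear-regression functionals described in these notes can be illustrated with a minimal sketch. It assumes the contour is sampled at frame indices 0, 1, ..., N-1 and is approximated by y = m·x + t. The function name `linear_regression_functionals` is hypothetical, and taking the quadratic error as the mean of the squared differences is an assumption; the feature extraction toolkit used for the table may normalize differently.

```python
import numpy as np

def linear_regression_functionals(contour):
    """Fit y = m*x + t to a contour and return slope, offset, and quadratic error.

    Hypothetical helper for illustration only; x is assumed to be the
    frame index 0..N-1, which may differ from the toolkit's convention.
    """
    y = np.asarray(contour, dtype=float)
    x = np.arange(len(y), dtype=float)

    # Least-squares line: polyfit returns coefficients from highest degree down.
    m, t = np.polyfit(x, y, deg=1)

    # Linear approximation of the contour at each frame.
    fitted = m * x + t

    # Quadratic error: mean squared difference between the line and the
    # actual contour (the normalization by N is an assumption).
    err_q = float(np.mean((fitted - y) ** 2))

    return {"linregc1": float(m), "linregc2": float(t), "linregerrQ": err_q}


# Example: a hypothetical loudness-like contour over four frames.
features = linear_regression_functionals([0.0, 1.0, 0.0, 1.0])
```

In this sketch a perfectly linear contour yields a quadratic error of zero, while an oscillating contour yields a larger error, so LinregerrQ can be read as a measure of how poorly a straight line describes the contour.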