2020 Sep 25;6:26. doi: 10.1038/s41537-020-00115-2

Table 4.

The most stable features for predicting BvA and alogia from vocal acoustics.

Feature name	How feature is computed	What feature means
Alogia
Unvoiced Segment Length: SD (StddevUnvoicedSegmentLength)	Standard deviation of unvoiced segment lengths	Captures the variability in pause length. This is potentially related to articulation rate and speech production, and is conceptually critical to alogia.
Blunted affect
Mel-Frequency-Cepstral-Coefficients – 2: SD (mfcc2_sma3_stddevNorm)	Computed as a spectrum of transformed frequency values over time	Captures variability in the global signature of the signal spectrum over time, using a short-term frequency representation on a nonlinear mel frequency scale. It broadly reflects global changes in the vocal tract and is critical for speech recognition in humans and in automated systems. MFCC2 reflects finer spectral details than MFCC1.
Harmonic Difference: H1 – A3 (logRelF0-H1-A3_sma3nz_amean)	Mean ratio of the energy of the first F0 harmonic (H1) to the energy of the highest harmonic in the third formant range (A3)	Ratio of the energy of the first F0 harmonic to that of the third F0 harmonic, generated by the vocal folds as opposed to the vocal tract. A measure of “spectral tilt” (i.e., tendency for lower frequencies to have less volume), associated with breathy voice in men and a lack of “creaky voice”.
Both blunted vocal affect and alogia
Second Formant: M (F2frequency_sma3nz_amean)	Average of formant 2 frequency values	Captures spectral shaping of vocal signal, computed as the average frequency from vowel shaping. The second formant typically reflects tongue body movement from front to back.

Acoustic features determined to be most stable using stability selection.

BvA: blunted vocal affect.
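As an illustration of the alogia feature in the table, the standard deviation of unvoiced segment length can be sketched from a sequence of per-frame voicing decisions. This is a minimal sketch and not the extraction pipeline used in the study: the function names, the 10 ms frame hop, and the use of the population standard deviation are all assumptions made for the example.

```python
from itertools import groupby
from statistics import pstdev


def unvoiced_segment_lengths(voiced_flags, hop_seconds=0.01):
    """Return the duration (in seconds) of each run of consecutive unvoiced frames.

    voiced_flags: iterable of truthy/falsy values, one per analysis frame.
    hop_seconds: assumed time step between frames (hypothetical default of 10 ms).
    """
    lengths = []
    for is_voiced, run in groupby(voiced_flags, key=bool):
        if not is_voiced:
            n_frames = sum(1 for _ in run)
            lengths.append(n_frames * hop_seconds)
    return lengths


def stddev_unvoiced_segment_length(voiced_flags, hop_seconds=0.01):
    """Population standard deviation of unvoiced segment lengths (assumed convention)."""
    lengths = unvoiced_segment_lengths(voiced_flags, hop_seconds)
    if not lengths:
        return 0.0  # no pauses at all, so no variability to report
    return pstdev(lengths)


# Example: two pauses, lasting 2 frames and 4 frames.
flags = [1, 1, 0, 0, 1, 0, 0, 0, 0, 1]
pause_sd = stddev_unvoiced_segment_length(flags)
```

A speaker whose pauses are all of similar length yields a value near zero, while a mix of short and long pauses yields a larger value, which is the variability the table row describes.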