2024 Mar 5;15:1342835. doi: 10.3389/fpsyt.2024.1342835

Table 1.

Mental health related vocal features implemented in the Mental Fitness study app.

| Feature name | Description | Correlation with depression | System or process covered | Feature value distribution (25th-75th percentile), reference dataset (19,615 samples, 19,615 individuals) | Feature value distribution (25th-75th percentile), current study (1,336 samples from 104 individuals) |
| --- | --- | --- | --- | --- | --- |
| Jitter | Variation in the time between consecutive pitch periods | Positive | Vocal cord control | 4.9 – 7.8% | 5.0 – 7.7% |
| Shimmer | Variation in the amplitude of consecutive pitch periods | Positive | Vocal cord control | 2.5 – 6.6% | 2.3 – 5.8% |
| Pitch variability | Intentional variation in voice pitch used for intonation | Negative | Higher-level cognitive process | 0.15 – 0.28 octaves | 0.17 – 0.30 octaves |
| Energy variability | Intentional variation in energy (intensity) of voice used for emphasis | Negative | Higher-level cognitive process | 6.9 – 9.5 dB | 6.9 – 8.8 dB |
| Vowel space | Separation between frequencies of the first two formants | Negative | Coordination of vocal tract articulators | 0.33 – 0.43 MHz² | 0.34 – 0.43 MHz² |
| Phonation duration | Average duration from phonation onset to offset (glottal vibration) | Negative | Glottal coordination | 201 – 294 msec | 198 – 272 msec |
| Speech rate | Number of words spoken per minute | Negative | Higher-level cognitive process | 75 – 125 words/min | 79 – 120 words/min |
| Pause duration | Median duration of gaps between voice activity | Positive | Higher-level cognitive process | 0.31 – 0.61 sec | 0.32 – 0.56 sec |

Features were selected based on available evidence from published studies on vocal biomarker research in depression. A summary score algorithm was developed using normalized values of the individual features, which were obtained from a reference dataset described in the main text. Feature value distributions obtained from the current study closely matched the reference distributions.
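
The footnote does not spell out the summary score algorithm, so the following is only an illustrative sketch of how features could be normalized against a reference distribution and combined. It assumes, purely for illustration, that each feature is centered on the midpoint of the reference 25th-75th percentile range, scaled by that interquartile range, sign-flipped so that higher always means "more depression-like", and then averaged. The names `REFERENCE`, `normalize`, and `summary_score`, the dictionary keys, and the equal weighting are all hypothetical and are not taken from the study.

```python
from statistics import mean

# Reference 25th and 75th percentiles from Table 1, plus the direction of the
# reported correlation with depression (+1 = positive, -1 = negative).
# The key names are hypothetical labels chosen for this sketch.
REFERENCE = {
    "jitter_pct": (4.9, 7.8, +1),
    "shimmer_pct": (2.5, 6.6, +1),
    "pitch_variability_oct": (0.15, 0.28, -1),
    "energy_variability_db": (6.9, 9.5, -1),
    "vowel_space": (0.33, 0.43, -1),
    "phonation_duration_ms": (201, 294, -1),
    "speech_rate_wpm": (75, 125, -1),
    "pause_duration_s": (0.31, 0.61, +1),
}


def normalize(name, value):
    """Scale a raw feature value by the reference interquartile range.

    Assumption: the midpoint of the 25th-75th percentile range stands in for
    the reference median, which Table 1 does not report.
    """
    p25, p75, direction = REFERENCE[name]
    midpoint = (p25 + p75) / 2
    return direction * (value - midpoint) / (p75 - p25)


def summary_score(features):
    """Average the normalized features (equal weights are an assumption)."""
    return mean(normalize(name, value) for name, value in features.items())
```

Under these assumptions, a recording whose features all sit at the reference midpoints scores 0, and a speech rate at the reference 75th percentile (125 words/min) contributes -0.5, reflecting the negative correlation listed in the table.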