Table 1.
Mental health related vocal features implemented in the Mental Fitness study app.
| Feature name | Description | Correlation with depression | System or process covered | Feature value distribution (25th-75th percentile), reference dataset (19,615 samples from 19,615 individuals) | Feature value distribution (25th-75th percentile), current study (1,336 samples from 104 individuals) |
|---|---|---|---|---|---|
| Jitter | Variation in the time between consecutive pitch periods | Positive | Vocal cord control | 4.9 – 7.8% | 5.0 – 7.7% |
| Shimmer | Variation in the amplitude of consecutive pitch periods | Positive | Vocal cord control | 2.5 – 6.6% | 2.3 – 5.8% |
| Pitch variability | Intentional variation in voice pitch used for intonation | Negative | Higher-level cognitive process | 0.15 – 0.28 octaves | 0.17 – 0.30 octaves |
| Energy variability | Intentional variation in energy (intensity) of voice used for emphasis | Negative | Higher-level cognitive process | 6.9 – 9.5 dB | 6.9 – 8.8 dB |
| Vowel space | Separation between frequencies of the first two formants | Negative | Coordination of vocal tract articulators | 0.33 – 0.43 MHz² | 0.34 – 0.43 MHz² |
| Phonation duration | Average duration from phonation onset to offset (glottal vibration) | Negative | Glottal coordination | 201 – 294 msec | 198 – 272 msec |
| Speech rate | Number of words spoken per minute | Negative | Higher-level cognitive process | 75 – 125 words/min | 79 – 120 words/min |
| Pause duration | Median duration of gaps between voice activity | Positive | Higher-level cognitive process | 0.31 – 0.61 sec | 0.32 – 0.56 sec |
Features were selected based on available evidence from published studies on vocal biomarker research in depression. A summary score algorithm was developed using normalized values of the individual features, which were obtained from a reference dataset described in the main text. Feature value distributions obtained from the current study closely matched the reference distributions.
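
The note above does not specify how the features are normalized or combined, so the following is only an illustrative sketch of one way a summary score could be built from reference-normalized features. It assumes robust scaling by the reference 25th-75th percentile range from Table 1, a sign taken from the "Correlation with depression" column, and an unweighted mean. The feature keys, the `FeatureRef` class, and the `normalize` and `summary_score` functions are hypothetical names, not the study's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureRef:
    q25: float  # reference 25th percentile (Table 1)
    q75: float  # reference 75th percentile (Table 1)
    sign: int   # +1 if positively correlated with depression, -1 if negatively


# Reference percentiles copied from Table 1; the key names are hypothetical.
REFERENCE = {
    "jitter_pct": FeatureRef(4.9, 7.8, +1),
    "shimmer_pct": FeatureRef(2.5, 6.6, +1),
    "pitch_variability_oct": FeatureRef(0.15, 0.28, -1),
    "energy_variability_db": FeatureRef(6.9, 9.5, -1),
    "vowel_space_mhz2": FeatureRef(0.33, 0.43, -1),
    "phonation_duration_ms": FeatureRef(201.0, 294.0, -1),
    "speech_rate_wpm": FeatureRef(75.0, 125.0, -1),
    "pause_duration_s": FeatureRef(0.31, 0.61, +1),
}


def normalize(name: str, value: float) -> float:
    """Scale a raw value by the reference interquartile range (an assumed scheme).

    The result is 0 at the midpoint of the reference 25th-75th percentile range
    and is oriented so that larger values point in the direction Table 1
    associates with depression.
    """
    ref = REFERENCE[name]
    midpoint = (ref.q25 + ref.q75) / 2
    iqr = ref.q75 - ref.q25
    return ref.sign * (value - midpoint) / iqr


def summary_score(sample: dict[str, float]) -> float:
    """Unweighted mean of the normalized features present in the sample."""
    if not sample:
        raise ValueError("sample must contain at least one feature")
    return sum(normalize(name, value) for name, value in sample.items()) / len(sample)


if __name__ == "__main__":
    example = {"speech_rate_wpm": 75.0, "pause_duration_s": 0.61}
    print(summary_score(example))
```

The orientation chosen here (higher means more depression-like) is arbitrary; a score presented to users could just as well be inverted or rescaled. Weighted combinations or z-scores based on the reference mean and standard deviation would be equally plausible alternatives.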