Table 6.
PD vs. HC classification EER (in %) obtained with different classifiers: MFCC-GMM baseline, and x-vectors combined either with cosine similarity (alone and with LDA) or with PLDA, with and without data augmentation.
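| Classifier | High-quality microphone | High-quality microphone | High-quality microphone | High-quality microphone | Telephone | Telephone | Telephone | Telephone |
|---|---|---|---|---|---|---|---|---|
| | Males | Males | Females | Females | Males | Males | Females | Females |
| | Repet | Monol | Repet | Monol | Repet | Monol | Repet | Monol |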
| MFCC-GMM | 22 | 26 | 42 | 45 | 35 | 36 | 42 | 40 |
| x-vec + cos | 32 | 35 | 51 | 41 | 39 | 33 | 49 | 43 |
| x-vec + LDA + cos | 22 | 27 | 39 | 32 | 32 | 35 | 34 | 34 |
| x-vec + augLDA + cos | 24 | 25 | 34 | 30 | 33 | 33 | 39 | 33 |
| x-vec + PLDA | 24 | 28 | 39 | 35 | 33 | 36 | 34 | 36 |
| x-vec + augPLDA | 25 | 25 | 33 | 30 | 31 | 33 | 37 | 33 |
The datasets are male and female recordings made with a high-quality microphone or over the telephone. The analyzed tasks are free speech (monolog, "Monol") and sentence repetition ("Repet"; combined with reading for the high-quality microphone recordings). Bold numbers indicate the best EERs for each dataset.
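
For readers unfamiliar with the metric, the following is a minimal sketch of how an EER in % could be computed from cosine-similarity scores. It is not the authors' implementation: the function names `cosine_score` and `equal_error_rate` are hypothetical, the labels are assumed to be 1 for PD and 0 for HC, and higher scores are assumed to indicate PD. On a finite set of scores the false-acceptance and false-rejection rates rarely cross exactly, so the sketch takes the threshold where they are closest and reports their mean.

```python
import numpy as np

def cosine_score(a, b):
    """Cosine similarity between two embedding vectors (e.g. x-vectors)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def equal_error_rate(scores, labels):
    """Approximate EER in %, assuming labels 1 = PD, 0 = HC and
    that a higher score means 'more PD-like' (both are assumptions)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Candidate thresholds: every observed score, plus +inf (reject everything).
    thresholds = np.concatenate([np.unique(scores), [np.inf]])
    best_gap, eer = np.inf, None
    for t in thresholds:
        pred_pd = scores >= t
        far = np.mean(pred_pd[labels == 0])    # HC wrongly classified as PD
        frr = np.mean(~pred_pd[labels == 1])   # PD wrongly classified as HC
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return 100.0 * eer

# Toy usage with made-up scores (not data from the table).
toy_scores = [0.9, 0.3, 0.6, 0.1]
toy_labels = [1, 1, 0, 0]
toy_eer = equal_error_rate(toy_scores, toy_labels)
```

A lower EER means better separation of the two classes; a value near 50% corresponds to chance-level performance on a balanced task.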