Table 4. Performance comparison of the speaker embeddings with other pre-trained embeddings.
Model | DAIC-WOZ | | | | Vocal Mind | | | |
---|---|---|---|---|---|---|---|---|
Metric | | | BAc. | RMSE | | | BAc. | RMSE |
Mockingjay | 0.27 | 0.70 | 0.49 | 7.09 | 0.27 | 0.70 | 0.48 | 7.58 |
vq-wav2vec | 0.32 | 0.71 | 0.52 | 6.95 | 0.25 | 0.73 | 0.49 | 7.12 |
wav2vec-2.0 | 0.38 | 0.74 | 0.55 | 6.77 | 0.32 | 0.74 | 0.52 | 7.03 |
TRILL | 0.36 | 0.77 | 0.56 | 6.46 | 0.34 | 0.76 | 0.55 | 6.80 |
ECAPA (Proposed) | 0.46 | 0.79 | 0.63 | 6.31 | 0.34 | 0.81 | 0.57 | 6.62 |