TABLE II. Normalized MSE, KL-Divergence, and JS-Divergence of MFCC Reconstructions in Continuous Recordings (Bold Indicates Lowest Error for Each Maximum Number of Speakers).
| Maximum number of speakers | Deep-MASKS: empty vector | Deep-MASKS: i-vector | Deep-MASKS: x-vector | Deep-MASKS: d-vector | Baseline: VFL | Baseline: DAN |
|---|---|---|---|---|---|---|
| Normalized MSE (in a unit of ) | | | | | | |
| 1 | 0.845 (0.011) | 0.844 (0.012) | 0.849 (0.011) | 0.763 (0.011) | 0.775 (0.012) | - |
| 2 | 1.178 (0.012) | 1.147 (0.012) | 1.075 (0.012) | 1.069 (0.013) | 1.071 (0.012) | 1.152 (0.011) |
| 3 | 1.184 (0.014) | 1.149 (0.014) | 1.075 (0.014) | 1.070 (0.014) | 1.741 (0.012) | 1.155 (0.011) |
| 4 | 1.183 (0.018) | 1.149 (0.018) | 1.075 (0.017) | 1.070 (0.018) | 1.743 (0.012) | 1.155 (0.011) |
| KL-divergence | | | | | | |
| 1 | 0.012 (0.002) | 0.012 (0.003) | 0.013 (0.002) | 0.012 (0.002) | 0.012 (0.002) | - |
| 2 | 1.05 (0.03) | 0.25 (0.02) | 0.26 (0.03) | 0.21 (0.02) | 0.21 (0.02) | 0.62 (0.03) |
| 3 | 1.06 (0.03) | 0.24 (0.02) | 0.26 (0.03) | 0.21 (0.01) | 0.21 (0.01) | 0.62 (0.03) |
| 4 | 1.05 (0.03) | 0.24 (0.02) | 0.25 (0.03) | 0.21 (0.01) | 0.21 (0.01) | 0.62 (0.03) |
| JS-divergence | | | | | | |
| 1 | 0.008 (0.001) | 0.009 (0.008) | 0.009 (0.007) | 0.009 (0.008) | 0.009 (0.008) | - |
| 2 | 0.32 (0.05) | 0.14 (0.03) | 0.10 (0.02) | 0.08 (0.02) | 0.08 (0.02) | 0.18 (0.03) |
| 3 | 0.32 (0.05) | 0.14 (0.03) | 0.10 (0.01) | 0.08 (0.02) | 0.08 (0.02) | 0.18 (0.03) |
| 4 | 0.32 (0.05) | 0.14 (0.03) | 0.11 (0.02) | 0.08 (0.02) | 0.08 (0.02) | 0.18 (0.03) |
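
The table reports three error measures but does not state how each one is computed. The following is a minimal sketch of one common way to compute them, not the procedure used to produce the numbers above. It assumes that the MSE is normalized by the energy of the reference coefficients, that the divergences are taken between histograms of coefficient values, and that natural logarithms are used. All function names (`normalized_mse`, `histogram`, `kl_divergence`, `js_divergence`) are hypothetical.

```python
import math


def normalized_mse(reference, estimate):
    """Squared error divided by the energy of the reference (assumed normalization).

    Raises ZeroDivisionError if the reference is all zeros.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    num = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    den = sum(r ** 2 for r in reference)
    return num / den


def histogram(values, lo, hi, bins=10, eps=1e-9):
    """Normalized histogram over [lo, hi]; eps keeps every bin strictly positive."""
    counts = [eps] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp out-of-range values into the first or last bin.
        idx = min(max(int((v - lo) / width), 0), bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL of p and q to their midpoint m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

Under these assumptions, the JS-divergence is symmetric and bounded above by ln 2, while the KL-divergence is asymmetric and unbounded. That is consistent with the JS values in the table being smaller than the corresponding KL values.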
