Author manuscript; available in PMC: 2024 Mar 23.
Published in final edited form as: IEEE/ACM Trans Audio Speech Lang Process. 2023 Mar 23;31:1360–1370. doi: 10.1109/taslp.2023.3260711

TABLE III:

Comparison of different methods for an untrained number of speakers. (a) Number of speakers; (b) interaction pattern.

             Type     Max                    Half                   None
(a)  (b)     Metric   SI-SNR  PESQ  ESTOI    SI-SNR  PESQ  ESTOI    SI-SNR  PESQ  ESTOI
4    12341   Mix.       −2.4  1.98  57.5       −2.5  2.12  62.6       −2.8  2.31  71.8
             PIT         7.6  2.65  76.1        7.6  2.77  79.5        9.1  2.92  85.4
             AT         13.0  3.07  83.3       14.7  3.28  85.9       16.0  3.49  89.3
             De-AT      12.0  3.02  81.7       14.2  3.28  85.0       15.7  3.50  89.0
             TSE        13.9  3.17  84.4       15.2  3.36  86.3       16.5  3.57  89.3
5    123451  Mix.       −3.6  1.98  55.7       −3.7  2.14  62.8       −4.0  2.32  71.8
             PIT         4.9  2.55  73.5        4.3  2.67  77.4        6.0  2.85  85.0
             AT         12.2  3.04  82.4       14.4  3.31  86.1       15.8  3.51  89.6
             De-AT      11.4  3.02  81.0       14.0  3.30  85.2       15.5  3.51  89.3
             TSE        13.4  3.17  83.7       15.1  3.40  86.4       16.3  3.59  89.3