Author manuscript; available in PMC: 2024 Mar 23.
Published in final edited form as: IEEE/ACM Trans Audio Speech Lang Process. 2023 Mar 23;31:1360–1370. doi: 10.1109/taslp.2023.3260711

TABLE II:

Comparison of different methods on the numbers of speakers seen during training. Columns: (a) number of speakers, (b) interaction pattern, (c) whether the model was trained on that interaction pattern.

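| (a) | (b) | (c) | Type: Metric | Max: SI-SNR (dB) | Max: PESQ | Max: ESTOI (%) | Half: SI-SNR (dB) | Half: PESQ | Half: ESTOI (%) | None: SI-SNR (dB) | None: PESQ | None: ESTOI (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 1212 | | Mix. | −0.6 | 1.67 | 51.4 | −0.7 | 1.86 | 59.5 | −1.0 | 2.28 | 71.9 |
| 2 | 1212 | | PIT | 11.4 | 2.64 | 76.4 | 12.4 | 2.79 | 80.3 | 14.2 | 3.04 | 86.4 |
| 2 | 1212 | | AT | 13.4 | 2.90 | 81.7 | 14.8 | 3.09 | 85.0 | 16.4 | 3.38 | 89.6 |
| 2 | 1212 | | De-AT | 12.6 | 2.86 | 80.3 | 14.5 | 3.06 | 84.3 | 16.4 | 3.39 | 89.1 |
| 2 | 1212 | | TSE | 13.8 | 2.97 | 82.7 | 15.0 | 3.12 | 85.4 | 16.9 | 3.43 | 89.3 |
| 2 | 1221 | | Mix. | −0.6 | 1.77 | 51.2 | −0.8 | 2.06 | 60.1 | −1.0 | 2.31 | 71.6 |
| 2 | 1221 | | PIT | 11.3 | 2.71 | 76.0 | 12.5 | 2.92 | 80.5 | 14.3 | 3.06 | 86.0 |
| 2 | 1221 | | AT | 13.2 | 2.97 | 81.2 | 14.7 | 3.28 | 85.0 | 16.5 | 3.47 | 89.3 |
| 2 | 1221 | | De-AT | 12.5 | 2.95 | 80.0 | 14.5 | 3.25 | 84.4 | 16.7 | 3.49 | 88.9 |
| 2 | 1221 | | TSE | 13.9 | 3.07 | 82.8 | 15.1 | 3.32 | 85.6 | 17.1 | 3.55 | 89.1 |
| 2 | 122221 | | Mix. | −3.7 | 2.00 | 51.1 | −3.8 | 2.16 | 60.2 | −3.9 | 2.32 | 71.6 |
| 2 | 122221 | | PIT | 11.2 | 2.83 | 76.1 | 12.5 | 2.94 | 80.5 | 14.1 | 3.03 | 85.7 |
| 2 | 122221 | | AT | 12.9 | 3.23 | 80.9 | 14.6 | 3.44 | 84.9 | 16.2 | 3.52 | 89.2 |
| 2 | 122221 | | De-AT | 12.4 | 3.22 | 80.0 | 14.5 | 3.40 | 84.5 | 16.6 | 3.53 | 89.0 |
| 2 | 122221 | | TSE | 13.8 | 3.33 | 82.7 | 15.2 | 3.49 | 85.9 | 17.1 | 3.62 | 89.5 |
| 3 | 1231 | | Mix. | −0.6 | 1.86 | 55.3 | −0.8 | 2.08 | 62.7 | −1.0 | 2.32 | 71.7 |
| 3 | 1231 | | PIT | 10.3 | 2.66 | 76.0 | 12.5 | 2.90 | 81.5 | 13.3 | 3.04 | 86.4 |
| 3 | 1231 | | AT | 13.2 | 2.97 | 82.4 | 15.2 | 3.26 | 86.0 | 16.3 | 3.46 | 89.3 |
| 3 | 1231 | | De-AT | 12.2 | 2.93 | 80.8 | 14.7 | 3.23 | 85.1 | 16.2 | 3.48 | 88.9 |
| 3 | 1231 | | TSE | 14.0 | 3.08 | 83.8 | 15.4 | 3.31 | 86.4 | 16.8 | 3.54 | 89.3 |
| 3 | 123231 | | Mix. | −3.6 | 1.98 | 55.6 | −3.7 | 2.14 | 62.6 | −3.9 | 2.32 | 71.6 |
| 3 | 123231 | | PIT | 9.5 | 2.68 | 75.8 | 11.6 | 2.87 | 81.0 | 12.2 | 2.97 | 85.7 |
| 3 | 123231 | | AT | 12.7 | 3.07 | 82.5 | 14.6 | 3.30 | 85.8 | 15.9 | 3.50 | 89.3 |
| 3 | 123231 | | De-AT | 11.3 | 3.03 | 80.5 | 14.2 | 3.31 | 84.9 | 15.8 | 3.50 | 89.1 |
| 3 | 123231 | | TSE | 13.6 | 3.18 | 83.8 | 15.3 | 3.40 | 86.5 | 16.6 | 3.60 | 89.5 |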
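
For readers unfamiliar with the SI-SNR columns of Table II, the following is a minimal sketch of the commonly used definition of scale-invariant signal-to-noise ratio. It is an assumption that the evaluation behind the table follows this definition. The function name `si_snr` and the stabilizing constant `eps` are illustrative and do not come from the paper.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length signals.

    Both inputs are plain sequences of floats. This follows the commonly
    used definition; it is assumed, not confirmed, to match the paper.
    """
    if len(estimate) != len(target):
        raise ValueError("signals must have the same length")
    n = len(target)

    # Remove the mean so that a DC offset does not affect the score.
    est_mean = sum(estimate) / n
    tgt_mean = sum(target) / n
    est = [x - est_mean for x in estimate]
    tgt = [x - tgt_mean for x in target]

    # Project the estimate onto the target to get the scaled target part.
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt) + eps
    scale = dot / tgt_energy
    s_target = [scale * t for t in tgt]

    # Whatever the target does not explain is treated as noise.
    e_noise = [e - s for e, s in zip(est, s_target)]

    num = sum(s * s for s in s_target)
    den = sum(v * v for v in e_noise) + eps
    return 10.0 * math.log10(num / den + eps)
```

Because the estimate is projected onto the target before the ratio is taken, rescaling the estimate leaves the score essentially unchanged, which is what makes the measure scale-invariant. Higher values indicate that more of the estimate's energy is explained by the target signal.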