Author manuscript; available in PMC: 2024 Mar 23.

Published in final edited form as: IEEE/ACM Trans Audio Speech Lang Process. 2023 Mar 23;31:1360–1370. doi: 10.1109/taslp.2023.3260711

TABLE III:

Comparison of different methods for numbers of speakers not seen during training. (a) Number of speakers; (b) interaction pattern.

			Type: Max			Type: Half			Type: None
(a)	(b)	Method	SI-SNR (dB)	PESQ	ESTOI (%)	SI-SNR (dB)	PESQ	ESTOI (%)	SI-SNR (dB)	PESQ	ESTOI (%)
4	12341	Mix.	−2.4	1.98	57.5	−2.5	2.12	62.6	−2.8	2.31	71.8
		PIT	7.6	2.65	76.1	7.6	2.77	79.5	9.1	2.92	85.4
		AT	13.0	3.07	83.3	14.7	3.28	85.9	16.0	3.49	89.3
		De-AT	12.0	3.02	81.7	14.2	3.28	85.0	15.7	3.50	89.0
		TSE	13.9	3.17	84.4	15.2	3.36	86.3	16.5	3.57	89.3
5	123451	Mix.	−3.6	1.98	55.7	−3.7	2.14	62.8	−4.0	2.32	71.8
		PIT	4.9	2.55	73.5	4.3	2.67	77.4	6.0	2.85	85.0
		AT	12.2	3.04	82.4	14.4	3.31	86.1	15.8	3.51	89.6
		De-AT	11.4	3.02	81.0	14.0	3.30	85.2	15.5	3.51	89.3
		TSE	13.4	3.17	83.7	15.1	3.40	86.4	16.3	3.59	89.3