2019 May 15;5(5):eaav6134. doi: 10.1126/sciadv.aav6134

Table 1. Comparison of speech separation accuracy of ODAN with two other methods for separating two-speaker mixtures (WSJ0-mix2 dataset) and three-speaker mixtures (WSJ0-mix3 dataset).

The separation accuracy of ODAN, a causal system, is slightly lower than, but comparable to, that of the noncausal methods (within 0.1 dB for two speakers and within 0.4 dB for three speakers).

Number of Speakers	Model	Causal	SI-SNRi (dB)	SDRi (dB)	PESQ	ESTOI
Two speakers	Original mixture	–	0	0	2.02	0.56
	DAN-LSTM (11)	No	9.1	9.5	2.73	0.77
	uPIT-LSTM (15)	Yes	–	7.0	–	–
	ODAN	Yes	9.0	9.4	2.70	0.77
Three speakers	Original mixture	–	0	0	1.66	0.39
	DAN-LSTM (11)	No	7.0	7.4	2.13	0.56
	uPIT-BLSTM (15)	No	–	7.4	–	–
	DPCL++ (50)	No	7.1	–	–	–
	ODAN	Yes	6.7	7.2	2.03	0.55
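
The SI-SNRi column reports improvement in scale-invariant signal-to-noise ratio over the unprocessed mixture, which is why the "Original mixture" rows read 0. The sketch below shows one common way to compute it, assuming the widely used definition in which both signals are mean-removed and the estimate is projected onto the reference. The function names `si_snr` and `si_snr_improvement` and the toy signals are illustrative assumptions, not the authors' evaluation code, and SDRi, PESQ, and ESTOI are computed differently.

```python
import math


def si_snr(estimate, reference):
    """Scale-invariant SNR in dB of `estimate` with respect to `reference`.

    Both arguments are equal-length sequences of samples. This sketch raises
    an error if the estimate is a perfect (scaled) copy of the reference or
    is orthogonal to it, since the ratio is then undefined.
    """
    if len(estimate) != len(reference):
        raise ValueError("signals must have the same length")
    n = len(reference)
    # Remove the mean so a constant offset does not affect the score.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]
    # Project the estimate onto the reference: this is the "target" part.
    scale = sum(e * r for e, r in zip(est, ref)) / sum(r * r for r in ref)
    target = [scale * r for r in ref]
    # Whatever is left after removing the target part counts as noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(x * x for x in noise)
    return 10.0 * math.log10(target_energy / noise_energy)


def si_snr_improvement(estimate, mixture, reference):
    """SI-SNR of the separated output minus SI-SNR of the raw mixture."""
    return si_snr(estimate, reference) - si_snr(mixture, reference)


# Toy example (made-up signals): a target plus an equally loud interferer.
reference = [1.0, -1.0, 1.0, -1.0]
interferer = [1.0, 1.0, -1.0, -1.0]
mixture = [r + i for r, i in zip(reference, interferer)]
# A hypothetical separator output that keeps 10% of the interferer.
estimate = [r + 0.1 * i for r, i in zip(reference, interferer)]

print(round(si_snr_improvement(estimate, mixture, reference), 2))
```

In this toy case the mixture scores about 0 dB and the estimate about 20 dB, so the improvement is about 20 dB. Rescaling the estimate by any nonzero constant leaves the score unchanged, which is the property that makes the measure scale invariant.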