Author manuscript; available in PMC: 2020 Jun 1.
Published in final edited form as: J Neural Eng. 2019 Mar 4;16(3):036019. doi: 10.1088/1741-2552/ab0c59

Figure 4.

Overview of the WaveNet vocoder architecture. The network comprises a stack of 30 residual blocks that learn a mapping from the acoustic speech signal x to itself, conditioned on the extracted features c. Each block has a separate output, and these outputs are summed to compute the final prediction. We use a 10-component mixture of logistic distributions (MoL) to predict the audio samples.
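
The data flow described in the caption can be illustrated with a small NumPy sketch. It is not the authors' implementation: only the 30 blocks, the summed per-block outputs, and the 10-component MoL come from the caption. The channel width, feature dimension, gated activation, random weights, and the names `init_params`, `forward`, and `sample_mol` are assumptions made for illustration, and the dilated causal convolutions of a real WaveNet are replaced here by per-time-step linear maps to keep the example short.

```python
import numpy as np

N_BLOCKS = 30   # number of residual blocks (from the caption)
N_MIX = 10      # number of logistic mixture components (from the caption)
CHANNELS = 16   # hypothetical hidden width
COND_DIM = 8    # hypothetical dimension of the conditioning features c


def init_params(rng):
    """Random toy weights; a trained vocoder would learn these."""
    blocks = []
    for _ in range(N_BLOCKS):
        blocks.append({
            "w_x": rng.normal(0, 0.1, (CHANNELS, 2 * CHANNELS)),
            "w_c": rng.normal(0, 0.1, (COND_DIM, 2 * CHANNELS)),
            "w_res": rng.normal(0, 0.1, (CHANNELS, CHANNELS)),
            "w_skip": rng.normal(0, 0.1, (CHANNELS, CHANNELS)),
        })
    w_out = rng.normal(0, 0.1, (CHANNELS, 3 * N_MIX))
    return blocks, w_out


def forward(h, c, blocks, w_out):
    """h: (T, CHANNELS) hidden representation of x; c: (T, COND_DIM) features."""
    skip_sum = np.zeros_like(h)
    for p in blocks:
        # Assumed gated activation, conditioned on c.
        z = h @ p["w_x"] + c @ p["w_c"]
        filt, gate = np.split(z, 2, axis=-1)
        g = np.tanh(filt) * (1.0 / (1.0 + np.exp(-gate)))
        skip_sum += g @ p["w_skip"]   # each block's separate output is summed
        h = h + g @ p["w_res"]        # residual connection to the next block
    out = np.maximum(skip_sum, 0.0) @ w_out
    logit_pi, mu, log_s = np.split(out, 3, axis=-1)
    return logit_pi, mu, log_s


def sample_mol(logit_pi, mu, log_s, rng):
    """Draw one audio sample per time step from the logistic mixture."""
    n_steps = logit_pi.shape[0]
    pi = np.exp(logit_pi - logit_pi.max(axis=-1, keepdims=True))
    pi /= pi.sum(axis=-1, keepdims=True)
    k = np.array([rng.choice(N_MIX, p=pi[t]) for t in range(n_steps)])
    idx = np.arange(n_steps)
    u = rng.uniform(1e-5, 1.0 - 1e-5, size=n_steps)
    # Inverse CDF of the logistic distribution: mu + s * log(u / (1 - u)).
    x = mu[idx, k] + np.exp(log_s[idx, k]) * (np.log(u) - np.log1p(-u))
    return np.clip(x, -1.0, 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    blocks, w_out = init_params(rng)
    h = rng.normal(size=(5, CHANNELS))
    c = rng.normal(size=(5, COND_DIM))
    samples = sample_mol(*forward(h, c, blocks, w_out), rng)
    print(samples.shape)
```

In this sketch the summed block outputs are projected to three values per mixture component (a mixture logit, a location, and a log-scale), which is one common way to parameterize a MoL output; the caption does not specify the parameterization used.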