Abstract
The term "speech synthesis" has been used for diverse technical approaches. In this paper, some of the approaches used to generate synthetic speech in a text-to-speech system are reviewed, and some of the basic motivations for choosing one method over another are discussed. It is important to keep in mind, however, that speech synthesis models are needed not just for speech generation but to help us understand how speech is created, or even how articulation can explain language structure. General issues such as the synthesis of different voices, accents, and multiple languages are discussed as special challenges facing the speech synthesis community.
Full text
PDFSelected References
These references are in PubMed. This may not be the complete list of references from this article.
- Klatt D. H., Klatt L. C. Analysis, synthesis, and perception of voice quality variations among female and male talkers. J Acoust Soc Am. 1990 Feb;87(2):820–857. doi: 10.1121/1.398894. [DOI] [PubMed] [Google Scholar]
- Klatt D. H. Review of text-to-speech conversion for English. J Acoust Soc Am. 1987 Sep;82(3):737–793. doi: 10.1121/1.395275. [DOI] [PubMed] [Google Scholar]
- Mermelstein P. Articulatory model for the study of speech production. J Acoust Soc Am. 1973 Apr;53(4):1070–1082. doi: 10.1121/1.1913427. [DOI] [PubMed] [Google Scholar]