Text-To-Speech: Basics and Advanced Topics

Enrico Zovato
Loquendo

DATE: 06 May 2010 at 3.00 p.m.
LOCATION: Room 7 – Faculty of Science – Povo

ABSTRACT:
Speech synthesis systems have undergone many technological and paradigm changes. At the beginning, a pure generation approach was adopted, in which the waveform is produced by means of parametric models and contextual rules. With the increase in computing and storage capabilities, a new approach was tried: storing a considerable amount of speech data from a single speaker and then recombining segments of speech according to certain selection strategies. In the last decade, this technology has improved further, and it now provides highly intelligible synthetic speech while maintaining adequate acoustic quality and naturalness.
This lecture will cover these topics: (i) an introduction to speech synthesis with some historical notes, (ii) concatenative text-to-speech technology (text processing, unit selection mechanisms, signal processing techniques applied to the selected units, design and production of speech databases), (iii) the challenge for next-generation systems in terms of flexibility, (iv) the Loquendo TTS system together with its development tools, (v) guidelines on how to design and tune speech synthesis prompts by means of user controls, and finally (vi) an overview of the Speech Synthesis Markup Language (SSML 1.0).

CONTACT: Giuseppe Riccardi
giuseppe[DOT]riccardi[AT]unitn[DOT]it
