An Overview of Speech Synthesis Technology
Abstract
Speech is the most natural and convenient means of communication, and speech synthesis is an important application in human-machine interaction systems. This paper gives a comprehensive overview of text-to-speech (TTS) synthesis technology. Speech synthesis has two basic parts: natural language processing (NLP) and digital signal processing (DSP). In the NLP part, some important steps are pre-processing, morphological analysis, contextual analysis, syntactic-prosodic analysis, phonetization, and prosody generation. In the DSP part, synthesis methods fall into two types: rule-driven methods and data-driven methods. Some important DSP synthesis approaches are introduced, namely articulatory synthesis, formant synthesis, concatenative synthesis, unit selection synthesis, HMM synthesis, and DNN synthesis. Finally, these approaches are compared briefly, and the technical trends of TTS and some likely hot spots for its future applications are discussed.
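The two-part structure described in the abstract can be illustrated with a minimal sketch, which is not the paper's implementation. It assumes a hypothetical toy lexicon (`LEXICON`), covers only three of the NLP steps (pre-processing, phonetization, prosody generation), and replaces a real DSP back end with plain sine tones. All function names, the sample rate, and the pitch values are assumptions made for illustration.

```python
import math

# Hypothetical toy lexicon: word -> phoneme symbols (an assumption, not from the paper).
LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}

SAMPLE_RATE = 8000  # samples per second, chosen arbitrarily for this sketch


def preprocess(text):
    """Pre-processing: lowercase the text and keep only alphabetic tokens."""
    cleaned = "".join(ch if ch.isalpha() else " " for ch in text.lower())
    return cleaned.split()


def phonetize(words):
    """Phonetization: look each word up in the toy lexicon; unknown words are skipped."""
    phonemes = []
    for word in words:
        phonemes.extend(LEXICON.get(word, []))
    return phonemes


def generate_prosody(phonemes, duration_samples=400, base_f0=120.0):
    """Prosody generation: give each phoneme a fixed duration and a falling pitch."""
    steps = max(len(phonemes) - 1, 1)
    return [
        (phoneme, duration_samples, base_f0 - 20.0 * i / steps)
        for i, phoneme in enumerate(phonemes)
    ]


def synthesize(targets):
    """DSP stand-in: render each phoneme as a sine tone at its target pitch."""
    samples = []
    for _phoneme, duration_samples, f0 in targets:
        samples.extend(
            math.sin(2 * math.pi * f0 * t / SAMPLE_RATE)
            for t in range(duration_samples)
        )
    return samples


def text_to_speech(text):
    """Run the NLP front end, then hand its output to the DSP back end."""
    return synthesize(generate_prosody(phonetize(preprocess(text))))
```

Calling `text_to_speech("Hello, World!")` returns a list of float samples in the range -1 to 1. A real system would replace `synthesize` with one of the surveyed approaches, such as formant, concatenative, unit selection, HMM, or DNN synthesis, while keeping the same front-end and back-end split.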