Prior information for rapid speaker adaptation
Citations over time: top 10% of 2010 papers
Abstract
Rapidly adapting a speech recognition system to new speakers using a small amount of adaptation data is important for improving the initial user experience. In this paper, a count-smoothing framework for incorporating prior information is extended to allow the use of different forms of dynamic prior and to improve the robustness of transform estimation on small amounts of data. Prior information is obtained from existing rapid adaptation techniques such as VTLN and PCMLLR. Results using VTLN as a dynamic prior for CMLLR estimation show that transforms estimated on just one utterance can yield relative gains of 15% and 46% over a baseline gender-independent model on two tasks.
Index Terms: automatic speech recognition, speaker adaptation, VTLN, prior knowledge