Improving DNN speaker independence with I-vector inputs
2014, pp. 225–229
Abstract
We propose providing additional utterance-level features as inputs to a deep neural network (DNN) to facilitate speaker, channel, and background normalization. We develop modifications of the basic algorithm that yield significant reductions in word error rate (WER). The algorithms are shown to combine well with speaker adaptation by backpropagation, resulting in a 9% relative WER reduction. We also address implementation of the algorithm for a streaming task.
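The core idea of feeding utterance-level features to the network alongside the acoustic frames can be illustrated with a minimal sketch. It assumes the simplest possible scheme, in which one fixed utterance-level vector (for example an i-vector, as in the title) is concatenated to every frame of that utterance. The abstract does not spell out the exact input layout, so the function name `append_utterance_features` and the toy values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def append_utterance_features(
    frames: Sequence[Sequence[float]],
    utt_vector: Sequence[float],
) -> List[List[float]]:
    """Concatenate one utterance-level vector onto every frame.

    Assumption (not stated in the abstract): the same vector is repeated
    for all frames of the utterance, so the DNN input dimension grows by
    len(utt_vector).
    """
    utt = list(utt_vector)
    return [list(frame) + utt for frame in frames]


# Toy example: two 3-dimensional acoustic frames, one 2-dimensional
# utterance-level vector (values are made up for illustration).
frames = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
ivector = [1.0, -1.0]

augmented = append_utterance_features(frames, ivector)
print(len(augmented), len(augmented[0]))
```

Each augmented frame now carries the per-frame acoustic features followed by the shared utterance-level vector, so the network sees the same speaker and channel summary at every time step. In a streaming setting the utterance-level vector would have to be estimated from the audio seen so far, which this sketch does not attempt.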