A frequency warping approach to speaker normalization
Citations Over Time: Top 10% of 1998 papers
Abstract
In an effort to reduce the degradation in speech recognition performance caused by variation in vocal tract shape among speakers, a frequency warping approach to speaker normalization is investigated. A set of low-complexity, maximum-likelihood-based frequency warping procedures has been applied to speaker normalization for a telephone-based connected digit recognition task. This paper presents an efficient means of estimating a linear frequency warping factor and a simple mechanism for implementing frequency warping by modifying the filterbank in mel-frequency cepstrum feature analysis. An experimental study comparing these techniques to other well-known techniques for reducing variability is described. The results show that frequency warping consistently reduces word error rate by 20%, even for very short utterances.
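The two ideas named in the abstract, warping by modifying the mel filterbank and choosing a linear warping factor by maximum likelihood, can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the clipping of warped band edges at the Nyquist frequency, the mel formula constants, and the 0.88 to 1.12 search grid are all assumptions made for illustration. The scoring function is hypothetical and would in practice be supplied by the recognizer, for example as the log-likelihood of an utterance's features under its acoustic models.

```python
import numpy as np

def hz_to_mel(f_hz):
    # Common mel-scale formula (an assumption; the abstract does not specify one).
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def warped_filter_edges(n_filters, sample_rate, alpha):
    # Edges equally spaced on the mel scale, then scaled linearly by alpha.
    # Clipping at Nyquist is a simplification assumed here.
    nyquist = sample_rate / 2.0
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(nyquist), n_filters + 2)
    return np.minimum(alpha * mel_to_hz(mel_points), nyquist)

def warped_mel_filterbank(n_filters, n_fft, sample_rate, alpha=1.0):
    # Triangular filters over FFT bins; the warp lives in the filter edges,
    # so the speech spectrum itself is never resampled.
    edges = warped_filter_edges(n_filters, sample_rate, alpha)
    bin_freqs = np.arange(n_fft // 2 + 1) * sample_rate / n_fft
    bank = np.zeros((n_filters, bin_freqs.size))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        if mid <= lo or hi <= mid:
            continue  # filter collapsed by clipping; left as all zeros
        rising = (bin_freqs - lo) / (mid - lo)
        falling = (hi - bin_freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, 1.0)
    return bank

def pick_warp_factor(score_fn, candidates):
    # Grid search: keep the candidate with the highest score.
    # score_fn is a hypothetical callable returning a log-likelihood.
    scores = [score_fn(a) for a in candidates]
    return float(candidates[int(np.argmax(scores))])

# Assumed search grid of 13 factors between 0.88 and 1.12.
candidate_factors = np.linspace(0.88, 1.12, 13)
```

Placing the warp in the filterbank keeps the cost of trying each candidate factor low, since only the filter weights change between candidates while the FFT of the utterance is computed once.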
Related Papers
- → A frequency warping approach for vocal tract length normalization (2005), 2 cited
- → Combining Evidences from Mel Cepstral and Cochlear Cepstral Features for Speaker Recognition Using Whispered Speech (2015), 1 cited
- → Improved scale-cepstral analysis in speech (2002), 5 cited
- → Improvements in scale-transform-based features for speech analysis (1997)
- → Speaker normalization using dynamic frequency warping (2008)