Non-stationary feature extraction for automatic speech recognition
2011, Vol. 88, pp. 5204–5207
Abstract
Current speech recognition systems mainly use features based on the Short-Time Fourier Transform, such as MFCC. Dropping the short-time stationarity assumption for voiced speech, this paper introduces non-stationary signal analysis into the ASR framework. We present new acoustic features extracted by a pitch-adaptive Gammatone filter bank. Noise robustness was evaluated on the AURORA 2 and AURORA 4 tasks, where the proposed features outperform standard MFCC. Furthermore, successful system combination experiments using ROVER indicate that the new features differ from MFCC.
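For orientation, a Gammatone filter bank can be sketched as follows. This is a minimal illustration of a conventional Gammatone bank with fixed centre frequencies, not the pitch-adaptive variant proposed in the paper, whose details are not given in the abstract. The function names, the 8 kHz sample rate, the 32 ms impulse-response length, and the filter order of 4 are assumptions chosen for the example; the bandwidth uses the common Glasberg and Moore ERB formula.

```python
import math

def erb_hz(centre_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg & Moore formula)."""
    return 24.7 * (4.37 * centre_hz / 1000.0 + 1.0)

def gammatone_ir(centre_hz, sample_rate=8000, duration_s=0.032, order=4):
    """Peak-normalised impulse response t^(n-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t)."""
    b = 1.019 * erb_hz(centre_hz)  # bandwidth parameter (assumed standard choice)
    ir = []
    for k in range(int(duration_s * sample_rate)):
        t = k / sample_rate
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        ir.append(envelope * math.cos(2.0 * math.pi * centre_hz * t))
    peak = max(abs(v) for v in ir)
    return [v / peak for v in ir]

def channel_log_energy(signal, ir):
    """Log energy of the signal after causal convolution with one filter."""
    energy = 0.0
    for i in range(len(signal)):
        acc = 0.0
        for j in range(min(i + 1, len(ir))):
            acc += ir[j] * signal[i - j]
        energy += acc * acc
    return math.log(energy + 1e-10)  # small floor avoids log(0)

def filterbank_log_energies(signal, centre_freqs_hz, sample_rate=8000):
    """One log-energy value per Gammatone channel (fixed centre frequencies)."""
    return [channel_log_energy(signal, gammatone_ir(fc, sample_rate))
            for fc in centre_freqs_hz]
```

As a usage sketch, passing a 1 kHz sine through `filterbank_log_energies(tone, [250, 500, 1000, 2000])` should give the largest value in the 1 kHz channel. A pitch-adaptive scheme would presumably adjust the analysis to the estimated fundamental frequency of each voiced segment, but that step is not shown here.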