Non-stationary feature extraction for automatic speech recognition

2011, Vol. 88, pp. 5204–5207

Zoltán Tüske, Pavel Golik, Ralf Schlüter, Friedel Drepper

Abstract

Current speech recognition systems mainly use features based on the Short-Time Fourier Transform, such as MFCC. This paper drops the short-time stationarity assumption for voiced speech and introduces non-stationary signal analysis into the ASR framework. We present new acoustic features extracted by a pitch-adaptive Gammatone filter bank. Their noise robustness was demonstrated on the AURORA 2 and AURORA 4 tasks, where the proposed features outperform standard MFCC. Furthermore, successful combination experiments via ROVER indicate that the new features and MFCC capture different information.
