Robust, real-time endpoint detector with energy normalization for ASR in adverse environments
Abstract
When automatic speech recognition (ASR) is applied to hands-free or other adverse acoustic environments, endpoint detection and energy normalization can be crucial to the entire system. In low signal-to-noise ratio (SNR) situations, conventional approaches to endpointing and energy normalization often fail, and ASR performance usually degrades dramatically. The goal of this paper is to find a fast, accurate, and robust endpointing algorithm for real-time ASR. We propose a novel approach that uses a special filter plus a 3-state decision logic for endpoint detection. The filter has been designed under several criteria to ensure the accuracy and robustness of detection. The detected endpoints are simultaneously applied to energy normalization. Evaluation results show that the proposed algorithm significantly reduces the string error rates on 7 out of 12 tested databases. The reduction rate even exceeded 50% on two of them. The algorithm uses only one-dimensional energy with a 24-frame lookahead; therefore, it has low complexity and is suitable for real-time ASR.
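To make the idea of a filter followed by a 3-state decision logic more concrete, the following is a minimal sketch rather than the paper's actual algorithm. The abstract does not specify the filter, so a simple antisymmetric ramp filter is swapped in as a stand-in edge detector over per-frame energy. The state names, the thresholds `t_upper` and `t_lower`, the `hang` duration, the `half_width` parameter, and the peak-based normalization are all assumptions chosen for illustration.

```python
# Hypothetical state names for the 3-state decision logic.
SILENCE, IN_SPEECH, LEAVING = 0, 1, 2


def edge_filter(energy, t, half_width):
    """Stand-in antisymmetric ramp filter (an assumption, not the paper's filter).

    Positive where energy rises around frame t, negative where it falls.
    It looks `half_width` frames ahead, which is the lookahead cost.
    """
    n = len(energy)
    total = 0.0
    for k in range(1, half_width + 1):
        ahead = energy[min(t + k, n - 1)]   # clamp at the last frame
        behind = energy[max(t - k, 0)]      # clamp at the first frame
        total += k * (ahead - behind)
    return total


def detect_endpoints(energy, half_width=3, t_upper=40.0, t_lower=-40.0, hang=10):
    """Return a list of (start_frame, end_frame) speech segments."""
    state = SILENCE
    segments = []
    start = None
    leave_at = None
    for t in range(len(energy)):
        f = edge_filter(energy, t, half_width)
        if state == SILENCE:
            if f >= t_upper:                # rising edge: speech begins
                state, start = IN_SPEECH, t
        elif state == IN_SPEECH:
            if f <= t_lower:                # falling edge: maybe an ending
                state, leave_at = LEAVING, t
        else:  # LEAVING
            if f >= t_upper:                # energy rose again: still speech
                state = IN_SPEECH
            elif t - leave_at >= hang:      # stayed quiet long enough
                segments.append((start, leave_at))
                state = SILENCE
    if state == IN_SPEECH:                  # input ended while in speech
        segments.append((start, len(energy) - 1))
    elif state == LEAVING:
        segments.append((start, leave_at))
    return segments


def normalize_energy(energy, segments):
    """Assumed normalization: shift so the peak speech-frame energy is 0."""
    speech = [energy[i] for s, e in segments for i in range(s, e + 1)]
    if not speech:
        return list(energy)
    peak = max(speech)
    return [x - peak for x in energy]


# Toy input: 20 quiet frames, 30 loud frames, 30 quiet frames.
toy_energy = [0.0] * 20 + [10.0] * 30 + [0.0] * 30
toy_segments = detect_endpoints(toy_energy)
toy_normalized = normalize_energy(toy_energy, toy_segments)
```

The intermediate `LEAVING` state is what keeps a short dip in energy inside an utterance from being reported as an endpoint: the segment is closed only if no new rising edge arrives within `hang` frames. Because the detector consumes a single energy value per frame and a fixed number of future frames, it stays cheap enough for streaming use, which matches the low-complexity motivation stated in the abstract.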