Convolutional Recurrent Neural Networks for Small-Footprint Keyword Spotting
Top 10% of 2017 papers
Abstract
Keyword spotting (KWS) constitutes a major component of human-technology interfaces. Maximizing the detection accuracy at a low false alarm (FA) rate, while minimizing the footprint size, latency and complexity, are the goals for KWS. Towards achieving them, we study Convolutional Recurrent Neural Networks (CRNNs). Inspired by large-scale state-of-the-art speech recognition systems, we combine the strengths of convolutional layers and recurrent layers to exploit local structure and long-range context. We analyze the effect of architecture parameters, and propose training strategies to improve performance. With only ~230k parameters, our CRNN model yields acceptably low latency, and achieves 97.71% accuracy at 0.5 FA/hour for 5 dB signal-to-noise ratio.
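To make the footprint figure more concrete, the sketch below counts the trainable parameters of a generic CRNN: one 2-D convolution over a time-frequency input, a stack of bidirectional GRU layers, and two dense layers. It is an illustration only. The function names, the layer layout, and every hyperparameter value in the usage lines are hypothetical and are not taken from the abstract, which states only the total of roughly 230k parameters. The GRU count assumes one bias vector per gate; some libraries store two, which slightly increases the total.

```python
def conv2d_params(in_channels, out_channels, kernel_time, kernel_freq):
    """Kernel weights plus one bias per output channel."""
    weights = in_channels * out_channels * kernel_time * kernel_freq
    return weights + out_channels


def gru_params(input_size, hidden_size, bidirectional=True):
    """Three gate blocks (update, reset, candidate), each with input weights,
    recurrent weights, and a single bias vector (an assumption; some
    libraries keep separate input and recurrent biases)."""
    per_gate = input_size * hidden_size + hidden_size * hidden_size + hidden_size
    per_direction = 3 * per_gate
    return per_direction * (2 if bidirectional else 1)


def dense_params(in_features, out_features):
    """Weight matrix plus bias vector."""
    return in_features * out_features + out_features


def crnn_params(n_freq, n_filters, kernel_time, kernel_freq, stride_freq,
                n_rnn_layers, hidden_size, dense_units, n_classes):
    """Total parameter count for a hypothetical CRNN layout.

    Assumes a single-channel input, an unpadded convolution, bidirectional
    GRUs, and dense layers applied to one time step of recurrent output.
    """
    total = conv2d_params(1, n_filters, kernel_time, kernel_freq)

    # Frequency bins left after the unpadded, strided convolution.
    freq_out = (n_freq - kernel_freq) // stride_freq + 1
    rnn_input = freq_out * n_filters  # conv channels flattened per time step

    for _ in range(n_rnn_layers):
        total += gru_params(rnn_input, hidden_size, bidirectional=True)
        rnn_input = 2 * hidden_size  # forward and backward states concatenated

    total += dense_params(rnn_input, dense_units)
    total += dense_params(dense_units, n_classes)
    return total


# Hypothetical configuration, chosen only to exercise the functions.
example_total = crnn_params(
    n_freq=40, n_filters=32, kernel_time=20, kernel_freq=5, stride_freq=2,
    n_rnn_layers=2, hidden_size=32, dense_units=64, n_classes=2,
)
print(f"example CRNN parameters: {example_total:,}")
```

In a layout like this, the first recurrent layer tends to dominate the count because its input width is the number of filters times the remaining frequency bins, so the convolution's stride and filter count are natural knobs for trading accuracy against footprint.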