Lattice rescoring strategies for long short term memory language models in speech recognition
Abstract
Recurrent neural network (RNN) language models (LMs) and Long Short-Term Memory (LSTM) LMs, a variant of RNN LMs, have been shown to outperform traditional N-gram LMs on speech recognition tasks. However, these models are computationally more expensive than N-gram LMs for decoding and are thus challenging to integrate into speech recognizers. Recent research has proposed lattice-rescoring algorithms using RNN LMs and LSTM LMs as an efficient strategy for integrating these models into a speech recognition system. In this paper, we evaluate existing lattice-rescoring algorithms along with new variants on a YouTube speech recognition task. Lattice rescoring using LSTM LMs reduces the word error rate (WER) for this task by 8% relative to the WER obtained using an N-gram LM.
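As a rough illustration of the strategy the abstract describes, the sketch below rescores an n-best list extracted from a lattice by interpolating an LSTM LM score with the original N-gram LM score. Everything here is an assumption for illustration: `lstm_lm_logprob` is a stub standing in for a trained LSTM LM, the hypothesis tuples and the interpolation weight `lm_weight` are invented, and the paper's actual algorithms rescore the lattice itself rather than a flat n-best list.

```python
import math

def lstm_lm_logprob(history, word):
    # Stub for a trained LSTM LM: returns log P(word | history).
    # Here it is a placeholder uniform distribution over a notional
    # 10k-word vocabulary, purely so the sketch runs end to end.
    return -math.log(10000.0)

def rescore_nbest(hypotheses, lm_weight=0.5):
    """Rescore an n-best list extracted from a lattice.

    hypotheses: list of (words, acoustic_logprob, ngram_lm_logprob).
    The LSTM LM log-score is log-linearly interpolated with the
    N-gram LM log-score, one common way of combining the two models.
    """
    best, best_score = None, float("-inf")
    for words, am_score, ngram_score in hypotheses:
        # Sum per-word LSTM LM log-probabilities over the hypothesis.
        lstm_score = sum(
            lstm_lm_logprob(words[:i], w) for i, w in enumerate(words)
        )
        lm_score = lm_weight * lstm_score + (1.0 - lm_weight) * ngram_score
        total = am_score + lm_score
        if total > best_score:
            best, best_score = words, total
    return best, best_score

# Toy usage: two competing hypotheses with made-up scores.
nbest = [
    (["the", "cat", "sat"], -120.0, -9.1),
    (["the", "cat", "sad"], -119.5, -12.3),
]
print(rescore_nbest(nbest))
```

With the uniform stub, the N-gram score alone breaks the tie; with a real LSTM LM, the interpolated score would reflect both models, which is the source of the WER reductions the paper reports.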
Related Papers
- Speaker-aware training of LSTM-RNNs for acoustic modelling (2016), 45 citations
- Character-level incremental speech recognition with recurrent neural networks (2016), 62 citations
- Comparison and Analysis of Several Phonetic Decoding Approaches (2013), 1 citation
- Exploring Architectures, Data and Units For Streaming End-to-End Speech Recognition with RNN-Transducer (2018), 10 citations
- Speech recognition for DARPA Communicator (2002), 9 citations