Towards Better Decoding and Language Model Integration in Sequence to Sequence Models
Abstract
The recently proposed Sequence-to-Sequence (seq2seq) framework advocates replacing complex data processing pipelines, such as an entire automatic speech recognition system, with a single neural network trained in an end-to-end fashion. In this contribution, we analyse an attention-based seq2seq speech recognition system that directly transcribes recordings into characters. We observe two shortcomings: overconfidence in its predictions and a tendency to produce incomplete transcriptions when language models are used. We propose practical solutions to both problems, achieving competitive speaker-independent word error rates on the Wall Street Journal dataset: without a separate language model we reach 10.6% WER, while together with a trigram language model we reach 6.7% WER.
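Decoding-time integration of an external language model with a seq2seq model is commonly done by log-linearly combining the two models' scores during beam search (often called shallow fusion). A minimal sketch, assuming that formulation; the weight `lam` and the helper names are illustrative, not taken from the paper:

```python
import math

def fused_scores(seq2seq_logp, lm_logp, lam=0.5):
    """Fuse per-candidate log-probabilities:
    score(c) = log p_seq2seq(c | y, x) + lam * log p_lm(c | y).
    Both arguments map candidate characters to log-probabilities."""
    return {c: seq2seq_logp[c] + lam * lm_logp[c] for c in seq2seq_logp}

# Toy example with two candidate characters: the acoustic model prefers
# "a", but the language model strongly prefers "b".
s2s = {"a": math.log(0.7), "b": math.log(0.3)}
lm = {"a": math.log(0.1), "b": math.log(0.9)}

scores = fused_scores(s2s, lm, lam=1.0)
best = max(scores, key=scores.get)
# With lam=1.0 the LM term dominates and "b" wins (0.3 * 0.9 > 0.7 * 0.1).
```

With `lam=0`, decoding falls back to the seq2seq model alone; tuning `lam` trades off acoustic evidence against LM prior.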