Punctuation prediction using a bidirectional recurrent neural network with part-of-speech tagging
Citations over time: top 20% of 2017 papers
Abstract
Most automatic speech recognition (ASR) systems cannot generate punctuation, which makes the transcribed output difficult to read and less suitable for tasks such as dictation. This paper introduces a procedure that automatically inserts punctuation into unpunctuated sentences by using a bidirectional recurrent neural network with an attention mechanism and part-of-speech (POS) tags. On the WikiText Long Term Dependency Language Modelling Dataset, handling 11 different punctuation symbols, the model achieved a punctuation error rate of 31.4% and an F1 score of 78.5%. When trained on consecutive sentences from a smaller dataset, the Europarl v7 corpus, the model still achieved a punctuation error rate of 48.1% and an F1 score of 64.7%. In both cases, the proposed system outperforms previous state-of-the-art systems trained on the same datasets, showing the advantage of using POS tag information and an encoder-decoder network.
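The abstract reports a punctuation error rate and an F1 score without stating how they are computed. The sketch below shows one common slot-based way to compute such metrics. It is an illustration under stated assumptions, not the authors' evaluation code: the `NO_PUNCT` label, the function name `punctuation_scores`, and the definition of the error rate as (substitutions + insertions + deletions) divided by the number of reference punctuation marks are all assumptions made here.

```python
from typing import Dict, Sequence

# Hypothetical label meaning "no punctuation follows this word".
NO_PUNCT = "O"


def punctuation_scores(reference: Sequence[str],
                       predicted: Sequence[str]) -> Dict[str, float]:
    """Score predicted punctuation against a reference, one label per word slot.

    Assumed definitions (not taken from the paper):
      precision  = correct / predicted punctuation marks
      recall     = correct / reference punctuation marks
      error_rate = (substitutions + insertions + deletions) / reference marks
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    correct = substitutions = insertions = deletions = 0
    for ref, hyp in zip(reference, predicted):
        if ref == NO_PUNCT and hyp == NO_PUNCT:
            continue  # nothing expected, nothing predicted
        if ref == hyp:
            correct += 1
        elif ref == NO_PUNCT:
            insertions += 1      # predicted a mark where none belongs
        elif hyp == NO_PUNCT:
            deletions += 1       # missed a reference mark
        else:
            substitutions += 1   # predicted the wrong mark

    n_pred = correct + substitutions + insertions
    n_ref = correct + substitutions + deletions
    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_ref if n_ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    errors = substitutions + insertions + deletions
    error_rate = errors / n_ref if n_ref else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "error_rate": error_rate}


if __name__ == "__main__":
    ref = ["O", ",", "O", ".", "O", "?"]
    hyp = ["O", ",", ",", ".", "O", "."]
    print(punctuation_scores(ref, hyp))
```

Under these definitions the error rate can exceed 100% when a system inserts many spurious marks, which is one reason it is usually reported alongside F1 rather than instead of it.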
Related Papers
- Language Identification in Short Utterances Using Long Short-Term Memory (LSTM) Recurrent Neural Networks (2016), 146 cited
- Bidirectional recurrent neural network language models for automatic speech recognition (2015), 83 cited
- Language Models with RNNs for Rescoring Hypotheses of Russian ASR (2016), 3 cited
- Deep Learning Based Language Modeling for Domain-Specific Speech Recognition (2017), 1 cited
- A STUDY ON THE EFL STUDENTS' ABILITY OF DICTATION INTEGRATED PUNCTUATION MARKS IN WRITING SKILL (2019)