Application of pretrained deep neural networks to large vocabulary speech recognition
Top 1% of 2012 papers
Abstract
The use of Deep Belief Networks (DBN) to pretrain Neural Networks has recently led to a resurgence in the use of Artificial Neural Network Hidden Markov Model (ANN/HMM) hybrid systems for Automatic Speech Recognition (ASR). In this paper we report results of a DBN-pretrained context-dependent ANN/HMM system trained on two datasets that are much larger than any previously reported with DBN-pretrained ANN/HMM systems: 5870 hours of Voice Search and 1400 hours of YouTube data. On the first dataset, the pretrained ANN/HMM system outperforms the best Gaussian Mixture Model Hidden Markov Model (GMM/HMM) baseline, built with a much larger dataset, by 3.7% absolute WER, while on the second dataset it outperforms the GMM/HMM baseline by 4.7% absolute. Maximum Mutual Information (MMI) fine-tuning and model combination using Segmental Conditional Random Fields (SCARF) give additional gains of 0.1% and 0.4% absolute on the first dataset, and 0.5% and 0.9% absolute on the second dataset.
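The DBN pretraining mentioned in the abstract refers to greedy layer-wise training of a stack of Restricted Boltzmann Machines (RBMs), whose weights then initialize the ANN before supervised fine-tuning. The abstract does not spell out the procedure, so the following is a minimal numpy sketch of the standard approach (CD-1 contrastive divergence, binary units); the function names, learning rate, and layer sizes are illustrative, not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, lr=0.1, epochs=5, rng=None):
    """Train one RBM layer with 1-step contrastive divergence (CD-1)."""
    rng = rng or np.random.default_rng(0)
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_v = np.zeros(n_visible)   # visible biases
    b_h = np.zeros(n_hidden)    # hidden biases
    for _ in range(epochs):
        # Positive phase: hidden probabilities given the data
        h_prob = sigmoid(data @ W + b_h)
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        # Negative phase: one-step Gibbs reconstruction
        v_recon = sigmoid(h_sample @ W.T + b_v)
        h_recon = sigmoid(v_recon @ W + b_h)
        # CD-1 updates: data statistics minus reconstruction statistics
        W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / len(data)
        b_v += lr * (data - v_recon).mean(axis=0)
        b_h += lr * (h_prob - h_recon).mean(axis=0)
    return W, b_h

def pretrain_stack(data, layer_sizes):
    """Greedy layer-wise pretraining: each trained RBM's hidden
    probabilities become the training data for the next layer."""
    weights = []
    x = data
    for n_hidden in layer_sizes:
        W, b_h = train_rbm(x, n_hidden)
        weights.append((W, b_h))
        x = sigmoid(x @ W + b_h)  # propagate activations upward
    return weights
```

The returned `(W, b_h)` pairs would initialize the hidden layers of the ANN, which is then fine-tuned with backpropagation on the HMM state targets (and, per the abstract, further refined with MMI sequence training).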