Self-supervised discriminative training of statistical language models
Abstract
A novel self-supervised discriminative training method for estimating language models for automatic speech recognition (ASR) is proposed. Unlike traditional discriminative training methods, which require transcribed speech, only untranscribed speech and a large text corpus are required. An exponential form is assumed for the language model, as in maximum entropy estimation, but the model is trained from the text using a discriminative criterion that targets word confusions actually witnessed in first-pass ASR output lattices. Specifically, model parameters are estimated to maximize the likelihood ratio between each word w in the text corpus and w's cohorts in the test speech, i.e., other words with which w competes in the test lattices. Empirical results demonstrate statistically significant improvements over a 4-gram language model on a large-vocabulary ASR task.
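To make the training criterion concrete, here is a minimal sketch in Python. It assumes one-hot unigram features, a toy text corpus, and hand-invented cohort sets; the paper's model uses richer features and derives cohorts from real first-pass lattices, so names such as `cohorts` and `score` below are illustrative assumptions only.

```python
import math
from collections import defaultdict

# Toy text corpus: the "correct" side of the likelihood ratio.
corpus = "the cat sat on the mat the cat ate".split()

# Hypothetical cohort sets: for each word, competitors it was confused with
# in first-pass lattices over untranscribed test speech. These particular
# confusions are invented for illustration.
cohorts = {
    "cat": ["cap", "can"],
    "sat": ["sad", "set"],
    "mat": ["map", "mad"],
    "ate": ["eight"],
}

theta = defaultdict(float)  # weights of the exponential (max-ent style) model

def score(w):
    # One-hot unigram features, so p(w) is proportional to exp(theta[w]).
    return math.exp(theta[w])

def train(epochs=50, lr=0.1):
    for _ in range(epochs):
        for w in corpus:
            confusion = [w] + cohorts.get(w, [])
            if len(confusion) == 1:
                continue  # no witnessed competitors -> no gradient
            z = sum(score(v) for v in confusion)
            # Gradient ascent on log [ p(w) / sum_{w' in confusion} p(w') ],
            # i.e. a softmax restricted to the witnessed confusion set.
            for v in confusion:
                p = score(v) / z
                theta[v] += lr * ((1.0 if v == w else 0.0) - p)

train()
print({w: round(theta[w], 2) for w in ["cat", "cap", "can"]})
```

The per-token update is the gradient of a softmax restricted to the confusion set, so probability mass is shifted away from lattice competitors and onto the word actually observed in the text, without any transcribed speech.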