Unsupervised language model adaptation
Abstract
This paper investigates unsupervised language model adaptation from ASR transcripts. N-gram counts from these transcripts can be used either to adapt an existing n-gram model or to build an n-gram model from scratch. Various experimental results are reported on a particular domain adaptation task, namely building a customer care application starting from a general voicemail transcription system. The experiments investigate the effectiveness of various adaptation strategies, including iterative adaptation and self-adaptation on the test data. They show an absolute error rate reduction of 3.9% over the unadapted baseline, from 28% to 24.1%, using 17 hours of unsupervised adaptation material; this is 51% of the 7.7% gain obtained by supervised adaptation. Self-adaptation on the test data yielded a 1.3% improvement over the baseline.
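To make the count-based adaptation idea concrete, here is a minimal sketch of one common variant, count merging: n-gram counts estimated from (errorful) ASR transcripts are scaled by a mixing weight and added to the background counts before re-estimating probabilities. The bigram order, the single weight `beta`, and the unsmoothed maximum-likelihood estimate are illustrative assumptions, not details taken from the paper.

```python
# Sketch of unsupervised n-gram adaptation by count merging.
# Assumptions (not from the paper): bigram counts, one mixing weight
# `beta` on the in-domain (ASR-transcript) counts, and unsmoothed ML
# probability estimates.

from collections import Counter
from typing import Iterable

def bigram_counts(sentences: Iterable[list[str]]) -> Counter:
    """Collect bigram counts, padding each sentence with <s> and </s>."""
    counts: Counter = Counter()
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        for w1, w2 in zip(tokens, tokens[1:]):
            counts[(w1, w2)] += 1
    return counts

def merge_counts(background: Counter, adaptation: Counter, beta: float) -> Counter:
    """Count merging: scale the adaptation-data counts by beta and add
    them to the background (out-of-domain) counts."""
    merged = Counter(background)
    for bigram, c in adaptation.items():
        merged[bigram] += beta * c
    return merged

def bigram_prob(counts: Counter, w1: str, w2: str) -> float:
    """ML bigram probability from the merged counts (no smoothing)."""
    total = sum(c for (a, _), c in counts.items() if a == w1)
    return counts[(w1, w2)] / total if total else 0.0

# Toy usage: background counts stand in for the general voicemail model,
# adaptation counts for ASR transcripts of target-domain (customer care) audio.
bg = bigram_counts([["please", "call", "me", "back"]])
ad = bigram_counts([["cancel", "my", "account"], ["check", "my", "account"]])
merged = merge_counts(bg, ad, beta=0.5)
print(bigram_prob(merged, "my", "account"))
```

Iterative adaptation and self-adaptation fit the same skeleton: re-decode the audio with the adapted model, recount the new transcripts, and merge again.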