Improving Statistical Machine Translation with Word Class Models
Top 10% of 2013 papers
Abstract
Automatically clustering words from a monolingual or bilingual training corpus into classes is a widely used technique in statistical natural language processing. We present a very simple and easy-to-implement method for using these word classes to improve translation quality. It can be applied across different machine translation paradigms and with arbitrary types of models. We show its efficacy on a small German→English and a larger French→German translation task with both standard phrase-based and hierarchical phrase-based translation systems for a common set of models. Our results show that word class models improve the baseline by up to 1.4% BLEU and 1.0% TER on the French→German task and by up to 0.3% BLEU and 1.1% TER on the German→English task.
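The abstract does not spell out the mechanics, so the following is only an illustrative sketch of one plausible reading: each word is replaced by its class label, and the same kind of statistics a model would collect over words are then collected over classes. The mapping `WORD_CLASSES`, the label `C_UNK`, and the helpers `to_classes` and `bigram_counts` are hypothetical names invented for this example; a real system would obtain the classes from a clustering tool and would train full translation models rather than bigram counts.

```python
from collections import Counter

# Hypothetical word-to-class mapping (assumption: normally produced by
# an automatic clustering step over the training corpus).
WORD_CLASSES = {
    "the": "C1", "a": "C1",
    "house": "C2", "car": "C2",
    "is": "C3",
    "red": "C4", "small": "C4",
}
UNKNOWN_CLASS = "C_UNK"  # hypothetical label for words without a class


def to_classes(tokens, mapping=WORD_CLASSES):
    """Replace every token by its class label."""
    return [mapping.get(token, UNKNOWN_CLASS) for token in tokens]


def bigram_counts(sentences):
    """Count adjacent pairs; stands in for any count-based model."""
    counts = Counter()
    for sentence in sentences:
        counts.update(zip(sentence, sentence[1:]))
    return counts


corpus = [
    ["the", "house", "is", "red"],
    ["a", "car", "is", "small"],
]

# The same estimator is run twice: once on words, once on classes.
word_counts = bigram_counts(corpus)
class_counts = bigram_counts(to_classes(s) for s in corpus)
```

In this toy corpus every word bigram occurs once, while the class bigrams pool evidence across the two sentences, which is the kind of smoothing effect that can help when training data is sparse.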