Distortion models for statistical machine translation
2006, pp. 529–536
Top 1% of 2006 papers by citations
Abstract
In this paper, we argue that n-gram language models are not sufficient to address the word reordering required for machine translation. We propose a new distortion model that can be used with existing phrase-based SMT decoders to address these n-gram language model limitations. We present empirical results in Arabic-to-English machine translation that show statistically significant improvements when our proposed model is used. We also propose a novel metric to measure word order similarity (or difference) between any pair of languages, based on word alignments.
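The abstract does not specify how the alignment-based word-order metric is computed. One plausible sketch, purely as an illustration and not the paper's actual definition, measures word-order difference as the normalized number of crossing alignment links (the Kendall tau distance over target positions): 0 for a perfectly monotone alignment, 1 for a fully reversed one. The function name and the (source, target) link representation are assumptions.

```python
def word_order_difference(alignment):
    """Fraction of crossing alignment links (pairwise inversions).

    `alignment` is a list of (source_pos, target_pos) links.
    Returns 0.0 for a monotone alignment, 1.0 for a fully
    reversed one. NOTE: illustrative sketch only; the paper's
    actual metric may be defined differently.
    """
    links = sorted(alignment)            # order links by source position
    targets = [t for _, t in links]      # target positions in source order
    n = len(targets)
    if n < 2:
        return 0.0
    # Count inverted pairs: links that cross each other.
    inversions = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if targets[i] > targets[j]
    )
    return inversions / (n * (n - 1) / 2)

# Monotone alignment: no reordering
print(word_order_difference([(0, 0), (1, 1), (2, 2)]))  # → 0.0
# Fully reversed alignment (e.g. heavy long-range reordering)
print(word_order_difference([(0, 2), (1, 1), (2, 0)]))  # → 1.0
```

Averaging this quantity over the word-aligned sentence pairs of a parallel corpus would yield a single score comparing the word order of two languages, which is the kind of language-pair comparison the abstract describes.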
Related Papers
- Improving N-gram language modeling for code-switching speech recognition (2017), 16 citations
- Modeling of term-distance and term-occurrence information for improving n-gram language model performance (2013)
- Selective back-off smoothing for incorporating grammatical constraints into the n-gram language model (2002), 7 citations
- Automatic Post-Editing Method Using Translation Knowledge Based on Intuitive Common Parts Continuum for Statistical Machine Translation (2014)
- Developing a method to build Japanese speech recognition system based on 3-gram language model expansion with Google database (2013)