
Decomposability of translation metrics for improved evaluation and efficient algorithms

2008, pp. 610–610


David Chiang, Steve DeNeefe, Yee Seng Chan, Hwee Tou Ng

Abstract

Bleu is the de facto standard for evaluation and development of statistical machine translation systems. We describe three real-world situations involving comparisons between different versions of the same systems where one can obtain improvements in Bleu scores that are questionable or even absurd. These situations arise because Bleu lacks the property of decomposability, a property which is also computationally convenient for various applications. We propose a very conservative modification to Bleu and a cross between Bleu and word error rate that address these issues while improving correlation with human judgments.
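
The decomposability point can be illustrated with a toy sketch. It is not one of the paper's three situations, and it does not implement the paper's proposed metrics. It assumes an informal reading of decomposability (the corpus score behaves like a sum of per-sentence scores, so making one sentence worse cannot raise the total) and uses a simplified Bleu-like score: clipped unigram precision only, multiplied by a brevity penalty computed over the whole corpus. Real Bleu combines n-gram precisions up to length 4; the function names and the data below are hypothetical.

```python
import math
from collections import Counter


def stats(hyp, ref):
    """Return (clipped unigram matches, hypothesis length, reference length)."""
    ref_counts = Counter(ref)
    matches = sum(min(n, ref_counts[w]) for w, n in Counter(hyp).items())
    return matches, len(hyp), len(ref)


def toy_bleu(pairs):
    """Simplified Bleu-like score (assumption: unigrams only).

    Statistics are pooled over all (hypothesis, reference) pairs before
    the brevity penalty is applied, as in corpus-level Bleu.
    """
    matches = hyp_len = ref_len = 0
    for hyp, ref in pairs:
        m, c, r = stats(hyp, ref)
        matches += m
        hyp_len += c
        ref_len += r
    if hyp_len == 0 or matches == 0:
        return 0.0
    precision = matches / hyp_len
    brevity_penalty = min(1.0, math.exp(1 - ref_len / hyp_len))
    return brevity_penalty * precision


# Hypothetical two-sentence test set.
ref1 = "a b c d e f g h i j".split()
hyp1 = "a b c d e".split()            # too short, but every word matches
ref2 = "k l m n o p q r s t".split()
hyp2 = list(ref2)                     # perfect translation
hyp2_padded = hyp2 + ["x"] * 5        # same output plus five junk words

before = toy_bleu([(hyp1, ref1), (hyp2, ref2)])
after = toy_bleu([(hyp1, ref1), (hyp2_padded, ref2)])

sentence2_before = toy_bleu([(hyp2, ref2)])
sentence2_after = toy_bleu([(hyp2_padded, ref2)])

print(f"sentence 2 alone: {sentence2_before:.3f} -> {sentence2_after:.3f}")
print(f"whole corpus:     {before:.3f} -> {after:.3f}")
```

In this toy setting, padding the second hypothesis with junk lowers its own score while the first sentence is untouched, yet the corpus score rises, because the extra length offsets the corpus-wide brevity penalty incurred by the first sentence.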
