Automatic evaluation of machine translation quality using longest common subsequence and skip-bigram statistics
Abstract
In this paper we describe two new objective automatic evaluation methods for machine translation. The first method is based on the longest common subsequence between a candidate translation and a set of reference translations. The longest common subsequence naturally takes sentence-level structural similarity into account and automatically identifies the longest co-occurring in-sequence n-grams. The second method relaxes strict n-gram matching to skip-bigram matching. A skip-bigram is any pair of words in their sentence order, allowing arbitrary gaps between them. Skip-bigram co-occurrence statistics measure the overlap of skip-bigrams between a candidate translation and a set of reference translations. The empirical results show that both methods correlate very well with human judgments of both adequacy and fluency.
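To make the two statistics concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes whitespace tokenization, a single reference rather than a set, and an F-measure that combines precision and recall with a weight `beta`; the abstract does not specify the scoring formula, so that combination and all function names here are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)  # DP row for the previous token of `a`
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def skip_bigrams(tokens):
    """Multiset of all ordered word pairs (i < j), with arbitrary gaps."""
    return Counter(combinations(tokens, 2))


def f_measure(matches, cand_total, ref_total, beta=1.0):
    """Weighted harmonic mean of precision and recall (assumed scoring)."""
    if matches == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = matches / cand_total
    recall = matches / ref_total
    b2 = beta * beta
    return (1 + b2) * precision * recall / (recall + b2 * precision)


def lcs_score(candidate, reference, beta=1.0):
    """LCS-based score of a candidate against one reference sentence."""
    c, r = candidate.split(), reference.split()
    return f_measure(lcs_length(c, r), len(c), len(r), beta)


def skip_bigram_score(candidate, reference, beta=1.0):
    """Skip-bigram overlap score of a candidate against one reference."""
    c = skip_bigrams(candidate.split())
    r = skip_bigrams(reference.split())
    matches = sum((c & r).values())  # multiset intersection
    return f_measure(matches, sum(c.values()), sum(r.values()), beta)


if __name__ == "__main__":
    reference = "police killed the gunman"
    for candidate in ("police kill the gunman", "the gunman kill police"):
        print(candidate,
              lcs_score(candidate, reference),
              skip_bigram_score(candidate, reference))
```

With the reference "police killed the gunman", the candidate "police kill the gunman" shares the subsequence "police the gunman" and three of six skip-bigrams, while "the gunman kill police" shares only "the gunman" and one skip-bigram, so both scores rank the first candidate higher. Extending the sketch to a set of references, for example by taking the best score over references, would be a further assumption.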