
Quantifying the utility of parallel corpora

2001, pp. 398–399


Martin Franz, Jason S. McCarley, Todd J. Ward, Wei-Jing Zhu

Abstract

Our English-Chinese cross-language IR system is trained from parallel corpora; we investigate its performance as a function of training corpus size for three different training corpora. We find that the performance of the system as trained on the three parallel corpora can be related by a simple measure, namely the out-of-vocabulary rate of query words.
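The abstract does not say how the out-of-vocabulary (OOV) rate is computed, so the following is only a minimal sketch under stated assumptions: the OOV rate is taken to be the fraction of query word tokens that never occur in the training corpus, and both corpus and queries are assumed to be already tokenized (Chinese text would need a word segmenter first). The function names `build_vocabulary`, `oov_rate`, and `oov_by_corpus_size` are hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, List, Sequence, Set


def build_vocabulary(sentences: Iterable[List[str]]) -> Set[str]:
    """Collect every distinct token seen in the training sentences."""
    vocab: Set[str] = set()
    for tokens in sentences:
        vocab.update(tokens)
    return vocab


def oov_rate(queries: Iterable[List[str]], vocab: Set[str]) -> float:
    """Fraction of query tokens absent from the training vocabulary.

    Assumption: the rate is counted over tokens, not over distinct word types.
    Returns 0.0 when there are no query tokens at all.
    """
    total = 0
    missing = 0
    for tokens in queries:
        for tok in tokens:
            total += 1
            if tok not in vocab:
                missing += 1
    return missing / total if total else 0.0


def oov_by_corpus_size(
    corpus: Sequence[List[str]],
    queries: Sequence[List[str]],
    sizes: Iterable[int],
) -> Dict[int, float]:
    """OOV rate of the queries when only the first n training sentences are used."""
    return {n: oov_rate(queries, build_vocabulary(corpus[:n])) for n in sizes}
```

Calling `oov_by_corpus_size` once per training corpus gives one OOV curve per corpus. Retrieval performance measured at the same corpus sizes could then be plotted against OOV rate rather than against raw corpus size, which is the kind of comparison the abstract describes.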
