Bilingually-constrained Phrase Embeddings for Machine Translation
Top 1% of 2014 papers
Abstract
We propose Bilingually-constrained Recursive Auto-encoders (BRAE) to learn semantic phrase embeddings (compact vector representations for phrases) that can distinguish phrases with different semantic meanings. The BRAE is trained to simultaneously minimize the semantic distance between translation equivalents and maximize the semantic distance between non-translation pairs. After training, the model can embed any phrase semantically in both languages and can transform the semantic embedding space of one language into that of the other. We evaluate the proposed method on two end-to-end SMT tasks (phrase table pruning and decoding with phrasal semantic similarities), both of which require measuring the semantic similarity between a source phrase and its translation candidates. Extensive experiments show that the BRAE is remarkably effective in both tasks.
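The training criterion described above (pulling translation equivalents together while pushing non-translation pairs apart) can be sketched as a max-margin objective. The code below is a minimal illustrative sketch of that intuition only, not the paper's actual recursive autoencoder; the function names, the distance choice, and the toy embedding values are all assumptions made for illustration:

```python
import numpy as np

def semantic_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Euclidean distance between two phrase embeddings."""
    return float(np.linalg.norm(a - b))

def bilingual_max_margin_loss(src: np.ndarray,
                              tgt_pos: np.ndarray,
                              tgt_neg: np.ndarray,
                              margin: float = 1.0) -> float:
    """Margin-based loss in the spirit of the BRAE criterion:
    the source phrase should be closer to its translation equivalent
    (tgt_pos) than to a non-translation phrase (tgt_neg) by at least
    `margin`; otherwise the violation is penalized."""
    return max(0.0, margin
                    + semantic_distance(src, tgt_pos)
                    - semantic_distance(src, tgt_neg))

# Toy 4-dimensional embeddings (illustrative values only).
src  = np.array([1.0, 0.0, 0.0, 0.0])
good = np.array([0.9, 0.1, 0.0, 0.0])  # translation equivalent
bad  = np.array([0.0, 0.0, 1.0, 0.0])  # non-translation pair

loss = bilingual_max_margin_loss(src, good, bad)
# The equivalent is much closer than the non-translation, so the
# margin is satisfied and the loss is zero.
```

Minimizing this loss over many (source, translation, non-translation) triples drives the two embedding spaces toward the semantic alignment the abstract describes.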