Dynamically Composing Domain-Data Selection with Clean-Data Selection by “Co-Curricular Learning” for Neural Machine Translation
2019, pp. 1282–1292
Abstract
Noise and domain are important aspects of data quality for neural machine translation. Existing research focuses separately on domain-data selection, clean-data selection, or their static combination, leaving the dynamic interaction between them unexamined. This paper introduces a “co-curricular learning” method that composes dynamic domain-data selection with dynamic clean-data selection, for transfer learning across both capabilities. We apply an EM-style optimization procedure to further refine the “co-curriculum”. Experimental results and analysis on two domains demonstrate the effectiveness of the method and the properties of the data scheduled by the co-curriculum.
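The abstract does not specify how the two selections are composed, so the following is only an illustrative sketch of one way a dynamic domain filter and a dynamic clean-data filter could be chained during training. Everything in it is an assumption made for illustration: the names `keep_fraction` and `cascade_select`, the per-example `domain_score` and `clean_score` functions, the exponentially shrinking retention schedule, and the cascade order are hypothetical and are not taken from the paper.

```python
import math


def keep_fraction(step, half_life):
    """Fraction of candidates retained at `step`.

    Assumed schedule: starts at 1.0 and halves every `half_life` steps,
    so selection tightens as training progresses.
    """
    return 0.5 ** (step / half_life)


def cascade_select(examples, domain_score, clean_score, step, half_life):
    """Keep the top fraction by domain score, then the top fraction of
    the survivors by clean score (hypothetical composition)."""
    frac = keep_fraction(step, half_life)

    def top(items, score):
        # Always keep at least one example so a batch is never empty.
        k = max(1, math.ceil(len(items) * frac))
        return sorted(items, key=score, reverse=True)[:k]

    return top(top(examples, domain_score), clean_score)
```

Early in training the filter passes nearly everything; later it keeps only examples that score well on both criteria, which is the sense in which the two selections interact dynamically rather than being applied once up front.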