Challenges in Data-to-Document Generation
Abstract
Recent neural models have shown significant progress on the problem of generating short descriptive texts conditioned on a small number of database records. In this work, we suggest a slightly more difficult data-to-text generation task, and investigate how effective current approaches are on this task. In particular, we introduce a new, large-scale corpus of data records paired with descriptive documents, propose a series of extractive evaluation methods for analyzing performance, and obtain baseline results using current neural generation methods. Experiments show that these models produce fluent text, but fail to convincingly approximate human-generated documents. Moreover, even templated baselines exceed the performance of these neural models on some metrics, though copy- and reconstruction-based extensions lead to noticeable improvements.