BanditSum: Extractive Summarization as a Contextual Bandit
Abstract
In this work, we propose a novel method for training neural networks to perform single-document extractive summarization without heuristically-generated extractive labels. We call our approach BanditSum as it treats extractive summarization as a contextual bandit (CB) problem, where the model receives a document to summarize (the context) and chooses a sequence of sentences to include in the summary (the action). A policy gradient reinforcement learning algorithm is used to train the model to select sequences of sentences that maximize the ROUGE score. We perform a series of experiments demonstrating that BanditSum achieves ROUGE scores that are better than or comparable to the state-of-the-art for extractive summarization, and converges using significantly fewer update steps than competing approaches. In addition, we show empirically that BanditSum performs significantly better than competing approaches when good summary sentences appear late in the source document.
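The training loop the abstract describes can be sketched as a REINFORCE-style policy-gradient update: a policy scores each sentence of the document, a summary is sampled from those scores, and a scalar reward (ROUGE against the reference, in the paper) scales the gradient of the sampled summary's log-probability. The sketch below is a minimal, hedged illustration of that idea, not the paper's implementation: the linear scoring model, the independent-softmax approximation of sampling-without-replacement, and the absence of a reward baseline are all simplifications introduced here.

```python
import numpy as np

def sample_summary(affinities, k, rng):
    # Sample k distinct sentence indices in proportion to the
    # policy's per-sentence affinities (without replacement).
    probs = affinities / affinities.sum()
    return rng.choice(len(affinities), size=k, replace=False, p=probs)

def reinforce_step(theta, features, k, reward_fn, lr, rng):
    # One policy-gradient update: score sentences, sample a summary
    # (the "action"), observe a reward (e.g. ROUGE vs. the reference),
    # and move the parameters toward actions with high reward.
    scores = features @ theta                     # one score per sentence
    affinities = np.exp(scores - scores.max())    # softmax-style affinities
    probs = affinities / affinities.sum()
    picked = sample_summary(affinities, k, rng)
    reward = reward_fn(picked)
    # Gradient of the log-probability of the sampled sentences under an
    # independent-selection approximation (a simplification of the
    # without-replacement objective used in the paper).
    grad = np.zeros_like(theta)
    for i in picked:
        grad += features[i] - probs @ features
    return theta + lr * reward * grad, reward
```

With a toy reward that prefers the first two "sentences", repeated updates shift probability mass toward them, which is the qualitative behavior the abstract claims for ROUGE-based rewards.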