Enabling Language Models to Fill in the Blanks
Abstract
We present a simple approach for text infilling, the task of predicting missing spans of text at any position in a document. While infilling could enable rich functionality, especially for writing assistance tools, more attention has been devoted to language modeling, a special case of infilling where text is predicted at the end of a document. In this paper, we aim to extend the capabilities of language models (LMs) to the more general task of infilling. To this end, we train (or fine-tune) off-the-shelf LMs on sequences containing the concatenation of artificially-masked text and the text which was masked. We show that this approach, which we call infilling by language modeling, can enable LMs to infill entire sentences effectively on three different domains: short stories, scientific abstracts, and lyrics. Furthermore, we show that humans have difficulty identifying sentences infilled by our approach as machine-generated in the domain of short stories.
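To make the training-data construction concrete, here is a minimal sketch of how such a concatenated sequence might be built: mask a random span in a document, then append a separator followed by the masked text. The special token strings (`[blank]`, `[sep]`, `[answer]`) and the helper `make_ilm_example` are illustrative assumptions, not necessarily the identifiers used in the paper's released code.

```python
import random

# Illustrative special tokens; the released code may use different strings.
BLANK, SEP, ANSWER = "[blank]", "[sep]", "[answer]"

def make_ilm_example(tokens, max_span_len=3, rng=random):
    """Mask one random token span and build a training sequence of the form:
    <masked document> [sep] <masked text> [answer].
    An off-the-shelf LM can then be fine-tuned on such sequences with
    ordinary left-to-right next-token prediction."""
    tokens = list(tokens)
    length = rng.randint(1, min(max_span_len, len(tokens)))
    start = rng.randrange(0, len(tokens) - length + 1)
    answer = " ".join(tokens[start:start + length])
    masked = " ".join(tokens[:start] + [BLANK] + tokens[start + length:])
    return f"{masked} {SEP} {answer} {ANSWER}"

if __name__ == "__main__":
    doc = "She ate leftover pasta for lunch .".split()
    # Prints something like: She ate [blank] for lunch . [sep] leftover pasta [answer]
    print(make_ilm_example(doc, rng=random.Random(0)))
```

Under this setup, infilling at test time would amount to feeding the LM a masked document followed by the separator and decoding until the answer token, so the same decoder-only architecture serves both tasks.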