EdiT5: Semi-Autoregressive Text Editing with T5 Warm-Start
Abstract
We present EdiT5, a novel semi-autoregressive text-editing approach designed to combine the strengths of non-autoregressive text-editing and autoregressive decoding. EdiT5 is faster at inference time than conventional sequence-to-sequence (seq2seq) models, while remaining capable of modeling flexible input-output transformations. This is achieved by decomposing the generation process into three sub-tasks: (1) tagging to decide on the subset of input tokens to be preserved in the output, (2) re-ordering to define their order in the output text, and (3) insertion to infill the missing tokens that are not present in the input. The tagging and re-ordering steps, which are responsible for generating the largest portion of the output, are non-autoregressive, while the insertion step uses an autoregressive decoder. Depending on the task, EdiT5 requires significantly fewer autoregressive steps, demonstrating speedups of up to 25x when compared to classic seq2seq models. Quality-wise, EdiT5 is initialized with a pre-trained T5 checkpoint, yielding performance comparable to T5 in high-resource settings and clearly outperforming it in low-resource settings when evaluated on three NLG tasks: Sentence Fusion, Grammatical Error Correction, and Decontextualization.
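The three sub-tasks described in the abstract can be illustrated with a minimal sketch. In the paper each step is predicted by a T5-initialized model; here the tags, permutation, and insertions are supplied by hand for a toy sentence-fusion example, and the function names and data format are hypothetical, not the paper's actual interface.

```python
# Toy illustration of the EdiT5 decomposition: (1) tagging keeps a
# subset of source tokens, (2) re-ordering permutes the kept tokens,
# (3) insertion infills tokens absent from the input. The inputs below
# would be model predictions in the real system; here they are fixed.

def edit5_decode(src_tokens, keep_tags, order, insertions):
    """Compose the three sub-tasks into an output token sequence.

    keep_tags[i] -- 1 to keep src_tokens[i], 0 to drop it (tagging)
    order        -- output order of the kept tokens (re-ordering)
    insertions   -- {output position: [tokens]} infilled before that
                    position (the autoregressive insertion step)
    """
    kept = [t for t, k in zip(src_tokens, keep_tags) if k]
    reordered = [kept[i] for i in order]
    out = []
    for pos, tok in enumerate(reordered):
        out.extend(insertions.get(pos, []))  # infill before this token
        out.append(tok)
    out.extend(insertions.get(len(reordered), []))  # trailing infill
    return out

# Sentence fusion: "Turing was born in 1912 . He died in 1954 ."
# fused into "Turing , who was born in 1912 , died in 1954 ."
src = ["Turing", "was", "born", "in", "1912", ".",
       "He", "died", "in", "1954", "."]
keep = [1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1]       # drop ". He"
order = list(range(9))                          # kept order unchanged
ins = {1: [",", "who"], 5: [","]}               # only 3 tokens generated

print(" ".join(edit5_decode(src, keep, order, ins)))
```

Note that only the three inserted tokens require autoregressive decoding; the other nine output tokens come from the non-autoregressive tagging and re-ordering steps, which is the source of the claimed speedup.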