Exploring Transformers in Natural Language Generation: GPT, BERT, and XLNet
Abstract
Recent years have seen a proliferation of attention mechanisms and the rise of Transformers in Natural Language Generation (NLG). Earlier state-of-the-art NLG architectures such as RNNs and LSTMs suffered from vanishing gradients; as sentences grew longer, the path between distant positions grew linearly with their separation, and sequential computation hindered parallelization because sentences were processed word by word. Transformers usher in a new era. In this paper, we explore three major Transformer-based models, namely GPT, BERT, and XLNet, that carry significant implications for the field. NLG is a burgeoning area, now bolstered by rapid developments in attention mechanisms. From poetry generation to summarization, text generation benefits as Transformer-based language models achieve groundbreaking results.
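To make the contrast with word-by-word recurrent processing concrete, the following is a minimal sketch of scaled dot-product self-attention. It is an illustration only and is not taken from the paper: the helper names `softmax` and `self_attention`, the toy input vectors, and the use of identity projections (queries, keys, and values all equal to the input) are simplifying assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention over a sequence of vectors.

    Assumption for brevity: Q = K = V = the input vectors, i.e. no
    learned projection matrices as a real Transformer layer would have.
    """
    dim = len(vectors[0])
    outputs, all_weights = [], []
    # Each query is handled independently of the others, so this loop
    # could run in parallel, unlike the step-by-step RNN recurrence.
    for query in vectors:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # Weighted sum of all value vectors: every position is reached
        # in a single step, regardless of its distance in the sequence.
        output = [
            sum(w * value[j] for w, value in zip(weights, vectors))
            for j in range(dim)
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical three-token sequence with two-dimensional embeddings.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
```

Each row of `weights` is a probability distribution over all positions in the sequence, so the first token attends to the last one directly rather than through a chain of intermediate hidden states.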