Zero-Shot Text-to-Image Generation
arXiv (Cornell University), 2021
Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever
Abstract
Text-to-image generation has traditionally focused on finding better modeling assumptions for training on a fixed dataset. These assumptions might involve complex architectures, auxiliary losses, or side information such as object part labels or segmentation masks supplied during training. We describe a simple approach for this task based on a transformer that autoregressively models the text and image tokens as a single stream of data. With sufficient data and scale, our approach is competitive with previous domain-specific models when evaluated in a zero-shot fashion.
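The "single stream" idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how tokens are produced or indexed, so everything below is an assumption made for illustration: the function names `build_stream` and `next_token_pairs`, the toy token ids, and the choice to shift image token ids past the text vocabulary so the two id ranges do not collide. No transformer is included; the sketch only shows how one sequence could be formed and which next-token prediction targets an autoregressive model would be trained on.

```python
from typing import List, Tuple


def build_stream(text_tokens: List[int], image_tokens: List[int],
                 text_vocab_size: int) -> List[int]:
    """Concatenate text tokens and image tokens into one sequence.

    Assumption: image token ids are offset by text_vocab_size so that
    both kinds of token can share a single id space.
    """
    for t in text_tokens:
        if not 0 <= t < text_vocab_size:
            raise ValueError(f"text token {t} is outside the text vocabulary")
    return list(text_tokens) + [text_vocab_size + i for i in image_tokens]


def next_token_pairs(stream: List[int]) -> List[Tuple[List[int], int]]:
    """Return (prefix, target) pairs: each prefix predicts the next token."""
    return [(stream[:k], stream[k]) for k in range(1, len(stream))]


# Toy example: 3 text tokens followed by 2 image tokens (hypothetical ids).
stream = build_stream([5, 2, 9], [0, 3], text_vocab_size=10)
pairs = next_token_pairs(stream)

for prefix, target in pairs:
    print(prefix, "->", target)
```

Because the text comes first in the sequence, every image token in this sketch is predicted from a prefix that contains the full caption, which is one way to read "models the text and image tokens as a single stream of data".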