SODA: Bottleneck Diffusion Models for Representation Learning
Abstract
We introduce SODA, a self-supervised diffusion model designed for representation learning. The model incorporates an image encoder, which distills a source view into a compact representation that, in turn, guides the generation of related novel views. We show that by imposing a tight bottleneck between the encoder and a denoising decoder, and leveraging novel view synthesis as a self-supervised objective, we can turn diffusion models into strong representation learners, capable of capturing visual semantics in an unsupervised manner. To the best of our knowledge, SODA is the first diffusion model to succeed at ImageNet linear-probe classification, and, at the same time, it accomplishes reconstruction, editing, and synthesis tasks across a wide range of datasets. Further investigation reveals the disentangled nature of its emergent latent space, which serves as an effective interface to control and manipulate the produced images. All in all, we aim to shed light on the exciting and promising potential of diffusion models, not only for image generation, but also for learning rich and robust representations. See our website at soda-diffusion.github.io.
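The setup the abstract describes can be sketched minimally: an encoder compresses a source view into a small bottleneck vector z, and a denoiser predicts the noise added to a related target view while conditioned on z, so the bottleneck is the only path through which source information reaches the decoder. All sizes, the toy linear layers, and the concatenation-based conditioning below are illustrative assumptions, not the paper's actual architecture.

```python
# Toy sketch of a bottlenecked denoising setup (assumed, not SODA's real model).
import numpy as np

rng = np.random.default_rng(0)

D_IMG, D_Z = 64, 8  # flattened image size and bottleneck size (assumed)

# Tiny linear "networks" standing in for the real encoder / denoising decoder.
W_enc = rng.normal(scale=0.1, size=(D_Z, D_IMG))
W_dec = rng.normal(scale=0.1, size=(D_IMG, D_IMG + D_Z))

def encode(source):
    """Distill the source view into a compact representation z."""
    return W_enc @ source

def denoise(noisy_target, z):
    """Predict the noise in the target view, guided only by z."""
    return W_dec @ np.concatenate([noisy_target, z])

# One self-supervised example: source and target are related views.
source = rng.normal(size=D_IMG)
target = source + 0.05 * rng.normal(size=D_IMG)  # stand-in "novel view"

# Forward diffusion: corrupt the target view with Gaussian noise.
noise = rng.normal(size=D_IMG)
alpha = 0.7
noisy_target = np.sqrt(alpha) * target + np.sqrt(1 - alpha) * noise

# Denoising loss: since z is the decoder's only view of the source,
# minimizing this loss pushes z to carry the view's semantic content.
z = encode(source)
loss = float(np.mean((denoise(noisy_target, z) - noise) ** 2))
print(loss)
```

In this caricature, training would update `W_enc` and `W_dec` to reduce the loss; the point is structural: the tight bottleneck (D_Z much smaller than D_IMG) plus the novel-view objective is what turns the generative model into a representation learner.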