Are generative approaches to ZSAR a look in the right direction?
Abstract
Approaches to zero-shot learning have typically learned embeddings in a shared latent space between visual and textual features and classified with nearest-neighbor search over these embeddings. An alternative is to convert the task into a fully supervised problem by generating features for the unseen classes with a generative model. In principle any generative approach can be used: proposed models range from simple GANs to CycleGANs, as well as approaches based on out-of-distribution detection. However, recent research in vision-and-language training has brought significant progress in zero-shot learning, making generative approaches appear rather obsolete. Based on our studies, we see two primary reasons for the decline in their use: generative models are typically unstable to train, and the training process is expensive and time-consuming. A possible direction we are considering is to replace the commonly used I3D backbone with a transformer-based backbone, as we believe this will yield better features for the seen classes and, in turn, significantly improve the training of generative models.
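As an illustration of the embedding-based baseline described above, the following is a minimal sketch of nearest-neighbor zero-shot classification in a shared latent space. The class names, the three-dimensional vectors, and the helper names `cosine` and `zero_shot_classify` are hypothetical and chosen only for illustration; a real system would obtain the embeddings from trained visual and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_classify(video_embedding, class_embeddings):
    """Return the class whose text embedding is nearest to the video embedding."""
    return max(
        class_embeddings,
        key=lambda name: cosine(video_embedding, class_embeddings[name]),
    )


# Toy text embeddings for unseen action classes (assumed values, not real data).
class_embeddings = {
    "archery": [0.9, 0.1, 0.0],
    "juggling": [0.1, 0.9, 0.1],
    "surfing": [0.0, 0.2, 0.9],
}

# Toy visual embedding of a test video, already projected into the shared space.
video_embedding = [0.2, 0.8, 0.2]

predicted = zero_shot_classify(video_embedding, class_embeddings)
print(predicted)
```

A generative approach would instead synthesize visual features for each unseen class and train an ordinary supervised classifier on them, replacing the nearest-neighbor step shown here.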