Learning Models for Object Recognition from Natural Language Descriptions
Citations over time: top 10% of 2009 papers
Abstract
We investigate the task of learning models for visual object recognition from natural language descriptions alone. The approach contributes to the recognition of fine-grained object categories, such as animal and plant species, where it may be difficult to collect many images for training, but where textual descriptions of visual attributes are readily available. As an example, we tackle recognition of butterfly species, learning models from descriptions in an online nature guide. We propose natural language processing methods for extracting salient visual attributes from these descriptions to use as ‘templates’ for the object categories, and apply vision methods to extract corresponding attributes from test images. A generative model is used to connect textual terms in the learnt templates to visual attributes. We report experiments comparing the performance of humans and the proposed method on a dataset of ten butterfly categories.
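The idea of scoring an image against text-derived templates can be illustrated with a toy sketch. It is not the generative model from the paper, whose details the abstract does not give. It assumes that each category template is simply a set of colour terms, that a vision step has already reduced the test image to a set of observed colour terms, and that each term is observed independently with a fixed probability. The category names, the templates, and the parameters `p_hit` and `p_false` are all hypothetical.

```python
import math

# Hypothetical templates: colour terms that a text-extraction step might
# produce for each category. Illustrative only, not data from the paper.
TEMPLATES = {
    "category_a": {"orange", "black", "white"},
    "category_b": {"white", "black"},
    "category_c": {"blue", "brown"},
}

# Shared vocabulary of attribute terms across all templates.
VOCAB = sorted(set().union(*TEMPLATES.values()))


def log_likelihood(observed, template, p_hit=0.8, p_false=0.1):
    """Log-probability of the observed terms under a simple Bernoulli model.

    Assumption: a term in the template is observed in the image with
    probability p_hit; a term absent from the template is observed with
    probability p_false. Terms are treated as independent.
    """
    total = 0.0
    for term in VOCAB:
        p = p_hit if term in template else p_false
        total += math.log(p if term in observed else 1.0 - p)
    return total


def classify(observed):
    """Return the best-scoring category and the per-category scores."""
    scores = {
        name: log_likelihood(observed, template)
        for name, template in TEMPLATES.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Observed terms stand in for the output of an (assumed) vision step.
    label, scores = classify({"orange", "black", "white"})
    print(label)
```

No training images are needed in this sketch: the templates alone define the classifier, which mirrors the motivation stated in the abstract. A realistic system would also have to handle richer attributes (patterns, locations on the wing) and noisy extraction on both the text and the image side.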