Self-Supervised Visual Descriptor Learning for Dense Correspondence
Citations Over TimeTop 1% of 2016 papers
Abstract
Robust estimation of correspondences between image pixels is an important problem in robotics, with applications in tracking, mapping, and recognition of objects, environments, and other agents. Correspondence estimation has long been the domain of hand-engineered features, but more recently deep learning techniques have provided powerful tools for learning features from raw data. The drawback of the latter approach is that a vast amount of (labeled, typically) training data are required for learning. This paper advocates a new approach to learning visual descriptors for dense correspondence estimation in which we harness the power of a strong three-dimensional generative model to automatically label correspondences in RGB-D video data. A fully convolutional network is trained using a contrastive loss to produce viewpoint- and lighting-invariant descriptors. As a proof of concept, we collected two datasets: The first depicts the upper torso and head of the same person in widely varied settings, and the second depicts an office as seen on multiple days with objects rearranged within. Our datasets focus on revisitation of the same objects and environments, and we show that by training the CNN only from local tracking data, our learned visual descriptor generalizes toward identifying nonlabeled correspondences across videos. We furthermore show that our approach to descriptor learning can be used to achieve state-of-the-art single-frame localization results on the MSR 7-scenes dataset without using any labels identifying correspondences between separate videos of the same scenes at training time.
Related Papers
- → Implementing convolutional neural network model for prediction in medical imaging(2022)6 cited
- → Leaf Features Extraction for Plant Classification using CNN(2021)7 cited
- → Deep fake Detection Through Deep Learning(2023)4 cited
- Why & When Deep Learning Works: Looking Inside Deep Learnings.(2017)
- → Convolutional Neural Network (CNN) Applied to the Risk Analysis of Accidents in Vessels Navigating the Amazon Rivers(2023)