Discriminatively-guided Deliberative Perception for Pose Estimation of Multiple 3D Object Instances
Citations over time: top 10% of 2016 papers
Abstract
We introduce a novel paradigm for model-based multi-object recognition and 3 DoF pose estimation from 3D sensor data that integrates exhaustive global reasoning with discriminatively-trained algorithms in a principled fashion. Typical approaches to this task rely either on scene-to-model feature matching or on regression by statistical learners trained on a large database of annotated scenes. These approaches are fast but sensitive to occlusions, to the choice of features, and/or to the training data. Generative approaches, such as methods based on rendering and verification, are robust to occlusions and require no training, but are slow at test time. We conjecture that robust and efficient perception can be achieved by combining generative methods with discriminatively-trained approaches. To this end, we introduce the Discriminatively-guided Deliberative Perception (D2P) paradigm, which has the following desirable properties: a) it is a single search algorithm that looks for the 'best' rendering of the scene that matches the input, b) it can be guided by any number of discriminative algorithms, and c) it generates a solution that is provably bounded suboptimal with respect to the chosen cost function. In addition, we introduce the notions of completeness and resolution completeness for multi-object pose estimation problems, and show that D2P is resolution complete. We conduct extensive evaluations on a benchmark dataset to study various aspects of D2P in relation to existing approaches.
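The abstract's idea of searching for the rendering that best matches the input can be illustrated with a toy example. The sketch below is not the D2P algorithm: it assumes the 3 DoF are planar position and yaw, discretizes yaw into quarter turns, uses a hypothetical L-shaped object model on a 2D grid, and scores every pose exhaustively. It omits the discriminative guidance and the suboptimality bound entirely, and shows only the render-and-verify step over a fixed pose resolution.

```python
from itertools import product

# Hypothetical object model: an L-shaped footprint given as cell offsets
# from the object's origin. Any asymmetric shape would do.
MODEL = [(0, 0), (1, 0), (2, 0), (0, 1)]


def render(x, y, quarter_turns):
    """Return the set of grid cells the model occupies at pose (x, y, yaw)."""
    cells = set()
    for dx, dy in MODEL:
        for _ in range(quarter_turns % 4):
            dx, dy = -dy, dx  # rotate the offset 90 degrees counter-clockwise
        cells.add((x + dx, y + dy))
    return cells


def cost(rendered, observed):
    """Count the cells on which the rendering and the observation disagree."""
    return len(rendered ^ observed)


def best_pose(observed, size):
    """Score every discretized pose on a size x size grid; return the cheapest."""
    poses = product(range(size), range(size), range(4))
    return min(poses, key=lambda pose: cost(render(*pose), observed))


# Simulate an observation by rendering a known pose, then recover it.
observed = render(3, 2, 1)
print(best_pose(observed, size=6))
```

Because every pose at the chosen resolution is scored, this toy search cannot miss a pose that lies on the grid, which is the intuition behind resolution completeness. Its cost grows with the number of poses and objects, which is the slowness that guidance from discriminative algorithms is meant to reduce.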