Separable Convolutions for Optimizing 3D Stereo Networks
Abstract
Deep learning-based 3D stereo networks outperform 2D networks and conventional stereo methods. However, this improvement comes at the cost of increased computational complexity, which makes these networks impractical for real-world applications. Specifically, these networks use 3D convolutions as their major workhorse to refine and regress disparities. In this work, we first show that the 3D convolutions in stereo networks consume up to 94% of overall network operations and act as a major bottleneck. Next, we propose a set of “plug-&-run” separable convolutions to reduce the number of parameters and operations. When integrated into existing state-of-the-art stereo networks, these convolutions lead to up to a $7\times$ reduction in the number of operations and up to a $3.5\times$ reduction in parameters without compromising performance. In fact, these convolutions improve performance in the majority of cases.¹

¹ This work is part of the project DeepStereoVision (FRE: 01IS18024B) sponsored by the German Ministry of Education & Research (BMBF).
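To give a rough sense of why separating a 3D convolution saves parameters and operations, the sketch below counts both for a single layer. It assumes a generic depthwise-plus-pointwise factorization, stride 1, "same" padding, and no bias. The abstract does not specify which separable variants the paper proposes, so this is an illustration and not the authors' design. The function names and the layer sizes are hypothetical.

```python
def conv3d_cost(c_in, c_out, k, d, h, w):
    """Parameters and multiply-accumulates (MACs) of a standard 3D convolution.

    Assumes stride 1, 'same' padding, and no bias, so the output volume
    has the same d x h x w size as the input.
    """
    params = c_in * c_out * k ** 3
    macs = params * d * h * w  # every weight is used once per output voxel
    return params, macs


def separable_conv3d_cost(c_in, c_out, k, d, h, w):
    """Cost of a depthwise (k x k x k per channel) plus pointwise (1 x 1 x 1) pair.

    This is one common factorization, assumed here for illustration only.
    """
    depthwise_params = c_in * k ** 3   # one k^3 filter per input channel
    pointwise_params = c_in * c_out    # 1x1x1 convolution mixes channels
    params = depthwise_params + pointwise_params
    macs = params * d * h * w
    return params, macs


# Hypothetical layer: 32 -> 32 channels, 3x3x3 kernel,
# acting on a 48 x 64 x 128 (disparity x height x width) volume.
std_params, std_macs = conv3d_cost(32, 32, 3, 48, 64, 128)
sep_params, sep_macs = separable_conv3d_cost(32, 32, 3, 48, 64, 128)

print("standard :", std_params, "params,", std_macs, "MACs")
print("separable:", sep_params, "params,", sep_macs, "MACs")
print("per-layer parameter ratio:", std_params / sep_params)
```

The ratio printed here is for one isolated layer under the stated assumptions. Whole-network savings, such as the figures quoted in the abstract, also depend on the layers that are left unchanged, so the two numbers are not directly comparable.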