ER-Depth: Enhancing the Robustness of Self-Supervised Monocular Depth Estimation in Challenging Scenes
Abstract
Self-supervised monocular depth estimation holds significant importance in the fields of autonomous driving and robotics. However, existing methods are typically trained and evaluated on clear, sunny datasets, overlooking the impact of various adverse conditions commonly encountered in real-world applications, such as rainy weather, low visibility, and motion blur. As a result, they often struggle in challenging scenarios and produce artifacts. To address this issue, we propose ER-Depth, a novel two-stage self-supervised framework designed for robust depth estimation. In the first stage, we propose perturbation-invariant depth consistency regularization to propagate reliable supervision from standard to challenging scenes. In the second stage, we adopt the Mean Teacher paradigm for self-distillation and present a novel consistency-based pseudo-label filtering strategy to improve the quality of pseudo-labels. Extensive experiments demonstrate that our method exhibits exceptional robustness in challenging scenarios while maintaining high performance in standard scenes, significantly outperforming existing state-of-the-art methods on challenging KITTI-C, DrivingStereo, and NuScenes-Night benchmarks. Project page: https://ruijiezhu94.github.io/ERDepth_page .
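The abstract's second stage rests on two standard ingredients: a Mean Teacher whose weights track an exponential moving average (EMA) of the student, and a filter that keeps pseudo-labels only where the teacher is self-consistent. Below is a minimal sketch of that paradigm with NumPy arrays standing in for network weights and depth maps; the function names, the relative-difference criterion, and the `rel_thresh` value are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def ema_update(teacher, student, momentum=0.999):
    """Mean Teacher update: each teacher weight is an exponential
    moving average of the corresponding student weight."""
    return {k: momentum * teacher[k] + (1 - momentum) * student[k]
            for k in teacher}

def consistency_mask(depth_clean, depth_aug, rel_thresh=0.05):
    """Hypothetical consistency-based pseudo-label filter: keep only
    pixels where the teacher's depth on the clean image and on its
    perturbed counterpart agree within a relative threshold."""
    rel_diff = np.abs(depth_clean - depth_aug) / np.maximum(depth_clean, 1e-6)
    return rel_diff < rel_thresh

# Toy example: teacher depth maps for a clean frame and its
# weather-augmented version; inconsistent pixels are masked out.
clean = np.array([[10.0, 20.0], [30.0, 40.0]])
aug = np.array([[10.2, 25.0], [30.5, 40.1]])
mask = consistency_mask(clean, aug)
print(mask)  # only the pixel with a 25% depth disagreement is rejected
```

The student is then supervised only at pixels where `mask` is true, so unreliable teacher predictions in degraded regions (rain streaks, glare, blur) do not propagate into the pseudo-labels.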