Unsupervised Learning of Monocular Depth from Videos
Abstract
This paper presents a novel unsupervised framework for single-view depth estimation. The proposed model additionally outputs optical flow and camera motion from monocular video sequences. Without any extra annotations, the three components are predicted jointly by their individual networks, with view synthesis losses serving as the primary supervision during training. Moreover, to handle moving objects in the scene, we segment the scene into static and dynamic parts by comparing the full optical flow with the 2D rigid flow induced by camera ego-motion; only static regions are considered when computing the view synthesis losses. Experimental results on the KITTI dataset show that the proposed algorithm outperforms previous unsupervised approaches and performs comparably with supervised ones.
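The static/dynamic segmentation described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes a pinhole camera with intrinsics `K`, a predicted depth map, and a relative camera pose `(R, t)`; the rigid flow is obtained by back-projecting pixels to 3D, applying the ego-motion, and reprojecting, and pixels where the full optical flow disagrees with this rigid flow beyond a threshold are marked dynamic. The function names and the threshold value are illustrative assumptions.

```python
import numpy as np

def rigid_flow(depth, K, R, t):
    """2D rigid flow induced purely by camera ego-motion.

    depth: (H, W) predicted depth of the source frame
    K:     (3, 3) camera intrinsics
    R, t:  relative rotation (3, 3) and translation (3,) to the target frame
    Returns flow of shape (2, H, W).
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1).astype(float)
    cam = np.linalg.inv(K) @ pix * depth.reshape(1, -1)   # back-project to 3D
    cam2 = R @ cam + t.reshape(3, 1)                      # apply ego-motion
    pix2 = K @ cam2
    pix2 = pix2[:2] / pix2[2:3]                           # perspective divide
    return (pix2 - pix[:2]).reshape(2, H, W)

def static_mask(full_flow, rig_flow, thresh=1.0):
    """Pixels whose full flow agrees with the rigid flow are treated as static.

    The threshold (in pixels) is an illustrative choice, not from the paper.
    """
    diff = np.linalg.norm(full_flow - rig_flow, axis=0)
    return diff < thresh
```

The resulting boolean mask would then gate the view synthesis (photometric) loss, so that dynamic pixels, where rigid warping is invalid, do not contribute to the depth and ego-motion supervision.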