Learning depth from a single image using visual-depth words

Sunok Kim, Sunghwan Choi, Kwanghoon Sohn

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)


Estimating depth from a single monocular image is a fundamental problem in computer vision. Traditional methods for such estimation usually require complicated and sometimes labor-intensive processing. In this paper, we propose a new perspective for this problem and suggest a new gradient-domain learning framework which is much simpler and more efficient. Inspired by the observation that there is substantial co-occurrence of image edges and depth discontinuities in natural scenes, we learn the relationship between local appearance features and corresponding depth gradients by making use of the K-means clustering algorithm within the image feature space. We then encode each cluster centroid with its associated depth gradients, which defines visual-depth words that model the image-depth relationship very well. This enables one to estimate the scene depth for an arbitrary image by simply selecting proper depth gradients from a compact dictionary of visual-depth words, followed by a Poisson surface reconstruction. Experimental results demonstrate that the proposed gradient-domain approach outperforms state-of-the-art methods both qualitatively and quantitatively and is generic over (unseen) scene categories which are not used for training.
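The pipeline described in the abstract (cluster appearance features with K-means, attach depth gradients to each centroid, look up gradients for a new image, then integrate them) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the abstract does not specify the feature descriptor, the dictionary size, or the solver. The function names (`build_dictionary`, `predict_gradients`, `poisson_integrate`) are hypothetical, each visual-depth word is taken to be the mean depth gradient of its cluster, and the Poisson step is approximated by a dense least-squares integration whose normal equations form a discrete Poisson equation.

```python
import numpy as np

def assign(features, centroids):
    """Index of the nearest centroid for every feature vector."""
    d = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def kmeans(features, k, iters=20, seed=0):
    """Plain Lloyd's K-means in feature space; returns the centroids."""
    rng = np.random.default_rng(seed)
    centroids = features[rng.choice(len(features), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign(features, centroids)
        for j in range(k):
            members = features[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids

def build_dictionary(features, depth_grads, k):
    """Pair each centroid with the mean (gx, gy) depth gradient of its cluster.

    Assumption: averaging is used to encode a word; the paper may differ.
    """
    centroids = kmeans(features, k)
    labels = assign(features, centroids)
    words = np.zeros((k, 2))
    for j in range(k):
        mask = labels == j
        if mask.any():
            words[j] = depth_grads[mask].mean(axis=0)
    return centroids, words

def predict_gradients(features, centroids, words):
    """Select the depth gradient of the nearest visual-depth word."""
    return words[assign(features, centroids)]

def poisson_integrate(gx, gy):
    """Recover depth from forward-difference gradients by least squares.

    Depth is only defined up to a constant, so D[0, 0] is pinned to 0.
    Dense solve: suitable for small grids only.
    """
    h, w = gx.shape
    idx = np.arange(h * w).reshape(h, w)
    left, right = idx[:, :-1].ravel(), idx[:, 1:].ravel()
    top, bottom = idx[:-1, :].ravel(), idx[1:, :].ravel()
    m = len(left) + len(top) + 1
    A = np.zeros((m, h * w))
    rhs = np.zeros(m)
    r = np.arange(len(left))                 # D[y, x+1] - D[y, x] = gx[y, x]
    A[r, left], A[r, right] = -1.0, 1.0
    rhs[r] = gx[:, :-1].ravel()
    r = len(left) + np.arange(len(top))      # D[y+1, x] - D[y, x] = gy[y, x]
    A[r, top], A[r, bottom] = -1.0, 1.0
    rhs[r] = gy[:-1, :].ravel()
    A[-1, 0] = 1.0                           # anchor row: D[0, 0] = 0
    depth = np.linalg.lstsq(A, rhs, rcond=None)[0]
    return depth.reshape(h, w)
```

In use, `features` would hold one appearance descriptor per pixel or patch and `depth_grads` the matching ground-truth depth gradients from training data; at test time the predicted gradients are reshaped into two image-sized arrays and passed to `poisson_integrate`. A sparse or FFT-based solver would be the natural replacement for the dense solve on full-resolution images.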

Original language: English
Title of host publication: 2015 IEEE International Conference on Image Processing, ICIP 2015 - Proceedings
Publisher: IEEE Computer Society
Number of pages: 5
ISBN (Electronic): 9781479983391
Publication status: Published - 2015 Dec 9
Event: IEEE International Conference on Image Processing, ICIP 2015 - Quebec City, Canada
Duration: 2015 Sept 27 → 2015 Sept 30

Publication series

Name: Proceedings - International Conference on Image Processing, ICIP
ISSN (Print): 1522-4880


Other: IEEE International Conference on Image Processing, ICIP 2015
City: Quebec City

Bibliographical note

Publisher Copyright:
© 2015 IEEE.

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition
  • Signal Processing

