Top-down visual saliency via joint CRF and dictionary learning

Jimei Yang, Ming-Hsuan Yang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

326 Citations (Scopus)

Abstract

Top-down visual saliency facilitates object localization by providing a discriminative representation of target objects and a probability map for reducing the search space. In this paper, we propose a novel top-down saliency model that jointly learns a Conditional Random Field (CRF) and a discriminative dictionary. The proposed model is formulated as a CRF with latent variables. By using sparse codes as the latent variables, we train a dictionary modulated by the CRF and, at the same time, a CRF informed by sparse coding. We propose a max-margin approach to train our model via fast inference algorithms. We evaluate our model on the Graz-02 and PASCAL VOC 2007 datasets. Experimental results show that our model performs favorably against state-of-the-art top-down saliency methods. We also observe that the dictionary update significantly improves model performance.
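The abstract describes an alternating scheme: infer sparse codes as latent variables, update the CRF with a max-margin criterion, and update the dictionary under the CRF's influence. The sketch below is a minimal, illustrative rendering of that idea in NumPy, not the authors' implementation: it keeps only a unary (per-patch) term, so the max-margin step reduces to a hinge-loss update, and it omits the CRF pairwise smoothness term and the paper's fast inference algorithms. All names (sparse_codes, joint_learning, D, w) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sparse_codes(X, D, lam=0.1, n_iter=50):
    """Infer sparse codes (the latent variables) for patch features X
    under dictionary D via ISTA, a standard Lasso solver."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the data term
    S = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        G = D.T @ (D @ S - X) / L          # gradient step on 0.5||DS - X||^2
        S = np.sign(S - G) * np.maximum(np.abs(S - G) - lam / L, 0.0)
    return S

def joint_learning(X, y, n_atoms=64, lr=1e-2, n_epochs=20):
    """Alternate between (1) latent sparse-code inference, (2) a hinge-loss
    (max-margin) update of the unary weights w, and (3) a reconstruction
    gradient step on the dictionary D."""
    d = X.shape[0]
    D = rng.standard_normal((d, n_atoms))
    D /= np.linalg.norm(D, axis=0)         # unit-norm atoms
    w = np.zeros(n_atoms)
    for _ in range(n_epochs):
        S = sparse_codes(X, D)             # latent-variable inference
        margins = y * (w @ S)              # per-patch margins, y in {-1, +1}
        viol = margins < 1                 # margin violations drive the update
        w += lr * (S[:, viol] @ y[viol])   # hinge-loss subgradient step on w
        D -= lr * (D @ S - X) @ S.T        # reconstruction gradient step on D
        D /= np.linalg.norm(D, axis=0)
    return D, w

# Toy usage: 100 patches of dimension 36 with random binary labels.
X = rng.standard_normal((36, 100))
y = rng.choice([-1.0, 1.0], size=100)
D, w = joint_learning(X, y)
saliency = w @ sparse_codes(X, D)          # per-patch top-down saliency scores
```

Because the dictionary update sees the classification (CRF-side) signal only through the shared sparse codes, even this stripped-down loop shows the coupling the abstract highlights: changing D changes the latent codes, which in turn changes the max-margin update of w.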

Original language: English
Title of host publication: 2012 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2012
Pages: 2296-2303
Number of pages: 8
DOIs
Publication status: Published - 2012
Event: 2012 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2012 - Providence, RI, United States
Duration: 2012 Jun 16 → 2012 Jun 21

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print): 1063-6919

Other

Other: 2012 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2012
Country/Territory: United States
City: Providence, RI
Period: 12/6/16 → 12/6/21

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition
