Efficient video coding based on audio-visual focus of attention

Jong Seok Lee, Francesca De Simone, Touradj Ebrahimi

Research output: Contribution to journal › Article › peer-review

34 Citations (Scopus)

Abstract

This paper proposes an efficient video coding method using audio-visual focus of attention, which is based on the observation that sound-emitting regions in an audio-visual sequence draw viewers' attention. First, an audio-visual source localization algorithm is presented, where the sound source is identified by using the correlation between the sound signal and the visual motion information. The localization result is then used to encode different regions in the scene with different quality in such a way that regions close to the source are encoded with higher quality than those far from the source. This is implemented in the framework of H.264/AVC by assigning different quantization parameters for different regions. Through experiments with both standard and high definition sequences, it is demonstrated that the proposed method can yield considerable coding gains over the constant quantization mode of H.264/AVC without noticeable degradation of perceived quality.
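The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the correlation measure, the region granularity, or the mapping from distance to quantization parameter. The sketch assumes a per-frame audio energy signal, a per-block motion magnitude array, Pearson correlation over time as the localization cue, and a stepwise QP increase with distance from the source. The names `localize_source`, `qp_map`, `base_qp`, `max_offset`, and `radius` are hypothetical.

```python
import numpy as np


def localize_source(audio_energy, motion):
    """Pick the block whose motion correlates best with the audio (assumed cue).

    audio_energy: shape (T,), one audio energy value per video frame.
    motion: shape (T, H, W), motion magnitude per block and frame.
    Returns ((row, col), corr), where corr is the (H, W) correlation map.
    """
    a = np.asarray(audio_energy, dtype=float)
    m = np.asarray(motion, dtype=float)
    a = a - a.mean()
    m = m - m.mean(axis=0)
    # Pearson correlation over time, computed independently for each block.
    num = np.tensordot(a, m, axes=(0, 0))
    den = np.sqrt((a ** 2).sum() * (m ** 2).sum(axis=0))
    corr = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return np.unravel_index(np.argmax(corr), corr.shape), corr


def qp_map(shape, source, base_qp=26, max_offset=8, radius=3.0):
    """Assign a QP to each block: base_qp at the source, larger farther away.

    The stepwise mapping and its parameters are illustrative assumptions.
    """
    rows, cols = np.indices(shape)
    dist = np.hypot(rows - source[0], cols - source[1])
    offset = np.clip(np.floor(dist / radius), 0, max_offset)
    # H.264/AVC quantization parameters for 8-bit video lie in [0, 51].
    return np.clip(base_qp + offset, 0, 51).astype(int)
```

In this sketch a larger QP means coarser quantization, so blocks far from the estimated source receive fewer bits. A real encoder integration would pass such a map to the encoder's per-macroblock QP control, which is outside the scope of this example.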

Original language: English
Pages (from-to): 704-711
Number of pages: 8
Journal: Journal of Visual Communication and Image Representation
Volume: 22
Issue number: 8
Publication status: Published - 2011 Nov

Bibliographical note

Funding Information:
The research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/2007–2011) under Grant Agreement No. 21644 (PetaMedia) and the Swiss National Foundation for Scientific Research in the framework of the NCCR Interactive Multimodal Information Management (IM2).

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Media Technology
  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
