Learning Spatial and Spatio-Temporal Pixel Aggregations for Image and Video Denoising

Xiangyu Xu, Muchen Li, Wenxiu Sun, Ming Hsuan Yang

Research output: Contribution to journal › Article › peer-review

32 Citations (Scopus)


Existing denoising methods typically restore clear results by aggregating pixels from the noisy input. Instead of relying on hand-crafted aggregation schemes, we propose to explicitly learn this process with deep neural networks. We present a spatial pixel aggregation network and learn the pixel sampling and averaging strategies for image denoising. The proposed model naturally adapts to image structures and can effectively improve the denoised results. Furthermore, we develop a spatio-temporal pixel aggregation network for video denoising to efficiently sample pixels across the spatio-temporal space. Our method is able to solve the misalignment issues caused by large motion in dynamic scenes. In addition, we introduce a new regularization term for effectively training the proposed video denoising model. We present extensive analysis of the proposed method and demonstrate that our model performs favorably against the state-of-the-art image and video denoising approaches on both synthetic and real-world data.
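As a rough illustration of what "aggregating pixels" means here, the sketch below computes each output pixel as a weighted sum of input pixels sampled at per-pixel offsets. It is a minimal sketch under stated assumptions, not the authors' implementation: in the paper a neural network learns the sampling and averaging strategies, whereas here the offsets and weights are simply passed in. The names `bilinear_sample` and `aggregate_pixels`, the use of bilinear interpolation, and the border clamping are all hypothetical choices made for this example.

```python
import math


def bilinear_sample(image, y, x):
    """Sample a 2-D image (list of rows) at fractional (y, x).

    Coordinates outside the image are clamped to the border
    (an assumption of this sketch).
    """
    h, w = len(image), len(image[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    fy, fx = y - y0, x - x0
    top = (1 - fx) * image[y0][x0] + fx * image[y0][x1]
    bottom = (1 - fx) * image[y1][x0] + fx * image[y1][x1]
    return (1 - fy) * top + fy * bottom


def aggregate_pixels(noisy, offsets, weights):
    """Weighted aggregation of sampled pixels.

    offsets[y][x] is a list of (dy, dx) sampling offsets for pixel (y, x);
    weights[y][x] is the matching list of averaging weights.
    In a learned model both would be predicted by a network; here they
    are plain inputs.
    """
    h, w = len(noisy), len(noisy[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = sum(
                wt * bilinear_sample(noisy, y + dy, x + dx)
                for (dy, dx), wt in zip(offsets[y][x], weights[y][x])
            )
    return out


# Usage: a fixed 3x3 grid with uniform weights reduces to a box filter.
box = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
image = [[float(5 * r + c) for c in range(5)] for r in range(5)]
offsets = [[box for _ in range(5)] for _ in range(5)]
weights = [[[1.0 / 9.0] * 9 for _ in range(5)] for _ in range(5)]
smoothed = aggregate_pixels(image, offsets, weights)
```

With a fixed grid and uniform weights this is an ordinary box filter. Letting the offsets and weights vary per pixel is what would allow such a scheme to follow image structures, and extending the offsets with a frame index would give a spatio-temporal variant for video.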

Original language: English
Article number: 9110760
Pages (from-to): 7153-7165
Number of pages: 13
Journal: IEEE Transactions on Image Processing
Publication status: Published - 2020

Bibliographical note

Publisher Copyright:
© 1992-2012 IEEE.

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Graphics and Computer-Aided Design

