Background suppression network for weakly-supervised temporal action localization

Pilhyeon Lee, Youngjung Uh, Hyeran Byun

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

112 Citations (Scopus)


Weakly-supervised temporal action localization is a very challenging problem because frame-wise labels are not given in the training stage; the only hint is the video-level labels, which indicate whether each video contains action frames of interest. Previous methods aggregate frame-level class scores to produce a video-level prediction and learn from video-level action labels. This formulation does not fully model the problem, in that background frames are forced to be misclassified as action classes in order to predict video-level labels accurately. In this paper, we design the Background Suppression Network (BaS-Net), which introduces an auxiliary class for background and has a two-branch weight-sharing architecture with an asymmetrical training strategy. This enables BaS-Net to suppress activations from background frames and thereby improve localization performance. Extensive experiments demonstrate the effectiveness of BaS-Net and its superiority over the state-of-the-art methods on the most popular benchmarks, THUMOS'14 and ActivityNet. Our code and the trained model are available at
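The abstract does not spell out how frame-level scores are aggregated or how the two branches differ during training, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes top-k mean pooling as the aggregation step, a per-frame foreground weight in [0, 1] that scales the scores in the second branch, and opposite background targets for the two branches as the "asymmetrical" part. All function names, the weights, and the toy scores are hypothetical.

```python
import math
from typing import List


def topk_mean(frame_scores: List[List[float]], k: int) -> List[float]:
    """Aggregate frame-level class scores (T x C) into one score per class
    by averaging the k highest frame scores of each class (assumed pooling)."""
    num_classes = len(frame_scores[0])
    video_scores = []
    for c in range(num_classes):
        column = sorted((frame[c] for frame in frame_scores), reverse=True)
        top = column[:k]
        video_scores.append(sum(top) / len(top))
    return video_scores


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over the class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def suppress(frame_scores: List[List[float]],
             foreground_weights: List[float]) -> List[List[float]]:
    """Scale every frame's scores by a foreground weight in [0, 1].
    Where the weights come from is an assumption left out of this sketch."""
    return [[w * s for s in frame]
            for frame, w in zip(frame_scores, foreground_weights)]


def branch_targets(action_labels: List[float],
                   background_present: bool) -> List[float]:
    """Append the auxiliary background target to the video-level labels."""
    return action_labels + [1.0 if background_present else 0.0]


# Toy video: 4 frames, 2 action classes plus a background class (last column).
frames = [[4.0, 0.0, 0.5], [3.0, 0.0, 1.0], [0.0, 0.0, 5.0], [0.0, 0.5, 6.0]]
weights = [1.0, 1.0, 0.0, 0.0]  # hypothetical: last two frames are background

base_probs = softmax(topk_mean(frames, k=2))
supp_probs = softmax(topk_mean(suppress(frames, weights), k=2))

# Assumed asymmetry: background is a positive target in the first branch
# and a negative target in the branch with suppressed frames.
base_target = branch_targets([1.0, 0.0], background_present=True)
supp_target = branch_targets([1.0, 0.0], background_present=False)
```

In this toy example the background class dominates the video-level prediction before suppression, and the first action class dominates once the background frames are scaled to zero, which is the effect the abstract attributes to background suppression.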

Original language: English
Title of host publication: AAAI 2020 - 34th AAAI Conference on Artificial Intelligence
Publisher: AAAI Press
Number of pages: 8
ISBN (Electronic): 9781577358350
Publication status: Published - 2020
Event: 34th AAAI Conference on Artificial Intelligence, AAAI 2020 - New York, United States
Duration: 2020 Feb 7 to 2020 Feb 12

Publication series

Name: AAAI 2020 - 34th AAAI Conference on Artificial Intelligence


Conference: 34th AAAI Conference on Artificial Intelligence, AAAI 2020
Country/Territory: United States
City: New York

Bibliographical note

Funding Information:
This project was partly supported by the Next-Generation Information Computing Development Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT (NRF-2017M3C4A7069370) and the Institute for Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (No. 2019-0-01558: Study on audio, video, 3D map and activation map generation system using deep generative model).

Publisher Copyright:
© 2020, Association for the Advancement of Artificial Intelligence.

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence

