Abstract
Temporal action localization has been one of the most popular tasks in video understanding, due to the importance of detecting action instances in videos. However, not much progress has been made on extending it to work in an online fashion, although many video related tasks can benefit by going online with the growing video streaming services. To this end, we introduce a new task called Online Temporal Action Localization (On-TAL), in which the goal is to immediately detect action instances from an untrimmed streaming video. The online setting makes the new task very challenging as the actionness decision for every frame has to be made without access to future frames and also because post-processing methods cannot be used to modify past action proposals. We propose a novel framework, Context-Aware Actionness Grouping (CAG) as a solution for On-TAL and train it with the imitation learning algorithm, which allows us to avoid sophisticated reward engineering. Evaluation of our work on THUMOS14 and Activitynet1.3 shows significant improvement over non-naive baselines, demonstrating the effectiveness of our approach. As a by-product, our method can also be used for the Online Detection of Action Start (ODAS), in which our method also outperforms previous state-of-the-art models.
Original language | English |
---|---|
Title of host publication | Proceedings - 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 13709-13718 |
Number of pages | 10 |
ISBN (Electronic) | 9781665428125 |
DOIs | |
Publication status | Published - 2021 |
Event | 18th IEEE/CVF International Conference on Computer Vision, ICCV 2021 - Virtual, Online, Canada Duration: 2021 Oct 11 → 2021 Oct 17 |
Publication series
Name | Proceedings of the IEEE International Conference on Computer Vision |
---|---|
ISSN (Print) | 1550-5499 |
Conference
Conference | 18th IEEE/CVF International Conference on Computer Vision, ICCV 2021 |
---|---|
Country/Territory | Canada |
City | Virtual, Online |
Period | 21/10/11 → 21/10/17 |
Bibliographical note
Funding Information:This work was conducted by Center for Applied Research in Artificial Intelligence(CARAI) grant funded by Defense Acquisition Program Administration(DAPA) and Agency for Defense Development(ADD) (UD190031RD), and also by Institute of Information & communications Technology Planning & evaluation (IITP) grant funded by the Korea government(MSIT), Artifitial Intelligence Graduate School Program, Yonsei University, under Grant 2020-0-01361
Publisher Copyright:
© 2021 IEEE
All Science Journal Classification (ASJC) codes
- Software
- Computer Vision and Pattern Recognition