Abstract
Most online video understanding tasks aim to process each streaming frame immediately and output predictions frame by frame. Online Temporal Action Localization (On-TAL) was recently proposed to extend existing online video tasks to instance-level predictions. However, simple On-TAL approaches that group per-frame predictions are limited by their lack of instance-level context. To this end, we propose the Online Anchor Transformer (OAT), which extends the anchor-based action localization model to the online setting. We also introduce an online-applicable post-processing method that suppresses repetitive action proposals. Evaluations of On-TAL on THUMOS’14, MUSES, and BBDB show significant improvements in terms of mAP, and with a minor change to the post-processing method, our model performs comparably to state-of-the-art offline TAL methods. In addition to mAP evaluation, we present a new online-oriented metric of early detection for On-TAL and measure the responsiveness of each On-TAL approach.
Original language | English |
---|---|
Title of host publication | Computer Vision – ECCV 2022 - 17th European Conference, Proceedings |
Editors | Shai Avidan, Gabriel Brostow, Moustapha Cissé, Giovanni Maria Farinella, Tal Hassner |
Publisher | Springer Science and Business Media Deutschland GmbH |
Pages | 653-669 |
Number of pages | 17 |
ISBN (Print) | 9783031198298 |
Publication status | Published - 2022 |
Event | 17th European Conference on Computer Vision, ECCV 2022 - Tel Aviv, Israel; Duration: 2022 Oct 23 → 2022 Oct 27 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 13694 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 17th European Conference on Computer Vision, ECCV 2022 |
---|---|
Country/Territory | Israel |
City | Tel Aviv |
Period | 22/10/23 → 22/10/27 |
Bibliographical note
Funding Information: Acknowledgements. This work was partly supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (NRF-2022R1A2C2004509) and by the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT), Artificial Intelligence Innovation Hub under Grant 2021-0-02068, Artificial Intelligence Graduate School Program under Grant 2020-0-01361.
Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Computer Science (all)