Integrating Pose and Mask Predictions for Multi-person in Videos

Miran Heo, Sukjun Hwang, Seoung Wug Oh, Joon Young Lee, Seon Joo Kim

Research output: Chapter in Book/Report/Conference proceedingConference contribution

1 Citation (Scopus)

Abstract

In real-world applications for video editing, humans are arguably the most important objects. When editing videos of humans, the efficient tracking of fine-grained masks and body joints is the fundamental requirement. In this paper, we propose a simple and efficient system for jointly tracking pose and segmenting high-quality masks for all humans in the video. We design a pipeline that globally tracks pose and locally segments fine-grained masks. Specifically, CenterTrack is first employed to track human poses by viewing the whole scene, and then the proposed local segmentation network leverages the pose information as a powerful query to carry out high-quality segmentation. Furthermore, we adopt a highly light-weight MLP-Mixer layer within the segmentation network that can efficiently propagate the query pose throughout the region of interest with minimal overhead. For the evaluation, we collect a new benchmark called KineMask which includes various appearances and actions. The experimental results demonstrate that our method has superior fine-grained segmentation performance. Moreover, it runs at 33 fps, achieving a great balance of speed and accuracy compared to the prevailing online Video Instance Segmentation methods.

Original languageEnglish
Title of host publicationProceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2022
PublisherIEEE Computer Society
Pages2656-2665
Number of pages10
ISBN (Electronic)9781665487399
DOIs
Publication statusPublished - 2022
Event2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2022 - New Orleans, United States
Duration: 2022 Jun 192022 Jun 20

Publication series

NameIEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
Volume2022-June
ISSN (Print)2160-7508
ISSN (Electronic)2160-7516

Conference

Conference2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2022
Country/TerritoryUnited States
CityNew Orleans
Period22/6/1922/6/20

Bibliographical note

Publisher Copyright:
© 2022 IEEE.

All Science Journal Classification (ASJC) codes

  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering

Fingerprint

Dive into the research topics of 'Integrating Pose and Mask Predictions for Multi-person in Videos'. Together they form a unique fingerprint.

Cite this