Abstract
Convolutional Neural Networks (CNNs) have been applied to visual tracking with demonstrated success in recent years. Most CNN-based trackers utilize hierarchical features extracted from a certain layer to represent the target. However, features from a single layer are not always effective for distinguishing the target object from the background, especially in the presence of complicated interfering factors (e.g., heavy occlusion, background clutter, illumination variation, and shape deformation). In this work, we propose a CNN-based tracking algorithm that hedges deep features from different CNN layers to better distinguish target objects from background clutter. Correlation filters are applied to the feature maps of each CNN layer to construct a weak tracker, and all weak trackers are hedged into a strong one. For robust visual tracking, we propose a hedge method that adaptively determines the weights of the weak trackers by considering both the difference between the historical and instantaneous performance of each weak tracker, and the differences among all weak trackers over time. In addition, we design a Siamese network to define the loss of each weak tracker for the proposed hedge method. Extensive experiments on large benchmark datasets demonstrate the effectiveness of the proposed algorithm against state-of-the-art tracking methods.
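To make the hedging idea concrete, below is a minimal sketch of how per-layer weak trackers might be fused and re-weighted. It uses a generic multiplicative-weights (Hedge) update rather than the paper's adaptive rule, which additionally accounts for historical versus instantaneous performance; the function names, the fixed learning rate `eta`, and the toy data are all illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: each CNN layer yields a weak tracker whose output is
# a correlation-filter response map; a strong tracker fuses these maps with
# adaptive weights. This uses a standard Hedge (multiplicative-weights)
# update, NOT the paper's exact adaptive hedge rule.
import numpy as np

def hedge_combine(responses, weights):
    """Fuse K per-layer response maps (K, H, W) into one (H, W) map."""
    return np.tensordot(weights, np.asarray(responses), axes=1)

def hedge_update(weights, losses, eta=0.5):
    """Down-weight weak trackers with high loss, then renormalize."""
    w = weights * np.exp(-eta * np.asarray(losses))
    return w / w.sum()

# Toy usage: three weak trackers, one tracking step.
K, H, W = 3, 32, 32
weights = np.full(K, 1.0 / K)                  # start from uniform weights
responses = np.random.rand(K, H, W)            # stand-in response maps
fused = hedge_combine(responses, weights)
peak = np.unravel_index(fused.argmax(), fused.shape)  # predicted target location
losses = np.random.rand(K)                     # per-tracker losses (illustrative;
                                               # the paper derives these from a
                                               # Siamese network)
weights = hedge_update(weights, losses)
```

In the paper's formulation, the loss of each weak tracker is supplied by the Siamese network rather than a hand-crafted measure, and the weight update adapts to how stable each weak tracker has been over time instead of using a fixed `eta`.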
| Original language | English |
|---|---|
| Article number | 8344501 |
| Pages (from-to) | 1116-1130 |
| Number of pages | 15 |
| Journal | IEEE Transactions on Pattern Analysis and Machine Intelligence |
| Volume | 41 |
| Issue number | 5 |
| DOIs | |
| Publication status | Published - May 1, 2019 |
Bibliographical note
Funding Information: This work was supported in part by the National Natural Science Foundation of China: 61620106009, 61332016, U1636214, 61650202, 61672188, 61572465, 61390510, 61732007, 61472103, 61772158, and U1711265; in part by the Key Research Program of Frontier Sciences, CAS: QYZDJ-SSW-SYS013; in part by the NRF grant funded by the Ministry of Science and ICT, Korea: NRF-2017R1A2B4011928 and NRF-2017M3C4A7069369; in part by the NSF CAREER Grant 1149783; and by gifts from Adobe, Verisk, and Nvidia. S. Zhang was also supported by the Young Excellent Talent Program of Harbin Institute of Technology.
Publisher Copyright:
© 1979-2012 IEEE.
All Science Journal Classification (ASJC) codes
- Software
- Computer Vision and Pattern Recognition
- Computational Theory and Mathematics
- Artificial Intelligence
- Applied Mathematics