Establishing dense semantic correspondences between object instances remains a challenging problem due to background clutter, significant scale and pose differences, and large intra-class variations. In this paper, we present an end-to-end trainable network for learning semantic correspondences using only matching image pairs, without manual keypoint correspondence annotations. To facilitate network training with this weaker form of supervision, we (1) explicitly estimate the foreground regions to suppress the effect of background clutter and (2) develop cycle-consistent losses that enforce the predicted transformations across multiple images to be geometrically plausible and consistent. We train the proposed model on the PF-PASCAL dataset and evaluate its performance on the PF-PASCAL, PF-WILLOW, and TSS datasets. Extensive experimental results show that the proposed approach performs favorably against the state-of-the-art. The code and model will be available at https://yunchunchen.github.io/WeakMatchNet/.
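The abstract's cycle-consistency idea can be illustrated with a minimal sketch: if the transformation from image A to image B composed with the transformation from B back to A maps points far from where they started, the pair of predictions is penalized. This is an assumed NumPy toy with affine transforms; the function names (`transform`, `cycle_consistency_loss`) and the affine parameterization are illustrative, not the paper's actual implementation.

```python
import numpy as np

def transform(points, theta):
    """Apply a 2x3 affine transform to an (N, 2) array of points."""
    ones = np.ones((points.shape[0], 1))
    return np.hstack([points, ones]) @ theta.T  # (N, 2)

def cycle_consistency_loss(theta_ab, theta_ba, grid):
    """Mean distance between grid points and their A -> B -> A round trip.

    theta_ab: 2x3 affine transform predicted from image A to image B
    theta_ba: 2x3 affine transform predicted from image B back to image A
    grid:     (N, 2) sample coordinates in image A

    If the two transforms are mutually consistent (inverses of each
    other), the round trip returns each point to itself and the loss is 0.
    """
    round_trip = transform(transform(grid, theta_ab), theta_ba)
    return float(np.mean(np.linalg.norm(round_trip - grid, axis=1)))
```

In the paper's weakly supervised setting, a loss of this form supplies a training signal without keypoint annotations: only matching image pairs are needed, since consistency is checked against the predictions themselves.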
|Title of host publication||Computer Vision - ACCV 2018 - 14th Asian Conference on Computer Vision, Revised Selected Papers|
|Editors||C.V. Jawahar, Konrad Schindler, Hongdong Li, Greg Mori|
|Number of pages||16|
|Publication status||Published - 2019|
|Event||14th Asian Conference on Computer Vision, ACCV 2018 - Perth, Australia|
Duration: 2018 Dec 2 → 2018 Dec 6
|Name||Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|
|Conference||14th Asian Conference on Computer Vision, ACCV 2018|
|Period||2018/12/2 → 2018/12/6|
|Bibliographical note||Publisher Copyright: © 2019, Springer Nature Switzerland AG.|
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Computer Science (all)