Self-supervised Single-View 3D Reconstruction via Semantic Consistency

Xueting Li, Sifei Liu, Kihwan Kim, Shalini De Mello, Varun Jampani, Ming Hsuan Yang, Jan Kautz

Research output: Chapter in Book/Report/Conference proceedingConference contribution

29 Citations (Scopus)

Abstract

We learn a self-supervised, single-view 3D reconstruction model that predicts the 3D mesh shape, texture and camera pose of a target object with a collection of 2D images and silhouettes. The proposed method does not necessitate 3D supervision, manually annotated keypoints, multi-view images of an object or a prior 3D template. The key insight of our work is that objects can be represented as a collection of deformable parts, and each part is semantically coherent across different instances of the same category (e.g., wings on birds and wheels on cars). Therefore, by leveraging part segmentation of a large collection of category-specific images learned via self-supervision, we can effectively enforce semantic consistency between the reconstructed meshes and the original images. This significantly reduces ambiguities during joint prediction of shape and camera pose of an object, along with texture. To the best of our knowledge, we are the first to try and solve the single-view reconstruction problem without a category-specific template mesh or semantic keypoints. Thus our model can easily generalize to various object categories without such labels, e.g., horses, penguins, etc. Through a variety of experiments on several categories of deformable and rigid objects, we demonstrate that our unsupervised method performs comparably if not better than existing category-specific reconstruction methods learned with supervision. More details can be found at the project page https://sites.google.com/nvidia.com/unsup-mesh-2020.

Original languageEnglish
Title of host publicationComputer Vision – ECCV 2020 - 16th European Conference, 2020, Proceedings
EditorsAndrea Vedaldi, Horst Bischof, Thomas Brox, Jan-Michael Frahm
PublisherSpringer Science and Business Media Deutschland GmbH
Pages677-693
Number of pages17
ISBN (Print)9783030585679
DOIs
Publication statusPublished - 2020
Event16th European Conference on Computer Vision, ECCV 2020 - Glasgow, United Kingdom
Duration: 2020 Aug 232020 Aug 28

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume12359 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference16th European Conference on Computer Vision, ECCV 2020
Country/TerritoryUnited Kingdom
CityGlasgow
Period20/8/2320/8/28

Bibliographical note

Publisher Copyright:
© 2020, Springer Nature Switzerland AG.

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science(all)

Fingerprint

Dive into the research topics of 'Self-supervised Single-View 3D Reconstruction via Semantic Consistency'. Together they form a unique fingerprint.

Cite this