We introduce ViSER, a method for recovering articulated 3D shapes and dense 3D trajectories from monocular videos. Previous work on high-quality reconstruction of dynamic 3D shapes typically relies on multiple camera views, strong category-specific priors, or 2D keypoint supervision. We show that none of these are required if one can reliably estimate long-range correspondences in a video, making use of only 2D object masks and two-frame optical flow as inputs. ViSER infers correspondences by matching 2D pixels to a canonical, deformable 3D mesh via video-specific surface embeddings that capture the pixel appearance of each surface point. These embeddings behave as a continuous set of keypoint descriptors defined over the mesh surface, which can be used to establish dense long-range correspondences across pixels. The surface embeddings are implemented as coordinate-based MLPs that are fit to each video via self-supervised losses. Experimental results show that ViSER compares favorably against prior work on challenging videos of humans with loose clothing and unusual poses, as well as animal videos from DAVIS and YTVOS. Project page: viser-shape.github.io.
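As a rough illustration of the pixel-to-surface matching idea described in the abstract, the sketch below soft-assigns one pixel embedding to a set of mesh vertices. It is not the authors' implementation: the function names, the cosine-similarity score, the softmax temperature, and the toy embeddings are all assumptions made for illustration. In ViSER the surface embeddings come from a coordinate-based MLP fit to each video, whereas here they are fixed vectors.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + 1e-12)

def match_pixel_to_surface(pixel_emb, vertex_embs, vertex_xyz, temperature=0.1):
    """Soft-match one pixel embedding to mesh vertices (hypothetical helper).

    Returns (weights, expected_xyz): a softmax distribution over vertices
    and the weighted average of their 3D positions.
    """
    scores = [cosine(pixel_emb, v) / temperature for v in vertex_embs]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    expected_xyz = tuple(
        sum(w * p[k] for w, p in zip(weights, vertex_xyz)) for k in range(3)
    )
    return weights, expected_xyz

# Toy data (assumed): three vertices with 2D embeddings and 3D positions.
vertex_embs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
vertex_xyz = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

weights, xyz = match_pixel_to_surface([0.9, 0.1], vertex_embs, vertex_xyz)
best = max(range(len(weights)), key=weights.__getitem__)
```

Matching every pixel of every frame to the same canonical surface in this way is what would link pixels across distant frames: two pixels that map to the same surface point correspond to each other, regardless of how far apart their frames are.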
Author Information
Gengshan Yang (Carnegie Mellon University)
Deqing Sun (Google)
Varun Jampani (Google)
Daniel Vlasic (Massachusetts Institute of Technology)
Forrester Cole (Google Research)
Ce Liu (Microsoft)
Deva Ramanan (Carnegie Mellon University)
Related Events (a corresponding poster, oral, or spotlight)
-
2021 Poster: ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction »
Thu. Dec 9th 04:30 -- 06:00 PM
More from the Same Authors
-
2021 : Argoverse 2: Next Generation Datasets for Self-Driving Perception and Forecasting »
Benjamin Wilson · William Qi · Tanmay Agarwal · John Lambert · Jagjeet Singh · Siddhesh Khandelwal · Bowen Pan · Ratnesh Kumar · Andrew Hartnett · Jhony Kaesemodel Pontes · Deva Ramanan · Peter Carr · James Hays -
2021 : The CLEAR Benchmark: Continual LEArning on Real-World Imagery »
Zhiqiu Lin · Jia Shi · Deepak Pathak · Deva Ramanan -
2022 Poster: D^2NeRF: Self-Supervised Decoupling of Dynamic and Static Objects from a Monocular Video »
Tianhao Wu · Fangcheng Zhong · Andrea Tagliasacchi · Forrester Cole · Cengiz Oztireli -
2022 Poster: OmniVL: One Foundation Model for Image-Language and Video-Language Tasks »
Junke Wang · Dongdong Chen · Zuxuan Wu · Chong Luo · Luowei Zhou · Yucheng Zhao · Yujia Xie · Ce Liu · Yu-Gang Jiang · Lu Yuan -
2022 Spotlight: OmniVL: One Foundation Model for Image-Language and Video-Language Tasks »
Junke Wang · Dongdong Chen · Zuxuan Wu · Chong Luo · Luowei Zhou · Yucheng Zhao · Yujia Xie · Ce Liu · Yu-Gang Jiang · Lu Yuan -
2022 Poster: K-LITE: Learning Transferable Visual Models with External Knowledge »
Sheng Shen · Chunyuan Li · Xiaowei Hu · Yujia Xie · Jianwei Yang · Pengchuan Zhang · Zhe Gan · Lijuan Wang · Lu Yuan · Ce Liu · Kurt Keutzer · Trevor Darrell · Anna Rohrbach · Jianfeng Gao -
2022 Poster: Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone »
Zi-Yi Dou · Aishwarya Kamath · Zhe Gan · Pengchuan Zhang · Jianfeng Wang · Linjie Li · Zicheng Liu · Ce Liu · Yann LeCun · Nanyun Peng · Jianfeng Gao · Lijuan Wang -
2022 Poster: LASSIE: Learning Articulated Shapes from Sparse Image Ensemble via 3D Part Discovery »
Chun-Han Yao · Wei-Chih Hung · Yuanzhen Li · Michael Rubinstein · Ming-Hsuan Yang · Varun Jampani -
2022 Poster: SAMURAI: Shape And Material from Unconstrained Real-world Arbitrary Image collections »
Mark Boss · Andreas Engelhardt · Abhishek Kar · Yuanzhen Li · Deqing Sun · Jonathan Barron · Hendrik PA Lensch · Varun Jampani -
2022 Poster: Subsidiary Prototype Alignment for Universal Domain Adaptation »
Jogendra Nath Kundu · Suvaansh Bhambri · Akshay R Kulkarni · Hiran Sarkar · Varun Jampani · Venkatesh Babu R -
2022 Poster: Associating Objects and Their Effects in Video through Coordination Games »
Erika Lu · Forrester Cole · Weidi Xie · Tali Dekel · Bill Freeman · Andrew Zisserman · Michael Rubinstein -
2022 Poster: Visual Clues: Bridging Vision and Language Foundations for Image Paragraph Captioning »
Yujia Xie · Luowei Zhou · Xiyang Dai · Lu Yuan · Nguyen Bach · Ce Liu · Michael Zeng -
2022 Poster: Continual Learning with Evolving Class Ontologies »
Zhiqiu Lin · Deepak Pathak · Yu-Xiong Wang · Deva Ramanan · Shu Kong -
2022 Poster: Learning to Discover and Detect Objects »
Vladimir Fomenko · Ismail Elezi · Deva Ramanan · Laura Leal-Taixé · Aljosa Osep -
2022 Poster: Polynomial Neural Fields for Subband Decomposition and Manipulation »
Guandao Yang · Sagie Benaim · Varun Jampani · Kyle Genova · Jonathan Barron · Thomas Funkhouser · Bharath Hariharan · Serge Belongie -
2021 Poster: Robust Visual Reasoning via Language Guided Neural Module Networks »
Arjun Akula · Varun Jampani · Soravit Changpinyo · Song-Chun Zhu -
2021 Poster: Neural-PIL: Neural Pre-Integrated Lighting for Reflectance Decomposition »
Mark Boss · Varun Jampani · Raphael Braun · Ce Liu · Jonathan Barron · Hendrik PA Lensch -
2021 Poster: Non-local Latent Relation Distillation for Self-Adaptive 3D Human Pose Estimation »
Jogendra Nath Kundu · Siddharth Seth · Anirudh Jamkhandi · Pradyumna YM · Varun Jampani · Anirban Chakraborty · Venkatesh Babu R -
2021 Poster: Aligning Silhouette Topology for Self-Adaptive 3D Human Pose Recovery »
Ramesha Rakesh Mugaludi · Jogendra Nath Kundu · Varun Jampani · Venkatesh Babu R -
2021 Poster: NeRS: Neural Reflectance Surfaces for Sparse-view 3D Reconstruction in the Wild »
Jason Zhang · Gengshan Yang · Shubham Tulsiani · Deva Ramanan -
2020 Poster: Generative View Synthesis: From Single-view Semantics to Novel-view Images »
Tewodros Amberbir Habtegebrial · Varun Jampani · Orazio Gallo · Didier Stricker -
2019 Poster: Volumetric Correspondence Networks for Optical Flow »
Gengshan Yang · Deva Ramanan -
2019 Poster: Unsupervised learning of object structure and dynamics from videos »
Matthias Minderer · Chen Sun · Ruben Villegas · Forrester Cole · Kevin Murphy · Honglak Lee -
2017 Poster: Learning to Model the Tail »
Yu-Xiong Wang · Deva Ramanan · Martial Hebert -
2017 Poster: Attentional Pooling for Action Recognition »
Rohit Girdhar · Deva Ramanan