Timezone: »
This paper proposes to learn reliable dense correspondence from videos in a self-supervised manner. Our learning process integrates two highly related tasks: tracking large image regions and establishing fine-grained pixel-level associations between consecutive video frames. We exploit the synergy between both tasks through a shared inter-frame affinity matrix, which simultaneously models transitions between video frames at both the region- and pixel-levels. While region-level localization helps reduce ambiguities in fine-grained matching by narrowing down search regions; fine-grained matching provides bottom-up features to facilitate region-level localization. Our method outperforms the state-of-the-art self-supervised methods on a variety of visual correspondence tasks, including video-object and part-segmentation propagation, keypoint tracking, and object tracking. Our self-supervised method even surpasses the fully-supervised affinity feature representation obtained from a ResNet-18 pre-trained on the ImageNet.
Author Information
Xueting Li (University of California, Merced)
Sifei Liu (NVIDIA)
Shalini De Mello (NVIDIA)
Shalini De Mello is a Senior Research Scientist at NVIDIA Research since March 2013. Her research interests are in computer vision and machine learning for human-computer interaction and smart interfaces. Her work includes NVIDIA’s shipping products for hand gesture recognition, face detection, video stabilization and GPU-optimized libraries for the development for computer vision applications on mobile platforms. She received doctoral and master’s degrees in Electrical and Computer Engineering from the University of Texas at Austin in 2008 and 2004, respectively. Outside of work, she likes to cook, travel, read and hikes with her dog.
Xiaolong Wang (CMU)
Jan Kautz (NVIDIA)
Ming-Hsuan Yang (Google / UC Merced)
More from the Same Authors
-
2021 Spotlight: Intriguing Properties of Vision Transformers »
Muhammad Muzammal Naseer · Kanchana Ranasinghe · Salman H Khan · Munawar Hayat · Fahad Shahbaz Khan · Ming-Hsuan Yang -
2021 : Physics Informed RNN-DCT Networks for Time-Dependent Partial Differential Equations »
Benjamin Wu · Oliver Hennigh · Jan Kautz · Sanjay Choudhry · Wonmin Byeon -
2021 Poster: Intriguing Properties of Vision Transformers »
Muhammad Muzammal Naseer · Kanchana Ranasinghe · Salman H Khan · Munawar Hayat · Fahad Shahbaz Khan · Ming-Hsuan Yang -
2021 Poster: Learning 3D Dense Correspondence via Canonical Point Autoencoder »
An-Chieh Cheng · Xueting Li · Min Sun · Ming-Hsuan Yang · Sifei Liu -
2021 Poster: A Contrastive Learning Approach for Training Variational Autoencoder Priors »
Jyoti Aneja · Alex Schwing · Jan Kautz · Arash Vahdat -
2021 Poster: Exploring Cross-Video and Cross-Modality Signals for Weakly-Supervised Audio-Visual Video Parsing »
Yan-Bo Lin · Hung-Yu Tseng · Hsin-Ying Lee · Yen-Yu Lin · Ming-Hsuan Yang -
2021 Poster: Score-based Generative Modeling in Latent Space »
Arash Vahdat · Karsten Kreis · Jan Kautz -
2021 Poster: Coupled Segmentation and Edge Learning via Dynamic Graph Propagation »
Zhiding Yu · Rui Huang · Wonmin Byeon · Sifei Liu · Guilin Liu · Thomas Breuel · Anima Anandkumar · Jan Kautz -
2021 Poster: End-to-end Multi-modal Video Temporal Grounding »
Yi-Wen Chen · Yi-Hsuan Tsai · Ming-Hsuan Yang -
2020 Poster: NVAE: A Deep Hierarchical Variational Autoencoder »
Arash Vahdat · Jan Kautz -
2020 Spotlight: NVAE: A Deep Hierarchical Variational Autoencoder »
Arash Vahdat · Jan Kautz -
2020 Poster: Online Adaptation for Consistent Mesh Reconstruction in the Wild »
Xueting Li · Sifei Liu · Shalini De Mello · Kihwan Kim · Xiaolong Wang · Ming-Hsuan Yang · Jan Kautz -
2020 Poster: Convolutional Tensor-Train LSTM for Spatio-Temporal Learning »
Jiahao Su · Wonmin Byeon · Jean Kossaifi · Furong Huang · Jan Kautz · Anima Anandkumar -
2020 Poster: Self-Learning Transformations for Improving Gaze and Head Redirection »
Yufeng Zheng · Seonwook Park · Xucong Zhang · Shalini De Mello · Otmar Hilliges -
2019 Poster: Quadratic Video Interpolation »
Xiangyu Xu · Li Siyao · Wenxiu Sun · Qian Yin · Ming-Hsuan Yang -
2019 Spotlight: Quadratic Video Interpolation »
Xiangyu Xu · Li Siyao · Wenxiu Sun · Qian Yin · Ming-Hsuan Yang -
2019 Poster: Few-shot Video-to-Video Synthesis »
Ting-Chun Wang · Ming-Yu Liu · Andrew Tao · Guilin Liu · Bryan Catanzaro · Jan Kautz -
2019 Poster: Dancing to Music »
Hsin-Ying Lee · Xiaodong Yang · Ming-Yu Liu · Ting-Chun Wang · Yu-Ding Lu · Ming-Hsuan Yang · Jan Kautz -
2018 : Jan Kautz »
Jan Kautz -
2018 Poster: Deep Non-Blind Deconvolution via Generalized Low-Rank Approximation »
Wenqi Ren · Jiawei Zhang · Lin Ma · Jinshan Pan · Xiaochun Cao · Wangmeng Zuo · Wei Liu · Ming-Hsuan Yang -
2018 Poster: Context-aware Synthesis and Placement of Object Instances »
Donghoon Lee · Sifei Liu · Jinwei Gu · Ming-Yu Liu · Ming-Hsuan Yang · Jan Kautz -
2018 Poster: Video-to-Video Synthesis »
Ting-Chun Wang · Ming-Yu Liu · Jun-Yan Zhu · Guilin Liu · Andrew Tao · Jan Kautz · Bryan Catanzaro -
2018 Poster: Deep Attentive Tracking via Reciprocative Learning »
Shi Pu · YIBING SONG · Chao Ma · Honggang Zhang · Ming-Hsuan Yang -
2017 : Poster Session (encompasses coffee break) »
Beidi Chen · Borja Balle · Daniel Lee · iuri frosio · Jitendra Malik · Jan Kautz · Ke Li · Masashi Sugiyama · Miguel A. Carreira-Perpinan · Ramin Raziperchikolaei · Theja Tulabandhula · Yung-Kyun Noh · Adams Wei Yu -
2017 Poster: Unsupervised Image-to-Image Translation Networks »
Ming-Yu Liu · Thomas Breuel · Jan Kautz -
2017 Spotlight: Unsupervised Image-to-Image Translation Networks »
Ming-Yu Liu · Thomas Breuel · Jan Kautz -
2017 Poster: Learning Affinity via Spatial Propagation Networks »
Sifei Liu · Shalini De Mello · Jinwei Gu · Guangyu Zhong · Ming-Hsuan Yang · Jan Kautz -
2017 Poster: Semi-Supervised Learning for Optical Flow with Generative Adversarial Networks »
Wei-Sheng Lai · Jia-Bin Huang · Ming-Hsuan Yang -
2017 Poster: Universal Style Transfer via Feature Transforms »
Yijun Li · Chen Fang · Jimei Yang · Zhaowen Wang · Xin Lu · Ming-Hsuan Yang -
2015 Poster: Weakly-supervised Disentangling with Recurrent Transformations for 3D View Synthesis »
Jimei Yang · Scott E Reed · Ming-Hsuan Yang · Honglak Lee