Timezone: »

Joint-task Self-supervised Learning for Temporal Correspondence
Xueting Li · Sifei Liu · Shalini De Mello · Xiaolong Wang · Jan Kautz · Ming-Hsuan Yang

Wed Dec 10:45 AM -- 12:45 PM PST @ East Exhibition Hall B + C #65

This paper proposes to learn reliable dense correspondence from videos in a self-supervised manner. Our learning process integrates two highly related tasks: tracking large image regions and establishing fine-grained pixel-level associations between consecutive video frames. We exploit the synergy between both tasks through a shared inter-frame affinity matrix, which simultaneously models transitions between video frames at both the region- and pixel-levels. While region-level localization helps reduce ambiguities in fine-grained matching by narrowing down search regions; fine-grained matching provides bottom-up features to facilitate region-level localization. Our method outperforms the state-of-the-art self-supervised methods on a variety of visual correspondence tasks, including video-object and part-segmentation propagation, keypoint tracking, and object tracking. Our self-supervised method even surpasses the fully-supervised affinity feature representation obtained from a ResNet-18 pre-trained on the ImageNet.

Author Information

Xueting Li (University of California, Merced)
Sifei Liu (NVIDIA)
Shalini De Mello (NVIDIA)

Shalini De Mello is a Senior Research Scientist at NVIDIA Research since March 2013. Her research interests are in computer vision and machine learning for human-computer interaction and smart interfaces. Her work includes NVIDIA’s shipping products for hand gesture recognition, face detection, video stabilization and GPU-optimized libraries for the development for computer vision applications on mobile platforms. She received doctoral and master’s degrees in Electrical and Computer Engineering from the University of Texas at Austin in 2008 and 2004, respectively. Outside of work, she likes to cook, travel, read and hikes with her dog.

Xiaolong Wang (CMU)
Jan Kautz (NVIDIA)
Ming-Hsuan Yang (Google / UC Merced)

More from the Same Authors