The Emergence of Objectness: Learning Zero-shot Segmentation from Videos
Runtao Liu · Zhirong Wu · Stella Yu · Stephen Lin

Tue Dec 07 04:30 PM -- 06:00 PM (PST) @ Virtual

Humans can easily detect and segment moving objects simply by observing how they move, even without knowledge of object semantics. Inspired by this, we develop a zero-shot unsupervised approach for learning object segmentations. The model comprises two visual pathways: an appearance pathway that segments individual RGB images into coherent object regions, and a motion pathway that predicts the flow vector for each region between consecutive video frames. The two pathways jointly reconstruct a new representation called segment flow. This decoupled representation of appearance and motion is trained in a self-supervised manner to reconstruct one frame from another. When pretrained on an unlabeled video corpus, the model can be useful for a variety of applications, including 1) primary object segmentation from a single image in a zero-shot fashion; 2) moving object segmentation from a video with unsupervised test-time adaptation; 3) image semantic segmentation by supervised fine-tuning on a labeled image dataset. We demonstrate encouraging experimental results on all of these tasks using pretrained models.
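
To make the idea of segment flow more concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes soft segment masks of shape (K, H, W) from an appearance pathway and one constant (dx, dy) translation per segment from a motion pathway, and it composes them into a dense flow field that is then used to reconstruct one frame from another. The function names `segment_flow` and `warp_nearest`, the constant-translation motion model, and the nearest-neighbor warp are all assumptions made for illustration.

```python
import numpy as np

def segment_flow(masks, vectors):
    """Compose a dense flow field from per-segment motion.

    masks:   (K, H, W) soft segment masks, assumed to sum to 1 over K.
    vectors: (K, 2) one (dx, dy) translation per segment (an assumption).
    Returns a (H, W, 2) flow field: each pixel gets the mask-weighted
    average of the segment vectors.
    """
    return np.einsum("khw,kc->hwc", masks, vectors)

def warp_nearest(frame, flow):
    """Backward-warp a frame: out[y, x] = frame[y + dy, x + dx].

    Uses nearest-neighbor rounding and clamps coordinates to the image
    border (a simplification; a trainable model would need a
    differentiable sampler instead).
    """
    h, w = frame.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return frame[src_y, src_x]

# Toy example: two segments, left half static, right half shifted by one pixel in x.
masks = np.zeros((2, 4, 4))
masks[0, :, :2] = 1.0
masks[1, :, 2:] = 1.0
vectors = np.array([[0.0, 0.0], [1.0, 0.0]])

flow = segment_flow(masks, vectors)
frame = np.arange(16.0).reshape(4, 4)
recon = warp_nearest(frame, flow)
```

In a self-supervised setup of the kind the abstract describes, a reconstruction such as `recon` would be compared against the true target frame, and the resulting error would drive both the masks and the per-segment vectors.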

Author Information

Runtao Liu (Johns Hopkins University)
Zhirong Wu (Microsoft)
Stella Yu (UC Berkeley / ICSI)
Stephen Lin (Microsoft Research)

More from the Same Authors