Visual Pre-training for Navigation: What Can We Learn from Noise?
Felix Yanwei Wang · Ching-Yun Ko · Pulkit Agrawal

Fri Dec 02 07:28 AM -- 07:30 AM (PST)
Event URL: https://openreview.net/forum?id=RJhC_uIwdsj

In visual navigation, one powerful paradigm is to predict actions directly from observations. Training such an end-to-end system allows representations that are useful for downstream tasks to emerge automatically. However, the lack of inductive bias makes this system data-hungry. We hypothesize that a sufficient representation of the current view and the goal view for a navigation policy can be learned by predicting the location and size of a crop of the current view that corresponds to the goal. We further show that training such random crop prediction in a self-supervised fashion purely on synthetic noise images transfers well to natural home images. The learned representation can then be bootstrapped to learn a navigation policy efficiently with little interaction data. Video page: https://sites.google.com/view/pretrain-noise
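
The random crop prediction task described in the abstract can be illustrated with a minimal data-generation sketch. This is not the authors' implementation: the function name `sample_crop_example`, the use of uniform noise, the square crop, and the normalized (center x, center y, width, height) target format are all assumptions made for illustration, since the abstract does not specify the noise type or the target encoding.

```python
import numpy as np


def sample_crop_example(rng, image_size=64, min_crop=16, max_crop=48):
    """Return (image, crop, target) for one self-supervised training example.

    Hypothetical helper: names, sizes, and target layout are illustrative.
    """
    # Synthetic "current view": uniform random noise in [0, 1) (assumed noise type).
    image = rng.random((image_size, image_size, 3), dtype=np.float32)

    # Random square crop standing in for the "goal view" (square is an assumption).
    size = int(rng.integers(min_crop, max_crop + 1))
    top = int(rng.integers(0, image_size - size + 1))
    left = int(rng.integers(0, image_size - size + 1))
    crop = image[top:top + size, left:left + size]

    # Regression target: crop location and size, normalized by the image size.
    target = np.array(
        [
            (left + size / 2) / image_size,  # center x
            (top + size / 2) / image_size,   # center y
            size / image_size,               # width
            size / image_size,               # height
        ],
        dtype=np.float32,
    )
    return image, crop, target


rng = np.random.default_rng(0)
image, crop, target = sample_crop_example(rng)
```

A network trained on such triples would take `image` and `crop` as inputs and regress `target`. No labels beyond the sampled crop coordinates are needed, which is what makes the task self-supervised.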

Author Information

Felix Yanwei Wang (MIT CSAIL)
Ching-Yun Ko (MIT)
Pulkit Agrawal (MIT)

More from the Same Authors