Object representations are the building blocks of higher-level concepts, and infants develop the notion of objects without supervision: the prediction error of future sensory input is likely their major teaching signal. Inspired by this, we propose a new framework that extracts object-centric representations from single 2D images by learning to predict future scenes containing moving objects. We treat objects as latent causes whose function, for an agent, is to facilitate efficient prediction of the coherent motion of their parts in visual input. Distinct from previous object-centric models, ours learns to explicitly infer objects' locations in a 3D environment in addition to segmenting objects. Furthermore, the network learns a latent code space in which objects with the same geometric shape and texture/color frequently group together. The model requires no supervision or pre-training of any part of the network. We created a new synthetic dataset with more complex textures on objects and background, and found that several previous models not based on predictive learning rely excessively on clustering colors and lose specificity in object segmentation. Our work demonstrates a new approach to learning symbolic representations grounded in sensation and action.
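The core learning signal described above can be sketched in a few lines. The snippet below is a minimal, purely illustrative NumPy sketch (not the paper's actual architecture): a placeholder encoder maps a frame to K latent object-slot codes, a placeholder decoder predicts the next frame, and the pixel-wise prediction error serves as the unsupervised teaching signal. All names, shapes, and the linear encoder/decoder are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, K, D = 8, 8, 3, 4          # frame size, number of object slots, code dim

# Stand-in linear encoder/decoder weights (the real model would be a deep net).
W_enc = rng.normal(size=(H * W, K * D)) * 0.1
W_dec = rng.normal(size=(K * D, H * W)) * 0.1

def encode(frame):
    """Map a frame to K latent object codes (placeholder linear encoder)."""
    return (frame.reshape(-1) @ W_enc).reshape(K, D)

def predict_next(slots):
    """Decode the object codes into a predicted next frame."""
    return (slots.reshape(-1) @ W_dec).reshape(H, W)

frame_t  = rng.normal(size=(H, W))   # current observation
frame_t1 = rng.normal(size=(H, W))   # actual next observation

slots = encode(frame_t)
pred  = predict_next(slots)

# The prediction error on the future frame is the only teaching signal;
# minimizing it (by gradient descent on the encoder/decoder) is what would
# shape the slot codes into object-centric representations.
prediction_error = float(np.mean((pred - frame_t1) ** 2))
print(prediction_error)
```

In the full framework this loss would be backpropagated through the encoder so that slots come to capture coherently moving parts; here the weights are fixed random matrices just to show the shape of the objective.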
Author Information
Tushar Arora (The University of Tokyo)
Li Erran Li (AWS AI, Amazon)
Li Erran Li is the head of machine learning at Scale and an adjunct professor at Columbia University. Previously, he was chief scientist at Pony.ai. Before that, he was with the perception team at Uber ATG and machine learning platform team at Uber where he worked on deep learning for autonomous driving, led the machine learning platform team technically, and drove strategy for company-wide artificial intelligence initiatives. He started his career at Bell Labs. Li’s current research interests are machine learning, computer vision, learning-based robotics, and their application to autonomous driving. He has a PhD from the computer science department at Cornell University. He’s an ACM Fellow and IEEE Fellow.
Mingbo Cai (University of Tokyo)
More from the Same Authors
- 2022 : The Impact of Symbolic Representations on In-context Learning for Few-shot Reasoning »
  Hanlin Zhang · Yifan Zhang · Li Erran Li · Eric Xing
- 2022 : Exploring Transformer Backbones for Heterogeneous Treatment Effect Estimation »
  Yifan Zhang · Hanlin Zhang · Zachary Lipton · Li Erran Li · Eric Xing
- 2021 : Learning to perceive objects by prediction »
  Tushar Arora · Li Erran Li · Mingbo Cai
- 2021 Poster: A Causal Lens for Controllable Text Generation »
  Zhiting Hu · Li Erran Li
- 2019 : Welcome »
  Rowan McAllister · Nicholas Rhinehart · Li Erran Li
- 2019 Workshop: Machine Learning for Autonomous Driving »
  Rowan McAllister · Nicholas Rhinehart · Fisher Yu · Li Erran Li · Anca Dragan
- 2019 Poster: Transfer Learning via Minimizing the Performance Gap Between Domains »
  Boyu Wang · Jorge Mendez · Mingbo Cai · Eric Eaton
- 2018 : Opening Remark »
  Li Erran Li · Anca Dragan
- 2018 Workshop: NIPS Workshop on Machine Learning for Intelligent Transportation Systems 2018 »
  Li Erran Li · Anca Dragan · Juan Carlos Niebles · Silvio Savarese
- 2017 Workshop: 2017 NIPS Workshop on Machine Learning for Intelligent Transportation Systems »
  Li Erran Li · Anca Dragan · Juan Carlos Niebles · Silvio Savarese
- 2017 Workshop: ML Systems Workshop @ NIPS 2017 »
  Aparna Lakshmiratan · Sarah Bird · Siddhartha Sen · Christopher Ré · Li Erran Li · Joseph Gonzalez · Daniel Crankshaw
- 2016 Workshop: Machine Learning Systems »
  Aparna Lakshmiratan · Li Erran Li · Siddhartha Sen · Sarah Bird · Hussein Mehanna
- 2016 Workshop: Machine Learning for Intelligent Transportation Systems »
  Li Erran Li · Trevor Darrell
- 2016 Poster: A Bayesian method for reducing bias in neural representational similarity analysis »
  Mingbo Cai · Nicolas W Schuck · Jonathan Pillow · Yael Niv