To help agents reason about scenes in terms of their building blocks, we wish to extract the compositional structure of any given scene (in particular, the configuration and characteristics of objects comprising the scene). This problem is especially difficult when scene structure needs to be inferred while also estimating the agent’s location/viewpoint, as the two variables jointly give rise to the agent’s observations. We present an unsupervised variational approach to this problem. Leveraging the shared structure that exists across different scenes, our model learns to infer two sets of latent representations from RGB video input alone: a set of "object" latents, corresponding to the time-invariant, object-level contents of the scene, as well as a set of "frame" latents, corresponding to global time-varying elements such as viewpoint. This factorization of latents allows our model, SIMONe, to represent object attributes in an allocentric manner which does not depend on viewpoint. Moreover, it allows us to disentangle object dynamics and summarize their trajectories as time-abstracted, view-invariant, per-object properties. We demonstrate these capabilities, as well as the model's performance in terms of view synthesis and instance segmentation, across three procedurally generated video datasets.
Author Information
Rishabh Kabra (DeepMind)
Daniel Zoran (DeepMind)
Goker Erdogan (DeepMind)
Loic Matthey (DeepMind)
Antonia Creswell (Imperial College London)
Matt Botvinick (Google DeepMind / University College London)
Alexander Lerchner (DeepMind)
Chris Burgess (Wayve)
More from the Same Authors
- 2021 : Alchemy: A benchmark and analysis toolkit for meta-reinforcement learning agents
  Jane Wang · Michael King · Nicolas Porcel · Zeb Kurth-Nelson · Tina Zhu · Charles Deck · Peter Choy · Mary Cassin · Malcolm Reynolds · Francis Song · Gavin Buttimore · David Reichert · Neil Rabinowitz · Loic Matthey · Demis Hassabis · Alexander Lerchner · Matt Botvinick
- 2021 Spotlight: Collaborating with Humans without Human Data
  DJ Strouse · Kevin McKee · Matt Botvinick · Edward Hughes · Richard Everett
- 2022 Poster: Fine-tuning language models to find agreement among humans with diverse preferences
  Michiel Bakker · Martin Chadwick · Hannah Sheahan · Michael Tessler · Lucy Campbell-Gillingham · Jan Balaguer · Nat McAleese · Amelia Glaese · John Aslanides · Matt Botvinick · Christopher Summerfield
- 2021 Poster: Collaborating with Humans without Human Data
  DJ Strouse · Kevin McKee · Matt Botvinick · Edward Hughes · Richard Everett
- 2021 Poster: Attention over Learned Object Embeddings Enables Complex Visual Reasoning
  David Ding · Felix Hill · Adam Santoro · Malcolm Reynolds · Matt Botvinick
- 2021 Poster: Unsupervised Object-Based Transition Models For 3D Partially Observable Environments
  Antonia Creswell · Rishabh Kabra · Chris Burgess · Murray Shanahan
- 2021 Oral: Attention over Learned Object Embeddings Enables Complex Visual Reasoning
  David Ding · Felix Hill · Adam Santoro · Malcolm Reynolds · Matt Botvinick
- 2020 : Panel discussion
  Pierre-Yves Oudeyer · Marc Bellemare · Peter Stone · Matt Botvinick · Susan Murphy · Anusha Nagabandi · Ashley Edwards · Karen Liu · Pieter Abbeel
- 2020 : Invited talk: Matt Botvinick "Alchemy: A Benchmark Task Distribution for Meta-Reinforcement Learning Research"
  Matt Botvinick
- 2019 Poster: Towards Interpretable Reinforcement Learning Using Attention Augmented Agents
  Alexander Mott · Daniel Zoran · Mike Chrzanowski · Daan Wierstra · Danilo Jimenez Rezende
- 2018 Poster: Life-Long Disentangled Representation Learning with Cross-Domain Latent Homologies
  Alessandro Achille · Tom Eccles · Loic Matthey · Chris Burgess · Nicholas Watters · Alexander Lerchner · Irina Higgins
- 2018 Poster: Learning to Share and Hide Intentions using Information Regularization
  DJ Strouse · Max Kleiman-Weiner · Josh Tenenbaum · Matt Botvinick · David Schwab
- 2018 Spotlight: Life-Long Disentangled Representation Learning with Cross-Domain Latent Homologies
  Alessandro Achille · Tom Eccles · Loic Matthey · Chris Burgess · Nicholas Watters · Alexander Lerchner · Irina Higgins
- 2017 : Panel Discussion
  Matt Botvinick · Emma Brunskill · Marcos Campos · Jan Peters · Doina Precup · David Silver · Josh Tenenbaum · Roy Fox
- 2017 : Applying variational information bottleneck in hierarchical domains (Matt Botvinick)
  Matt Botvinick
- 2017 : Poster session + Coffee break
  Mikael Kågebäck · Igor Melnyk · Amir-Hossein Karimi · Gino Brunner · Ershad Banijamali · Chris Donahue · Jake Zhao · Giambattista Parascandolo · Valentin Thomas · Abhishek Kumar · Chris Burgess · Amanda Nilsson · Maria Larsson · Cian Eastwood · Momchil Peychev
- 2017 : Meta-reinforcement learning in brains and machines
  Matt Botvinick
- 2017 Poster: Variational Memory Addressing in Generative Models
  Jörg Bornschein · Andriy Mnih · Daniel Zoran · Danilo Jimenez Rezende
- 2017 Poster: Visual Interaction Networks: Learning a Physics Simulator from Video
  Nicholas Watters · Daniel Zoran · Theophane Weber · Peter Battaglia · Razvan Pascanu · Andrea Tacchetti
- 2014 Poster: Shape and Illumination from Shading using the Generic Viewpoint Assumption
  Daniel Zoran · Dilip Krishnan · José Bento · Bill Freeman
- 2013 Poster: Learning the Local Statistics of Optical Flow
  Dan Rosenbaum · Daniel Zoran · Yair Weiss