We explore a feed-forward approach for decomposing a video into layers, where each layer contains an object of interest along with its associated shadows, reflections, and other visual effects. This problem is challenging since associated effects vary widely with the 3D geometry and lighting conditions in the scene, and ground-truth labels for visual effects are difficult (and in some cases impractical) to collect. We take a self-supervised approach and train a neural network to produce a foreground image and alpha matte from a rough object segmentation mask under a reconstruction and sparsity loss. Under reconstruction loss, the layer decomposition problem is underdetermined: many combinations of layers may reconstruct the input video. Inspired by the game theory concept of focal points (or Schelling points), we pose the problem as a coordination game, where each player (network) predicts the effects for a single object without knowledge of the other players' choices. The players learn to converge on the "natural" layer decomposition in order to maximize the likelihood of their choices aligning with the other players'. We train the network to play this game with itself, and show how to design the rules of this game so that the focal point lies at the correct layer decomposition. We demonstrate feed-forward results on a challenging synthetic dataset, then show that pretraining on this dataset significantly reduces optimization time for real videos.
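To make the underdetermination claim concrete, the toy sketch below builds two different layer decompositions of the same tiny frame and checks that both reconstruct it exactly. It is an illustration only, not the authors' implementation: the standard back-to-front "over" compositing, the L1 reconstruction loss, and the names `composite`, `reconstruction_loss`, and `make_layer` are all assumptions chosen for the example.

```python
import numpy as np

def composite(background, layers):
    """Back-to-front 'over' compositing of (rgb, alpha) layers (assumed model)."""
    out = background.astype(np.float64)
    for rgb, alpha in layers:
        out = alpha * rgb + (1.0 - alpha) * out
    return out

def reconstruction_loss(frame, background, layers):
    """Mean absolute error between the frame and the composited layers."""
    return float(np.abs(frame - composite(background, layers)).mean())

def make_layer(pixels, width=4):
    """Hypothetical helper: an opaque layer covering the given {index: value} pixels."""
    rgb = np.zeros((1, width, 1))
    alpha = np.zeros((1, width, 1))
    for i, value in pixels.items():
        rgb[0, i, 0] = value
        alpha[0, i, 0] = 1.0
    return rgb, alpha

# Toy 1x4 grayscale frame on a uniform background.
background = np.full((1, 4, 1), 0.8)
frame = background.copy()
frame[0, 1] = 0.2  # object A
frame[0, 2] = 0.4  # shadow cast by object A
frame[0, 3] = 0.6  # object B

# Decomposition 1: the shadow is grouped with object A (the intended grouping).
decomp_a = [make_layer({1: 0.2, 2: 0.4}), make_layer({3: 0.6})]
# Decomposition 2: the shadow is (wrongly) grouped with object B.
decomp_b = [make_layer({1: 0.2}), make_layer({2: 0.4, 3: 0.6})]

loss_a = reconstruction_loss(frame, background, decomp_a)
loss_b = reconstruction_loss(frame, background, decomp_b)
print(loss_a, loss_b)
```

Both decompositions reproduce the frame, so a reconstruction loss alone cannot tell which layer the shadow belongs to. In this toy case the two decompositions also have the same total alpha coverage, so a simple sparsity penalty on alpha would not separate them either.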
Author Information
Erika Lu (Google)
Forrester Cole (Google Research)
Weidi Xie (University of Oxford)
Tali Dekel (Weizmann Institute of Science)
Bill Freeman (MIT/Google)
Andrew Zisserman (DeepMind & University of Oxford)
Michael Rubinstein (Google)
More from the Same Authors
-
2021 : PASS: An ImageNet replacement for self-supervised pretraining without humans »
Yuki Asano · Christian Rupprecht · Andrew Zisserman · Andrea Vedaldi -
2021 Spotlight: Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering »
Vincent Sitzmann · Semon Rezchikov · Bill Freeman · Josh Tenenbaum · Fredo Durand -
2021 Spotlight: ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction »
Gengshan Yang · Deqing Sun · Varun Jampani · Daniel Vlasic · Forrester Cole · Ce Liu · Deva Ramanan -
2021 : Finding Maximally Informative Patches in Images »
Howard Zhong · Guha Balakrishnan · Richard Bowen · Ramin Zabih · Bill Freeman -
2021 : 3D Spinal Column Segmentation with Single Plane 2D-Projected Annotations »
Rhydian Windsor · Amir Jamaludin · Timor Kadir · Andrew Zisserman -
2022 Poster: Segmenting Moving Objects via an Object-Centric Layered Representation »
Junyu Xie · Weidi Xie · Andrew Zisserman -
2022 Poster: D^2NeRF: Self-Supervised Decoupling of Dynamic and Static Objects from a Monocular Video »
Tianhao Wu · Fangcheng Zhong · Andrea Tagliasacchi · Forrester Cole · Cengiz Oztireli -
2023 Poster: ARTIC3D: Learning Robust Articulated 3D Shapes from Noisy Web Image Collections »
Chun-Han Yao · Amit Raj · Wei-Chih Hung · Michael Rubinstein · Yuanzhen Li · Ming-Hsuan Yang · Varun Jampani -
2023 Poster: Diffusion with Forward Models: Solving Stochastic Inverse Problems Without Direct Supervision »
Ayush Tewari · Tianwei Yin · George Cazenavette · Semon Rezchikov · Josh Tenenbaum · Fredo Durand · Bill Freeman · Vincent Sitzmann -
2023 Poster: Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast Contrastive Fusion »
Yash Bhalgat · Iro Laina · João Henriques · Andrea Vedaldi · Andrew Zisserman -
2023 Poster: SceneScape: Text-Driven Consistent Scene Generation »
Rafail Fridman · Amit Abecasis · Yoni Kasten · Tali Dekel -
2023 Poster: StyleDrop: Text-to-Image Synthesis of Any Style »
Kihyuk Sohn · Lu Jiang · Jarred Barber · Kimin Lee · Nataniel Ruiz · Dilip Krishnan · Huiwen Chang · Yuanzhen Li · Irfan Essa · Michael Rubinstein · Yuan Hao · Glenn Entis · Irina Blok · Daniel Castro Chin -
2023 Poster: Learning New Dimensions of Human Visual Similarity using Synthetic Data »
Stephanie Fu · Netanel Tamir · Shobhita Sundaram · Lucy Chai · Richard Zhang · Tali Dekel · Phillip Isola -
2023 Poster: Self-supervised Object-centric Learning for Videos »
Görkay Aydemir · Weidi Xie · Fatma Guney -
2023 Poster: Improving Category Discovery When No Representation Rules Them All »
Sagar Vaze · Andrea Vedaldi · Andrew Zisserman -
2023 Poster: Perception Test: A Diagnostic Benchmark for Multimodal Video Models »
Viorica Patraucean · Lucas Smaira · Ankush Gupta · Adria Recasens · Larisa Markeeva · Dylan Banarse · Skanda Koppula · joseph heyward · Mateusz Malinowski · Yi Yang · Carl Doersch · Tatiana Matejovicova · Yury Sulsky · Antoine Miech · Alexandre Fréchette · Hanna Klimczak · Raphael Koster · Junlin Zhang · Stephanie Winkler · Yusuf Aytar · Simon Osindero · Dima Damen · Andrew Zisserman · Joao Carreira -
2022 Spotlight: Lightning Talks 6A-3 »
Junyu Xie · Chengliang Zhong · Ali Ayub · Sravanti Addepalli · Harsh Rangwani · Jiapeng Tang · Yuchen Rao · Zhiying Jiang · Yuqi Wang · Xingzhe He · Gene Chou · Ilya Chugunov · Samyak Jain · Yuntao Chen · Weidi Xie · Sumukh K Aithal · Carter Fendley · Lev Markhasin · Yiqin Dai · Peixing You · Bastian Wandt · Yinyu Nie · Helge Rhodin · Felix Heide · Ji Xin · Angela Dai · Andrew Zisserman · Bi Wang · Xiaoxue Chen · Mayank Mishra · ZHAO-XIANG ZHANG · Venkatesh Babu R · Justus Thies · Ming Li · Hao Zhao · Venkatesh Babu R · Jimmy Lin · Fuchun Sun · Matthias Niessner · Guyue Zhou · Xiaodong Mu · Chuang Gan · Wenbing Huang -
2022 Spotlight: Segmenting Moving Objects via an Object-Centric Layered Representation »
Junyu Xie · Weidi Xie · Andrew Zisserman -
2022 Poster: LASSIE: Learning Articulated Shapes from Sparse Image Ensemble via 3D Part Discovery »
Chun-Han Yao · Wei-Chih Hung · Yuanzhen Li · Michael Rubinstein · Ming-Hsuan Yang · Varun Jampani -
2022 Poster: ReCo: Retrieve and Co-segment for Zero-shot Transfer »
Gyungin Shin · Weidi Xie · Samuel Albanie -
2022 Poster: Flamingo: a Visual Language Model for Few-Shot Learning »
Jean-Baptiste Alayrac · Jeff Donahue · Pauline Luc · Antoine Miech · Iain Barr · Yana Hasson · Karel Lenc · Arthur Mensch · Katherine Millican · Malcolm Reynolds · Roman Ring · Eliza Rutherford · Serkan Cabi · Tengda Han · Zhitao Gong · Sina Samangooei · Marianne Monteiro · Jacob L Menick · Sebastian Borgeaud · Andy Brock · Aida Nematzadeh · Sahand Sharifzadeh · Mikołaj Bińkowski · Ricardo Barreira · Oriol Vinyals · Andrew Zisserman · Karén Simonyan -
2022 Poster: TAP-Vid: A Benchmark for Tracking Any Point in a Video »
Carl Doersch · Ankush Gupta · Larisa Markeeva · Adria Recasens · Lucas Smaira · Yusuf Aytar · Joao Carreira · Andrew Zisserman · Yi Yang -
2021 Poster: Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering »
Vincent Sitzmann · Semon Rezchikov · Bill Freeman · Josh Tenenbaum · Fredo Durand -
2021 Poster: ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction »
Gengshan Yang · Deqing Sun · Varun Jampani · Daniel Vlasic · Forrester Cole · Ce Liu · Deva Ramanan -
2020 Poster: CrossTransformers: spatially-aware few-shot transfer »
Carl Doersch · Ankush Gupta · Andrew Zisserman -
2020 Poster: Self-supervised Co-Training for Video Representation Learning »
Tengda Han · Weidi Xie · Andrew Zisserman -
2020 Poster: Self-Supervised MultiModal Versatile Networks »
Jean-Baptiste Alayrac · Adria Recasens · Rosalia Schneider · Relja Arandjelović · Jason Ramapuram · Jeffrey De Fauw · Lucas Smaira · Sander Dieleman · Andrew Zisserman -
2020 Poster: Multi-Plane Program Induction with 3D Box Priors »
Yikai Li · Jiayuan Mao · Xiuming Zhang · Bill Freeman · Josh Tenenbaum · Noah Snavely · Jiajun Wu -
2020 Demonstration: MosAIc: Finding Artistic Connections across Culture with Conditional Image Retrieval »
Mark Hamilton · Stephanie Fu · Mindren Lu · Johnny Bui · Margaret Wang · Felix Tran · Marina Rogers · Darius Bopp · Christopher Hoder · Lei Zhang · Bill Freeman -
2019 : Feathers, wings and the future of computer vision research »
Bill Freeman -
2019 Poster: Computational Mirrors: Blind Inverse Light Transport by Deep Matrix Factorization »
Miika Aittala · Prafull Sharma · Lukas Murmann · Adam Yedidia · Gregory Wornell · Bill Freeman · Fredo Durand -
2019 Poster: Unsupervised Learning of Object Keypoints for Perception and Control »
Tejas Kulkarni · Ankush Gupta · Catalin Ionescu · Sebastian Borgeaud · Malcolm Reynolds · Andrew Zisserman · Volodymyr Mnih -
2019 Poster: Sim2real transfer learning for 3D human pose estimation: motion to the rescue »
Carl Doersch · Andrew Zisserman -
2019 Poster: Unsupervised learning of object structure and dynamics from videos »
Matthias Minderer · Chen Sun · Ruben Villegas · Forrester Cole · Kevin Murphy · Honglak Lee -
2018 Poster: Learning to Reconstruct Shapes from Unseen Classes »
Xiuming Zhang · Zhoutong Zhang · Chengkai Zhang · Josh Tenenbaum · Bill Freeman · Jiajun Wu -
2018 Oral: Learning to Reconstruct Shapes from Unseen Classes »
Xiuming Zhang · Zhoutong Zhang · Chengkai Zhang · Josh Tenenbaum · Bill Freeman · Jiajun Wu -
2018 Poster: Visual Object Networks: Image Generation with Disentangled 3D Representations »
Jun-Yan Zhu · Zhoutong Zhang · Chengkai Zhang · Jiajun Wu · Antonio Torralba · Josh Tenenbaum · Bill Freeman -
2018 Poster: Learning to Navigate in Cities Without a Map »
Piotr Mirowski · Matt Grimes · Mateusz Malinowski · Karl Moritz Hermann · Keith Anderson · Denis Teplyashin · Karen Simonyan · koray kavukcuoglu · Andrew Zisserman · Raia Hadsell -
2018 Poster: Learning to Exploit Stability for 3D Scene Parsing »
Yilun Du · Zhijian Liu · Hector Basevi · Ales Leonardis · Bill Freeman · Josh Tenenbaum · Jiajun Wu -
2018 Poster: 3D-Aware Scene Manipulation via Inverse Graphics »
Shunyu Yao · Tzu Ming Hsu · Jun-Yan Zhu · Jiajun Wu · Antonio Torralba · Bill Freeman · Josh Tenenbaum -
2018 Poster: Co-regularized Alignment for Unsupervised Domain Adaptation »
Abhishek Kumar · Prasanna Sattigeri · Kahini Wadhawan · Leonid Karlinsky · Rogerio Feris · Bill Freeman · Gregory Wornell -
2017 : Sight and sound »
Bill Freeman -
2017 Spotlight: Shape and Material from Sound »
Zhoutong Zhang · Qiujia Li · Zhengjia Huang · Jiajun Wu · Josh Tenenbaum · Bill Freeman -
2017 Spotlight: Scene Physics Acquisition via Visual De-animation »
Jiajun Wu · Erika Lu · Pushmeet Kohli · Bill Freeman · Josh Tenenbaum -
2017 Poster: Learning to See Physics via Visual De-animation »
Jiajun Wu · Erika Lu · Pushmeet Kohli · Bill Freeman · Josh Tenenbaum -
2017 Poster: Shape and Material from Sound »
Zhoutong Zhang · Qiujia Li · Zhengjia Huang · Jiajun Wu · Josh Tenenbaum · Bill Freeman -
2017 Poster: MarrNet: 3D Shape Reconstruction via 2.5D Sketches »
Jiajun Wu · Yifan Wang · Tianfan Xue · Xingyuan Sun · Bill Freeman · Josh Tenenbaum -
2016 : Bill Freeman »
Bill Freeman -
2016 Poster: Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling »
Jiajun Wu · Chengkai Zhang · Tianfan Xue · Bill Freeman · Josh Tenenbaum -
2016 Poster: Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks »
Tianfan Xue · Jiajun Wu · Katherine Bouman · Bill Freeman -
2016 Oral: Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks »
Tianfan Xue · Jiajun Wu · Katherine Bouman · Bill Freeman -
2015 Poster: Spatial Transformer Networks »
Max Jaderberg · Karen Simonyan · Andrew Zisserman · koray kavukcuoglu -
2015 Spotlight: Spatial Transformer Networks »
Max Jaderberg · Karen Simonyan · Andrew Zisserman · koray kavukcuoglu -
2015 Poster: Galileo: Perceiving Physical Object Properties by Integrating a Physics Engine with Deep Learning »
Jiajun Wu · Ilker Yildirim · Joseph Lim · Bill Freeman · Josh Tenenbaum -
2014 Poster: Shape and Illumination from Shading using the Generic Viewpoint Assumption »
Daniel Zoran · Dilip Krishnan · José Bento · Bill Freeman -
2014 Poster: Two-Stream Convolutional Networks for Action Recognition in Videos »
Karen Simonyan · Andrew Zisserman -
2014 Spotlight: Two-Stream Convolutional Networks for Action Recognition in Videos »
Karen Simonyan · Andrew Zisserman -
2013 Poster: Deep Fisher Networks for Large-Scale Image Classification »
Karen Simonyan · Andrea Vedaldi · Andrew Zisserman -
2013 Spotlight: Deep Fisher Networks for Large-Scale Image Classification »
Karen Simonyan · Andrea Vedaldi · Andrew Zisserman -
2011 Poster: Pylon Model for Semantic Segmentation »
Victor Lempitsky · Andrea Vedaldi · Andrew Zisserman -
2010 Workshop: Machine Learning meets Computational Photography »
Stefan Harmeling · Michael Hirsch · Bill Freeman · Peyman Milanfar -
2010 Poster: Simultaneous Object Detection and Ranking with Weak Supervision »
Matthew B Blaschko · Andrea Vedaldi · Andrew Zisserman -
2010 Spotlight: Learning To Count Objects in Images »
Victor Lempitsky · Andrew Zisserman -
2010 Poster: Learning To Count Objects in Images »
Victor Lempitsky · Andrew Zisserman -
2009 Poster: Segmenting Scenes by Matching Image Composites »
Bryan C Russell · Alexei A Efros · Josef Sivic · Bill Freeman · Andrew Zisserman -
2009 Poster: Structured output regression for detection with partial truncation »
Andrea Vedaldi · Andrew Zisserman -
2009 Poster: Nonparametric Bayesian Texture Learning and Synthesis »
Leo Zhu · Yuanhao Chen · Bill Freeman · Antonio Torralba -
2008 Mini Symposium: Computational Photography »
Bill Freeman · Bernhard Schölkopf -
2008 Poster: SDL: Supervised Dictionary Learning »
Julien Mairal · Francis Bach · Jean A Ponce · Guillermo Sapiro · Andrew Zisserman -
2007 Spotlight: Learning Visual Attributes »
Vittorio Ferrari · Andrew Zisserman -
2007 Poster: Learning Visual Attributes »
Vittorio Ferrari · Andrew Zisserman -
2006 Poster: Bayesian Image Super-resolution, Continued »
Lyndsey C Pickup · David Capel · Stephen J Roberts · Andrew Zisserman -
2006 Spotlight: Bayesian Image Super-resolution, Continued »
Lyndsey C Pickup · David Capel · Stephen J Roberts · Andrew Zisserman