Missing data poses significant challenges when learning representations of video sequences. We present the Disentangled Imputed Video autoEncoder (DIVE), a deep generative model that imputes and predicts future video frames in the presence of missing data. Specifically, DIVE introduces a missingness latent variable, disentangles the hidden video representation of each object into static and dynamic appearance, pose, and missingness factors, and imputes each object's trajectory where data is missing. On a moving MNIST dataset with various missing-data scenarios, DIVE outperforms state-of-the-art baselines by a substantial margin. We also present comparisons on the real-world MOTSChallenge pedestrian dataset, which demonstrate the practical value of our method in a more realistic setting. Our code is available at https://github.com/Rose-STL-Lab/DIVE.
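To make the idea of a per-time-step missingness indicator concrete, here is a minimal sketch of trajectory imputation. It is not DIVE's model: the function name `impute_trajectory`, the 2D `Pose` type, and the constant-velocity extrapolation are all assumptions chosen for illustration, with the extrapolation standing in for the learned dynamics that the abstract describes.

```python
from typing import List, Tuple

Pose = Tuple[float, float]  # hypothetical 2D object position (x, y)


def impute_trajectory(observed: List[Pose], missing: List[bool]) -> List[Pose]:
    """Fill in poses at time steps flagged as missing.

    Illustrative assumption: constant-velocity extrapolation replaces
    the learned dynamics model of a real system.
    """
    if len(observed) != len(missing):
        raise ValueError("observed and missing must have the same length")

    imputed: List[Pose] = []
    for pose, is_missing in zip(observed, missing):
        if not is_missing:
            # Observation is trusted: keep it unchanged.
            imputed.append(pose)
        elif len(imputed) >= 2:
            # Extrapolate from the last two (possibly imputed) poses.
            (x1, y1), (x2, y2) = imputed[-2], imputed[-1]
            imputed.append((2 * x2 - x1, 2 * y2 - y1))
        elif len(imputed) == 1:
            # Only one past pose: assume the object has not moved.
            imputed.append(imputed[-1])
        else:
            # No history at all: fall back to a placeholder origin.
            imputed.append((0.0, 0.0))
    return imputed


# Values at missing steps are ignored, so any placeholder works there.
observed = [(0.0, 0.0), (1.0, 1.0), (9.0, 9.0), (9.0, 9.0), (4.0, 4.0)]
missing = [False, False, True, True, False]
result = impute_trajectory(observed, missing)
```

In this sketch the indicator is given as input; in the setting the abstract describes, missingness is itself a latent variable that the model has to infer.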
Armand Comas (Northeastern University)
Passionate about video neurosymbolic representation learning, object-oriented learning, relational inference, abstract reasoning, causality and dynamics. But I'll enjoy discussing any topic!
Chi Zhang (Northeastern University)
Zlatan Feric (Northeastern University)
Octavia Camps (Northeastern University)
Rose Yu (University of California, San Diego)