
Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks
Tianfan Xue · Jiajun Wu · Katherine Bouman · Bill Freeman

Wed Dec 07 02:00 AM -- 02:20 AM (PST) @ Area 1 + 2

We study the problem of synthesizing a number of likely future frames from a single input image. In contrast to traditional methods, which have tackled this problem in a deterministic or non-parametric way, we propose a novel approach which models future frames in a probabilistic manner. Our proposed method is therefore able to synthesize multiple possible next frames using the same model. Solving this challenging problem involves low- and high-level image and motion understanding for successful image synthesis. Here, we propose a novel network structure, namely a Cross Convolutional Network, that encodes images as feature maps and motion information as convolutional kernels to aid in synthesizing future frames. In experiments, our model performs well on synthetic data, such as 2D shapes and animated game sprites, as well as on real-world video data. We show that our model can also be applied to tasks such as visual analogy-making, and present analysis of the learned network representations.
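The core idea in the abstract, image content held in feature maps and motion held in convolutional kernels, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `cross_convolve`, the tensor shapes, the zero padding, and the hand-written shift kernels are all assumptions made for illustration. In the actual model the kernels would be produced by a learned network from a sampled latent code, which is what makes the prediction probabilistic.

```python
import numpy as np

def cross_convolve(feature_maps, kernels):
    """Convolve each feature map with its own kernel (zero "same" padding).

    feature_maps: array of shape (C, H, W), the image content.
    kernels:      array of shape (C, k, k) with k odd, the motion.
    Shapes and names are illustrative assumptions, not the paper's API.
    """
    C, H, W = feature_maps.shape
    k = kernels.shape[-1]
    pad = k // 2
    padded = np.pad(feature_maps, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((C, H, W), dtype=float)
    for c in range(C):
        # Flip the kernel so this is a true convolution, not a correlation.
        flipped = kernels[c, ::-1, ::-1]
        for i in range(H):
            for j in range(W):
                out[c, i, j] = np.sum(padded[c, i:i + k, j:j + k] * flipped)
    return out

# One feature map containing a single bright pixel.
features = np.zeros((1, 5, 5))
features[0, 2, 2] = 1.0

# Two hypothetical "motion" kernels: move right by one pixel, or down by one.
shift_right = np.zeros((1, 3, 3))
shift_right[0, 1, 2] = 1.0
shift_down = np.zeros((1, 3, 3))
shift_down[0, 2, 1] = 1.0

# The same features combined with different kernels give different futures.
future_a = cross_convolve(features, shift_right)
future_b = cross_convolve(features, shift_down)
```

Here the same input yields two different next frames depending only on which kernel is applied, which mirrors how sampling different motion kernels lets one model propose several plausible futures.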

Author Information

Tianfan Xue (MIT CSAIL)

Tianfan Xue is currently a fifth-year Ph.D. student at MIT CSAIL. Before that, he received his B.E. degree from Tsinghua University and his M.Phil. degree from The Chinese University of Hong Kong. His research interests include computer vision, image processing, and machine learning.

Jiajun Wu (MIT)

Jiajun Wu is a fifth-year Ph.D. student at the Massachusetts Institute of Technology, advised by Professor Bill Freeman and Professor Josh Tenenbaum. His research interests lie at the intersection of computer vision, machine learning, and computational cognitive science. Before coming to MIT, he received his B.Eng. from Tsinghua University, China, advised by Professor Zhuowen Tu. He has also spent time working at the research labs of Microsoft, Facebook, and Baidu.

Katherine Bouman (MIT)
Bill Freeman (MIT/Google)
