We study the design of deep architectures for lossy image compression. We present two architectural recipes in the context of multi-stage progressive encoders and empirically demonstrate their impact on compression performance. Specifically, we show that: 1) predicting the original image data from residuals in a multi-stage progressive architecture facilitates learning and leads to improved performance at approximating the original content, and 2) learning to inpaint (from neighboring image pixels) before performing compression reduces the amount of information that must be stored to achieve a high-quality approximation. Incorporating these design choices into a baseline progressive encoder yields an average reduction of over 60% in file size at similar quality compared to the original residual encoder.
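To make the first recipe concrete, below is a minimal PyTorch sketch of a multi-stage progressive coder in which each stage consumes the residual left by the previous stages but is supervised to predict the original image rather than the residual. The `Stage` and `ResidualToImageCoder` modules, the layer sizes, and the straight-through binarizer are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn


class Stage(nn.Module):
    """One progressive stage: encode the input to a small binarized code,
    then decode a full-resolution image estimate."""

    def __init__(self, ch=64, code_ch=8):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, ch, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(ch, code_ch, 4, stride=2, padding=1), nn.Tanh(),
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(code_ch, ch, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(ch, 3, 4, stride=2, padding=1),
        )

    def forward(self, x):
        z = self.enc(x)
        # Straight-through binarization: hard {-1, +1} codes in the
        # forward pass, identity gradient in the backward pass.
        code = z + (torch.sign(z) - z).detach()
        return self.dec(code)


class ResidualToImageCoder(nn.Module):
    """Stage s receives the residual x - x_hat_{s-1} as input, but its
    decoder predicts the original image x (not the residual)."""

    def __init__(self, num_stages=3):
        super().__init__()
        self.stages = nn.ModuleList([Stage() for _ in range(num_stages)])

    def forward(self, x):
        recons = []
        inp = x  # the first stage sees the image itself
        for stage in self.stages:
            x_hat = stage(inp)   # full-image prediction
            recons.append(x_hat)
            inp = x - x_hat      # the next stage sees what is still missing
        return recons


def progressive_loss(x, recons):
    # Every stage is pushed toward the original content, which is the
    # property the abstract credits with easier learning.
    return sum(torch.mean((x - r) ** 2) for r in recons)


# Usage: three stages, each adding 8 binarized channels at 1/4 resolution.
x = torch.rand(2, 3, 32, 32)
recons = ResidualToImageCoder()(x)
loss = progressive_loss(x, recons)
loss.backward()
```

The second recipe would, under the same reading of the abstract, prepend a module that predicts each region from already-decoded neighboring pixels, so that only the part the inpainter cannot predict needs to be encoded and stored.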
Author Information
Mohammad Haris Baig (Dartmouth College)
Vladlen Koltun (Intel Labs)
Lorenzo Torresani (Dartmouth/Facebook)
Lorenzo Torresani is an Associate Professor with tenure in the Computer Science Department at Dartmouth College and a Research Scientist at Facebook AI. He received a Laurea Degree in Computer Science with summa cum laude honors from the University of Milan (Italy) in 1996, and an M.S. and a Ph.D. in Computer Science from Stanford University in 2001 and 2005, respectively. In the past, he has worked at several industrial research labs including Microsoft Research Cambridge, Like.com and Digital Persona. His research interests are in computer vision and deep learning. He is the recipient of several awards, including a CVPR best student paper prize, a National Science Foundation CAREER Award, a Google Faculty Research Award, three Facebook Faculty Awards, and a Fulbright U.S. Scholar Award.
More from the Same Authors
- 2020 Workshop: Machine Learning for Autonomous Driving
  Rowan McAllister · Xinshuo Weng · Daniel Omeiza · Nick Rhinehart · Fisher Yu · German Ros · Vladlen Koltun
- 2020 Poster: Self-Supervised Learning by Cross-Modal Audio-Video Clustering
  Humam Alwassel · Dhruv Mahajan · Bruno Korbar · Lorenzo Torresani · Bernard Ghanem · Du Tran
- 2020 Poster: Multiscale Deep Equilibrium Models
  Shaojie Bai · Vladlen Koltun · J. Zico Kolter
- 2020 Poster: COBE: Contextualized Object Embeddings from Narrated Instructional Video
  Gedas Bertasius · Lorenzo Torresani
- 2020 Spotlight: Self-Supervised Learning by Cross-Modal Audio-Video Clustering
  Humam Alwassel · Dhruv Mahajan · Bruno Korbar · Lorenzo Torresani · Bernard Ghanem · Du Tran
- 2020 Oral: Multiscale Deep Equilibrium Models
  Shaojie Bai · Vladlen Koltun · J. Zico Kolter
- 2019: Vladlen Koltun (Intel)
  Vladlen Koltun
- 2019: Invited Talk
  Vladlen Koltun
- 2019 Poster: STAR-Caps: Capsule Networks with Straight-Through Attentive Routing
  Karim Ahmed · Lorenzo Torresani
- 2019 Poster: Learning Temporal Pose Estimation from Sparsely-Labeled Videos
  Gedas Bertasius · Christoph Feichtenhofer · Du Tran · Jianbo Shi · Lorenzo Torresani
- 2019 Poster: Differentiable Cloth Simulation for Inverse Problems
  Junbang Liang · Ming Lin · Vladlen Koltun
- 2019 Poster: Deep Equilibrium Models
  Shaojie Bai · J. Zico Kolter · Vladlen Koltun
- 2019 Spotlight: Deep Equilibrium Models
  Shaojie Bai · J. Zico Kolter · Vladlen Koltun
- 2018 Poster: Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization
  Bruno Korbar · Du Tran · Lorenzo Torresani
- 2016: ViCom: Benchmark and Methods for Video Comprehension
  Du Tran · Maksim Bolonkin · Manohar Paluri · Lorenzo Torresani
- 2016: Introduction
  Lorenzo Torresani
- 2016 Workshop: Large Scale Computer Vision Systems
  Manohar Paluri · Lorenzo Torresani · Gal Chechik · Dario Garcia · Du Tran