Palette: Image-to-Image Diffusion Models
Chitwan Saharia · William Chan · Huiwen Chang · Chris Lee · Jonathan Ho · Tim Salimans · David Fleet · Mohammad Norouzi
Event URL: https://openreview.net/forum?id=c7NBMfDXbW
We introduce Palette, a simple and general framework for image-to-image translation using conditional diffusion models. Palette models trained on four challenging image-to-image translation tasks (colorization, inpainting, uncropping, and JPEG restoration) outperform strong GAN and regression baselines and bridge the gap with natural images in terms of sample quality scores. This is accomplished without task-specific hyper-parameter tuning, architecture customization, or any auxiliary loss, demonstrating a desirable degree of generality and flexibility. We uncover the impact of an $L_2$ vs. $L_1$ loss in the denoising diffusion objective on sample diversity, and demonstrate the importance of self-attention through empirical architecture studies. Importantly, we advocate a unified evaluation protocol based on ImageNet, with human evaluation and sample quality scores (FID, Inception Score, Classification Accuracy of a pre-trained ResNet-50, and Perceptual Distance against original images). We expect this standardized evaluation protocol to play a critical role in advancing image-to-image translation research. Finally, we show that a generalist, multi-task Palette model performs as well or better than task-specific specialist counterparts. Check out https://bit.ly/palette-diffusion for more details.
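The $L_2$ vs. $L_1$ comparison mentioned in the abstract enters through the norm applied to the noise-prediction residual in the denoising diffusion objective. Below is a minimal PyTorch sketch of one training step of a conditional denoising objective of this kind; the `denoiser` callable, the linear noise schedule, and all names are illustrative assumptions for exposition, not the authors' released code or exact setup.

```python
import torch
import torch.nn.functional as F

def diffusion_loss(denoiser, x_source, y_target, T=1000, p=1):
    """One training step of a conditional denoising objective (illustrative sketch).

    denoiser(x_source, y_noisy, gamma) is assumed to predict the noise that was
    added to y_target, conditioned on the source image x_source and the noise
    level gamma. p=1 gives the L1 variant, p=2 the L2 variant discussed above.
    """
    b = y_target.shape[0]
    # Linear noise schedule; the paper's actual schedule may differ.
    betas = torch.linspace(1e-4, 0.02, T)
    alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)
    # Sample a random timestep per example and look up its cumulative noise level.
    t = torch.randint(0, T, (b,))
    gamma = alphas_cumprod[t].view(b, 1, 1, 1)
    # Corrupt the target with Gaussian noise at that level.
    noise = torch.randn_like(y_target)
    y_noisy = gamma.sqrt() * y_target + (1.0 - gamma).sqrt() * noise
    # Predict the noise and penalize the residual with the chosen norm.
    pred = denoiser(x_source, y_noisy, gamma.view(b))
    return F.l1_loss(pred, noise) if p == 1 else F.mse_loss(pred, noise)
```

In this framing, switching between the two losses studied in the paper is a one-argument change (`p=1` vs. `p=2`), which is what makes their effect on sample diversity cleanly comparable.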
Author Information
Chitwan Saharia (Google)
William Chan (Carnegie Mellon University)
Huiwen Chang (Google Research)
Chris Lee (Carnegie Mellon University)
Jonathan Ho (Google Brain)
Tim Salimans (Google Brain Amsterdam)
David Fleet (University of Toronto)
Mohammad Norouzi (Google Brain)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 : Palette: Image-to-Image Diffusion Models
More from the Same Authors
- 2021 : Classifier-Free Diffusion Guidance
  Jonathan Ho · Tim Salimans
- 2021 Poster: Why Do Better Loss Functions Lead to Less Transferable Features?
  Simon Kornblith · Ting Chen · Honglak Lee · Mohammad Norouzi
- 2021 Poster: Structured Denoising Diffusion Models in Discrete State-Spaces
  Jacob Austin · Daniel D. Johnson · Jonathan Ho · Daniel Tarlow · Rianne van den Berg
- 2021 Poster: Variational Diffusion Models
  Diederik Kingma · Tim Salimans · Ben Poole · Jonathan Ho
- 2020 Poster: Exemplar VAE: Linking Generative Models, Nearest Neighbor Retrieval, and Data Augmentation
  Sajad Norouzi · David Fleet · Mohammad Norouzi
- 2020 Poster: A Spectral Energy Distance for Parallel Speech Synthesis
  Alexey Gritsenko · Tim Salimans · Rianne van den Berg · Jasper Snoek · Nal Kalchbrenner
- 2020 : Policy Panel
  Roya Pakzad · Dia Kayyali · Marzyeh Ghassemi · Shakir Mohamed · Mohammad Norouzi · Ted Pedersen · Anver Emon · Abubakar Abid · Darren Byler · Samhaa R. El-Beltagy · Nayel Shafei · Mona Diab
- 2020 Affinity Workshop: Muslims in ML
  Marzyeh Ghassemi · Mohammad Norouzi · Shakir Mohamed · Aya Salama · Tasmie Sarker
- 2017 Poster: Bridging the Gap Between Value and Policy Based Reinforcement Learning
  Ofir Nachum · Mohammad Norouzi · Kelvin Xu · Dale Schuurmans
- 2017 Poster: Filtering Variational Objectives
  Chris Maddison · John Lawson · George Tucker · Nicolas Heess · Mohammad Norouzi · Andriy Mnih · Arnaud Doucet · Yee Teh
- 2016 Poster: Generative Adversarial Imitation Learning
  Jonathan Ho · Stefano Ermon
- 2015 Poster: Efficient Non-greedy Optimization of Decision Trees
  Mohammad Norouzi · Maxwell Collins · Matthew A Johnson · David Fleet · Pushmeet Kohli
- 2013 Poster: Efficient Optimization for Sparse Gaussian Process Regression
  Yanshuai Cao · Marcus Brubaker · David Fleet · Aaron Hertzmann
- 2012 Poster: Hamming Distance Metric Learning
  Mohammad Norouzi · Russ Salakhutdinov · David Fleet
- 2008 Session: Oral session 7: Complex Dynamical Systems: Modeling and Estimation
  David Fleet