We present Imagen, a text-to-image diffusion model with an unprecedented degree of photorealism and a deep level of language understanding. Imagen builds on the power of large transformer language models in understanding text and hinges on the strength of diffusion models in high-fidelity image generation. Our key discovery is that generic large language models (e.g., T5), pretrained on text-only corpora, are surprisingly effective at encoding text for image synthesis: increasing the size of the language model in Imagen boosts both sample fidelity and image-text alignment much more than increasing the size of the image diffusion model. Imagen achieves a new state-of-the-art FID score of 7.27 on the COCO dataset, without ever training on COCO, and human raters find Imagen samples to be on par with the COCO data itself in image-text alignment. To assess text-to-image models in greater depth, we introduce DrawBench, a comprehensive and challenging benchmark for text-to-image models. With DrawBench, we compare Imagen with recent methods including VQ-GAN+CLIP, Latent Diffusion Models, and DALL-E 2, and find that human raters prefer Imagen over other models in side-by-side comparisons, both in terms of sample quality and image-text alignment.
Author Information
Chitwan Saharia (Google)
William Chan (Carnegie Mellon University)
Saurabh Saxena (Google)
Lala Li (Google)
Jay Whang (University of Texas, Austin)
Emily Denton (Google)
Kamyar Ghasemipour (Robotics @ Google, University of Toronto, Vector Institute)
Raphael Gontijo Lopes (Google Brain)
Burcu Karagol Ayan (Google)
Burcu Karagol Ayan is a software engineer at Google working on language understanding and responsible AI for multimodal generative models. She holds a PhD from the University of Maryland.
Tim Salimans (Google Brain Amsterdam)
Jonathan Ho (Google)
David Fleet (Google Research, Brain Team and University of Toronto)
Mohammad Norouzi (Google Brain)
More from the Same Authors
-
2021 : Reduced, Reused and Recycled: The Life of a Dataset in Machine Learning Research »
Bernard Koch · Emily Denton · Alex Hanna · Jacob G Foster -
2021 : Artsheets for Art Datasets »
Ramya Srinivasan · Emily Denton · Jordan Famularo · Negar Rostamzadeh · Fernando Diaz · Beth Coleman -
2021 : AI and the Everything in the Whole Wide World Benchmark »
Deborah Raji · Emily Denton · Emily M. Bender · Alex Hanna · Amandalynne Paullada -
2021 : Palette: Image-to-Image Diffusion Models »
Chitwan Saharia · William Chan · Huiwen Chang · Chris Lee · Jonathan Ho · Tim Salimans · David Fleet · Mohammad Norouzi -
2021 : Classifier-Free Diffusion Guidance »
Jonathan Ho · Tim Salimans -
2021 : Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters »
Kamyar Ghasemipour · Shixiang (Shane) Gu · Ofir Nachum -
2022 Poster: Residual Multiplicative Filter Networks for Multiscale Reconstruction »
Shayan Shekarforoush · David Lindell · David Fleet · Marcus Brubaker -
2022 : On Distillation of Guided Diffusion Models »
Chenlin Meng · Ruiqi Gao · Diederik Kingma · Stefano Ermon · Jonathan Ho · Tim Salimans -
2022 : Imagenary Patterns with Diffusion Models »
Mohammad Norouzi -
2022 Spotlight: Residual Multiplicative Filter Networks for Multiscale Reconstruction »
Shayan Shekarforoush · David Lindell · David Fleet · Marcus Brubaker -
2022 Spotlight: Lightning Talks 5B-1 »
Devansh Arpit · Xiaojun Xu · Zifan Shi · Ivan Skorokhodov · Shayan Shekarforoush · Zhan Tong · Yiqun Wang · Shichong Peng · Linyi Li · Ivan Skorokhodov · Huan Wang · Yibing Song · David Lindell · Yinghao Xu · Seyed Alireza Moazenipourasil · Sergey Tulyakov · Peter Wonka · Yiqun Wang · Ke Li · David Fleet · Yujun Shen · Yingbo Zhou · Bo Li · Jue Wang · Peter Wonka · Marcus Brubaker · Caiming Xiong · Limin Wang · Deli Zhao · Qifeng Chen · Dit-Yan Yeung -
2022 : Invited Speaker »
David Fleet -
2022 : Invited Talk: Mohammad Norouzi »
Mohammad Norouzi -
2022 : Interactive Industrial Panel »
Jiahao Sun · Ahmed Ibrahim · Marjan Ghazvininejad · Yu Cheng · Boxing Chen · Mohammad Norouzi · Rahul Gupta -
2022 Poster: Video Diffusion Models »
Jonathan Ho · Tim Salimans · Alexey Gritsenko · William Chan · Mohammad Norouzi · David Fleet -
2022 Poster: When does dough become a bagel? Analyzing the remaining mistakes on ImageNet »
Vijay Vasudevan · Benjamin Caine · Raphael Gontijo Lopes · Sara Fridovich-Keil · Rebecca Roelofs -
2022 Poster: Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters »
Kamyar Ghasemipour · Shixiang (Shane) Gu · Ofir Nachum -
2022 Poster: A Unified Sequence Interface for Vision Tasks »
Ting Chen · Saurabh Saxena · Lala Li · Tsung-Yi Lin · David Fleet · Geoffrey Hinton -
2022 Poster: Spectral Bias in Practice: The Role of Function Frequency in Generalization »
Sara Fridovich-Keil · Raphael Gontijo Lopes · Rebecca Roelofs -
2021 : Live panel: ImageNets of "x": ImageNet's Infrastructural Impact »
Emily Denton · Alex Hanna -
2021 : ImageNets of "x": ImageNet's Infrastructural Impact »
Emily Denton · Alex Hanna -
2021 : NLP with Synthetic Text »
Mohammad Norouzi -
2021 : Career and Life: Panel Discussion - Bo Li, Adriana Romero-Soriano, Devi Parikh, and Emily Denton »
Emily Denton · Devi Parikh · Bo Li · Adriana Romero -
2021 Poster: Why Do Better Loss Functions Lead to Less Transferable Features? »
Simon Kornblith · Ting Chen · Honglak Lee · Mohammad Norouzi -
2021 Poster: Intriguing Properties of Contrastive Losses »
Ting Chen · Calvin Luo · Lala Li -
2021 Poster: Variational Diffusion Models »
Diederik Kingma · Tim Salimans · Ben Poole · Jonathan Ho -
2020 Workshop: Resistance AI Workshop »
Suzanne Kite · Mattie Tesfaldet · J Khadijah Abdurahman · William Agnew · Elliot Creager · Agata Foryciarz · Raphael Gontijo Lopes · Pratyusha Kalluri · Marie-Therese Png · Manuel Sabin · Maria Skoularidou · Ramon Vilarino · Rose Wang · Sayash Kapoor · Micah Carroll -
2020 Poster: Memory Based Trajectory-conditioned Policies for Learning from Sparse Rewards »
Yijie Guo · Jongwook Choi · Marcin Moczulski · Shengyu Feng · Samy Bengio · Mohammad Norouzi · Honglak Lee -
2020 Poster: Denoising Diffusion Probabilistic Models »
Jonathan Ho · Ajay Jain · Pieter Abbeel -
2020 Poster: Exemplar VAE: Linking Generative Models, Nearest Neighbor Retrieval, and Data Augmentation »
Sajad Norouzi · David Fleet · Mohammad Norouzi -
2020 Poster: A Spectral Energy Distance for Parallel Speech Synthesis »
Alexey Gritsenko · Tim Salimans · Rianne van den Berg · Jasper Snoek · Nal Kalchbrenner -
2020 Poster: RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning »
Caglar Gulcehre · Ziyu Wang · Alexander Novikov · Thomas Paine · Sergio Gómez · Konrad Zolna · Rishabh Agarwal · Josh Merel · Daniel Mankowitz · Cosmin Paduraru · Gabriel Dulac-Arnold · Jerry Li · Mohammad Norouzi · Matthew Hoffman · Nicolas Heess · Nando de Freitas -
2020 Poster: Big Self-Supervised Models are Strong Semi-Supervised Learners »
Ting Chen · Simon Kornblith · Kevin Swersky · Mohammad Norouzi · Geoffrey E Hinton -
2020 : Policy Panel »
Roya Pakzad · Dia Kayyali · Marzyeh Ghassemi · Shakir Mohamed · Mohammad Norouzi · Ted Pedersen · Anver Emon · Abubakar Abid · Darren Byler · Samhaa R. El-Beltagy · Nayel Shafei · Mona Diab -
2020 Affinity Workshop: Muslims in ML »
Marzyeh Ghassemi · Mohammad Norouzi · Shakir Mohamed · Aya Salama · Tasmie Sarker -
2020 Affinity Workshop: Queer in AI Workshop @ NeurIPS 2020 »
Raphael Gontijo Lopes · Luke Stark · Melvin Selim Atay · ST John -
2019 : Poster Session »
Rishav Chourasia · Yichong Xu · Corinna Cortes · Chien-Yi Chang · Yoshihiro Nagano · So Yeon Min · Benedikt Boecking · Phi Vu Tran · Kamyar Ghasemipour · Qianggang Ding · Shouvik Mani · Vikram Voleti · Rasool Fakoor · Miao Xu · Kenneth Marino · Lisa Lee · Volker Tresp · Jean-Francois Kagy · Marvin Zhang · Barnabas Poczos · Dinesh Khandelwal · Adrien Bardes · Evan Shelhamer · Jiacheng Zhu · Ziming Li · Xiaoyan Li · Dmitrii Krasheninnikov · Ruohan Wang · Mayoore Jaiswal · Emad Barsoum · Suvansh Sanjeev · Theeraphol Wattanavekin · Qizhe Xie · Sifan Wu · Yuki Yoshida · David Kanaa · Sina Khoshfetrat Pakazad · Mehdi Maasoumy -
2019 Poster: Which Algorithmic Choices Matter at Which Batch Sizes? Insights From a Noisy Quadratic Model »
Guodong Zhang · Lala Li · Zachary Nado · James Martens · Sushant Sachdeva · George Dahl · Chris Shallue · Roger Grosse -
2019 Poster: A Fourier Perspective on Model Robustness in Computer Vision »
Dong Yin · Raphael Gontijo Lopes · Jonathon Shlens · Ekin Dogus Cubuk · Justin Gilmer -
2019 Poster: SMILe: Scalable Meta Inverse Reinforcement Learning through Context-Conditional Policies »
Kamyar Ghasemipour · Shixiang (Shane) Gu · Richard Zemel -
2019 Poster: Don't Blame the ELBO! A Linear VAE Perspective on Posterior Collapse »
James Lucas · George Tucker · Roger Grosse · Mohammad Norouzi -
2019 Poster: Compression with Flows via Local Bits-Back Coding »
Jonathan Ho · Evan Lohn · Pieter Abbeel -
2019 Spotlight: Compression with Flows via Local Bits-Back Coding »
Jonathan Ho · Evan Lohn · Pieter Abbeel -
2018 Poster: Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning »
Supasorn Suwajanakorn · Noah Snavely · Jonathan Tompson · Mohammad Norouzi -
2018 Oral: Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning »
Supasorn Suwajanakorn · Noah Snavely · Jonathan Tompson · Mohammad Norouzi -
2018 Poster: Memory Augmented Policy Optimization for Program Synthesis and Semantic Parsing »
Chen Liang · Mohammad Norouzi · Jonathan Berant · Quoc V Le · Ni Lao -
2018 Spotlight: Memory Augmented Policy Optimization for Program Synthesis and Semantic Parsing »
Chen Liang · Mohammad Norouzi · Jonathan Berant · Quoc V Le · Ni Lao -
2017 Poster: Bridging the Gap Between Value and Policy Based Reinforcement Learning »
Ofir Nachum · Mohammad Norouzi · Kelvin Xu · Dale Schuurmans -
2017 Poster: Filtering Variational Objectives »
Chris Maddison · John Lawson · George Tucker · Nicolas Heess · Mohammad Norouzi · Andriy Mnih · Arnaud Doucet · Yee Teh -
2016 Poster: Reward Augmented Maximum Likelihood for Neural Structured Prediction »
Mohammad Norouzi · Samy Bengio · Zhifeng Chen · Navdeep Jaitly · Mike Schuster · Yonghui Wu · Dale Schuurmans -
2015 Poster: Efficient Non-greedy Optimization of Decision Trees »
Mohammad Norouzi · Maxwell Collins · Matthew A Johnson · David Fleet · Pushmeet Kohli -
2013 Poster: Efficient Optimization for Sparse Gaussian Process Regression »
Yanshuai Cao · Marcus Brubaker · David Fleet · Aaron Hertzmann -
2012 Poster: Hamming Distance Metric Learning »
Mohammad Norouzi · Russ Salakhutdinov · David Fleet -
2008 Session: Oral session 7: Complex Dynamical Systems: Modeling and Estimation »
David Fleet