Recent text-to-image diffusion models trained on large-scale data achieve remarkable performance on text-conditioned image synthesis (e.g., GLIDE, DALL∙E 2, Imagen, Stable Diffusion). This paper presents an embarrassingly simple method to use these text-to-image diffusion models as zero-shot image-to-image editors. Our method, CycleDiffusion, is based on a recent finding that, when the "random seed" is fixed, sampling from two diffusion model distributions produces images with minimal differences; the core of our idea is to infer the "random seed" that is likely to produce a source image conditioned on a source text. We formalize the "random seed" as a sequence of isometric Gaussian noises, which we reformulate as diffusion models' latent code. Using the "random seed" inferred from the source text-image pair, we generate a target image conditioned on a target text. Experiments show that CycleDiffusion can minimally edit images in a zero-shot manner.
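The procedure described above can be sketched concretely. Below is a minimal, self-contained sketch of the idea, not the authors' released code: `eps_model` is a stub standing in for a real text-conditioned noise predictor, the schedule is a toy DDPM schedule, and all names (`infer_seed`, `decode`, `p_params`) are illustrative assumptions. It infers the per-step Gaussian noises that would reproduce the source image under the source text, then replays that "seed" under the target text.

```python
import torch

T = 50                                    # number of diffusion steps (toy value)
betas = torch.linspace(1e-4, 0.02, T)     # standard linear DDPM schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

def eps_model(x_t, t, text):
    """Stub for a text-conditioned noise predictor eps_theta(x_t, t, text);
    a real model (e.g., Stable Diffusion's UNet) would go here."""
    return torch.zeros_like(x_t)

def p_params(x_t, t, text):
    """Mean/std of the model's reverse step p(x_{t-1} | x_t, text), DDPM form."""
    eps = eps_model(x_t, t, text)
    ab_prev = alpha_bars[t - 1] if t > 0 else torch.tensor(1.0)
    mean = (x_t - betas[t] / torch.sqrt(1 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])
    std = torch.sqrt(betas[t] * (1 - ab_prev) / (1 - alpha_bars[t]))
    return mean, std

def infer_seed(x0, src_text):
    """Encode (x0, src_text) into a latent 'seed' (x_T, eps_T, ..., eps_1)."""
    x_T = torch.sqrt(alpha_bars[-1]) * x0 + torch.sqrt(1 - alpha_bars[-1]) * torch.randn_like(x0)
    x_t, noises = x_T, []
    for t in reversed(range(T)):
        if t > 0:
            # sample x_{t-1} from the forward posterior q(x_{t-1} | x_t, x0)
            ab_prev = alpha_bars[t - 1]
            q_mean = (torch.sqrt(ab_prev) * betas[t] * x0
                      + torch.sqrt(alphas[t]) * (1 - ab_prev) * x_t) / (1 - alpha_bars[t])
            q_std = torch.sqrt(betas[t] * (1 - ab_prev) / (1 - alpha_bars[t]))
            x_prev = q_mean + q_std * torch.randn_like(x0)
        else:
            x_prev = x0  # the final step is deterministic
        mean, std = p_params(x_t, t, src_text)
        # back out the Gaussian noise the sampler must have drawn to land on x_prev
        noises.append((x_prev - mean) / std if t > 0 else torch.zeros_like(x0))
        x_t = x_prev
    return x_T, noises

def decode(x_T, noises, tgt_text):
    """Replay the inferred seed under the target text to get a minimally edited image."""
    x_t = x_T
    for i, t in enumerate(reversed(range(T))):
        mean, std = p_params(x_t, t, tgt_text)
        x_t = mean + std * noises[i]  # std == 0 at t == 0, so the last step takes the mean
    return x_t

# Toy usage on a random "image"; with a real model, x0 would be the source image.
x0 = torch.randn(1, 3, 8, 8)
x_T, noises = infer_seed(x0, "a photo of a cat")
edited = decode(x_T, noises, "a photo of a dog")
```

Because both trajectories share the same x_T and per-step noises, the source and target samples differ only through the text conditioning, which is what makes the resulting edit minimal.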
Author Information
Chen Henry Wu (Carnegie Mellon University)
I am working on generative models.
Fernando D De la Torre (Carnegie Mellon University)
More from the Same Authors
- 2022 Poster: Generative Visual Prompt: Unifying Distributional Control of Pre-Trained Generative Models »
  Chen Henry Wu · Saman Motamed · Shaunak Srivastava · Fernando D De la Torre
- 2011 Poster: Matrix Completion for Image Classification »
  Ricardo S Cabral · Fernando D De la Torre · Joao P Costeira · Alexandre Bernardino
- 2009 Poster: Canonical Time Warping for Alignment of Human Behavior »
  Feng Zhou · Fernando D De la Torre
- 2008 Poster: Robust Kernel Principal Component Analysis »
  Minh Hoai Nguyen · Fernando D De la Torre
- 2008 Spotlight: Robust Kernel Principal Component Analysis »
  Minh Hoai Nguyen · Fernando D De la Torre