Poster
Smoothed Energy Guidance: Guiding Diffusion Models by Attenuating Energy Curvature of Attention
Susung Hong
East Exhibit Hall A-C #1610
Diffusion models have shown remarkable success in visual content generation, producing high-quality samples across various domains. However, in the absence of text conditions, their performance in image generation has been limited because classifier-free guidance (CFG) cannot be applied. Recent attempts to extend guidance to unconditional models have relied on heuristic approaches, resulting in unintended effects. In this work, we propose Smoothed Energy Guidance (SEG), a novel training- and condition-free approach that leverages the energy-based perspective of the self-attention mechanism to enhance image generation. By defining the energy of self-attention, we derive a method that reduces the curvature of the energy landscape of attention, and we use the resulting output as the unconditional prediction. In contrast to previous work, we can smoothly control the curvature of the energy landscape by adjusting the Gaussian kernel parameter while keeping the guidance scale parameter fixed. Additionally, we present an efficient query blurring method that is equivalent to blurring the entire attention map yet incurs no significant computational overhead.
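The query blurring equivalence mentioned above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a 1D token sequence (rather than a 2D image grid), a dense Gaussian blur matrix, and that the blur acts on the pre-softmax attention logits along the query axis. The helper names (`gaussian_blur_matrix`, `attention_blurred_logits`, `attention_blurred_queries`, `guide`) are hypothetical, and `guide` assumes a CFG-style extrapolation in which the smoothed output plays the role of the unconditional prediction.

```python
import numpy as np

def gaussian_blur_matrix(n, sigma):
    """Row-normalized 1D Gaussian blur expressed as a dense (n, n) matrix."""
    idx = np.arange(n)
    dist = idx[:, None] - idx[None, :]
    B = np.exp(-0.5 * (dist / sigma) ** 2)
    return B / B.sum(axis=1, keepdims=True)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_blurred_logits(Q, K, V, B):
    """Blur the full (n, n) logit map along the query axis, then attend."""
    logits = Q @ K.T / np.sqrt(Q.shape[-1])
    return softmax(B @ logits) @ V

def attention_blurred_queries(Q, K, V, B):
    """Blur only the (n, d) queries; the logit map itself is never blurred."""
    logits = (B @ Q) @ K.T / np.sqrt(Q.shape[-1])
    return softmax(logits) @ V

def guide(pred, pred_smoothed, scale):
    """Assumed CFG-style combination: extrapolate away from the smoothed prediction."""
    return pred + scale * (pred - pred_smoothed)

rng = np.random.default_rng(0)
n, d = 16, 8  # toy sizes: 16 tokens, 8 channels
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
B = gaussian_blur_matrix(n, sigma=2.0)  # larger sigma means stronger smoothing

out_logits = attention_blurred_logits(Q, K, V, B)
out_queries = attention_blurred_queries(Q, K, V, B)
max_diff = float(np.abs(out_logits - out_queries).max())
print(max_diff)
```

The two paths agree up to floating-point error because the blur is linear: (BQ)Kᵀ = B(QKᵀ). Blurring the queries touches an n × d array instead of the n × n logit map, which is where the savings would come from when d is much smaller than n.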