Skip to yearly menu bar Skip to main content

Oral Session

Oral 3C Diffusion Models

Room R06-R09 (level 2)
Wed 13 Dec 8 a.m. PST — 8:45 a.m. PST

Wed 13 Dec. 8:00 - 8:15 PST

Understanding Diffusion Objectives as the ELBO with Simple Data Augmentation

Diederik Kingma · Ruiqi Gao

To achieve the highest perceptual quality, state-of-the-art diffusion models are optimized with objectives that typically look very different from the maximum likelihood and the Evidence Lower Bound (ELBO) objectives. In this work, we reveal that diffusion model objectives are actually closely related to the ELBO.Specifically, we show that all commonly used diffusion model objectives equate to a weighted integral of ELBOs over different noise levels, where the weighting depends on the specific objective used. Under the condition of monotonic weighting, the connection is even closer: the diffusion objective then equals the ELBO, combined with simple data augmentation, namely Gaussian noise perturbation. We show that this condition holds for a number of state-of-the-art diffusion models. In experiments, we explore new monotonic weightings and demonstrate their effectiveness, achieving state-of-the-art FID scores on the high-resolution ImageNet benchmark.

Wed 13 Dec. 8:15 - 8:30 PST

Entropic Neural Optimal Transport via Diffusion Processes

Nikita Gushchin · Alexander Kolesov · Alexander Korotin · Dmitry Vetrov · Evgeny Burnaev

We propose a novel neural algorithm for the fundamental problem of computing the entropic optimal transport (EOT) plan between probability distributions which are accessible by samples. Our algorithm is based on the saddle point reformulation of the dynamic version of EOT which is known as the Schrödinger Bridge problem. In contrast to the prior methods for large-scale EOT, our algorithm is end-to-end and consists of a single learning step, has fast inference procedure, and allows handling small values of the entropy regularization coefficient which is of particular importance in some applied problems. Empirically, we show the performance of the method on several large-scale EOT tasks. The code for the ENOT solver can be found at

Wed 13 Dec. 8:30 - 8:45 PST

DiffuseBot: Breeding Soft Robots With Physics-Augmented Generative Diffusion Models

Tsun-Hsuan Johnson Wang · Juntian Zheng · Pingchuan Ma · Yilun Du · Byungchul Kim · Andrew Spielberg · Josh Tenenbaum · Chuang Gan · Daniela Rus

Nature evolves creatures with a high complexity of morphological and behavioral intelligence, meanwhile computational methods lag in approaching that diversity and efficacy. Co-optimization of artificial creatures' morphology and control in silico shows promise for applications in physical soft robotics and virtual character creation; such approaches, however, require developing new learning algorithms that can reason about function atop pure structure. In this paper, we present DiffuseBot, a physics-augmented diffusion model that generates soft robot morphologies capable of excelling in a wide spectrum of tasks. \name bridges the gap between virtually generated content and physical utility by (i) augmenting the diffusion process with a physical dynamical simulation which provides a certificate of performance, and (ii) introducing a co-design procedure that jointly optimizes physical design and control by leveraging information about physical sensitivities from differentiable simulation. We showcase a range of simulated and fabricated robots along with their capabilities. Check our website: