In optimization, the duality gap between the primal and dual problems measures the suboptimality of any primal-dual point. In classical mechanics, the equations of motion of a system can be derived from the Hamiltonian, a function describing the system's total energy. In this paper we consider a convex optimization problem consisting of the sum of two convex functions, sometimes referred to as a composite objective, and we identify the duality gap as the "energy" of the system. In the Hamiltonian formalism the energy is conserved, so we add a contractive term to the standard equations of motion so that this energy decreases at a linear rate (i.e., geometrically) over time. This yields a continuous-time ordinary differential equation (ODE) in the primal and dual variables that converges to zero duality gap, i.e., to optimality. The ODE has several useful properties: it induces a natural operator splitting; at convergence it yields both the primal and dual solutions; and it is invariant to affine transformations despite using only first-order information. We provide several discretizations of this ODE, some of which are new algorithms and others of which correspond to known techniques, such as the alternating direction method of multipliers (ADMM). We conclude with numerical examples that show the promise of our approach. In particular, we give an example where our technique solves a convex quadratic minimization problem orders of magnitude faster than several commonly used gradient methods, including conjugate gradient, when the conditioning of the problem is poor. Our framework provides new insight into previously known algorithms and a technique for generating new primal-dual algorithms.
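The central mechanism in the abstract, adding a contraction term so that an otherwise conserved "energy" decays geometrically, can be summarized with a short calculation. The sketch below is generic rather than the paper's specific ODE; the symbols $z$, $E$, $F$, and $\lambda$ are illustrative placeholders for the primal-dual state, the duality gap, the modified vector field, and the contraction rate.

Write $z(t)$ for the primal-dual state and $E(z) \ge 0$ for the duality gap. A purely Hamiltonian flow conserves its energy, $\tfrac{d}{dt} E(z(t)) = 0$. If the dynamics $\dot z = F(z)$ are instead chosen so that

$$
\frac{d}{dt} E(z(t)) \;=\; \nabla E(z(t))^{\top} \dot z(t) \;=\; -\lambda\, E(z(t)), \qquad \lambda > 0,
$$

then $E(z(t)) = e^{-\lambda t} E(z(0))$, so the duality gap decreases geometrically and the trajectory approaches a point with zero duality gap, i.e., a primal-dual optimum.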
Author Information
Brendan O'Donoghue (DeepMind)
Chris Maddison (Institute for Advanced Study, Princeton)
More from the Same Authors
- 2021 Spotlight: Lossy Compression for Lossless Prediction
  Yann Dubois · Benjamin Bloem-Reddy · Karen Ullrich · Chris Maddison
- 2021 Spotlight: Variational Bayesian Optimistic Sampling
  Brendan O'Donoghue · Tor Lattimore
- 2021 Spotlight: Reward is enough for convex MDPs
  Tom Zahavy · Brendan O'Donoghue · Guillaume Desjardins · Satinder Singh
- 2021 Spotlight: Learning Generalized Gumbel-max Causal Mechanisms
  Guy Lorberbom · Daniel D. Johnson · Chris Maddison · Daniel Tarlow · Tamir Hazan
- 2021: Optimal Representations for Covariate Shifts
  Yann Dubois · Yangjun Ruan · Chris Maddison
- 2023 Poster: Probabilistic Invariant Learning with Randomized Linear Classifiers
  Leonardo Cotta · Gal Yehuda · Assaf Schuster · Chris Maddison
- 2023 Poster: Shaped Attention Mechanism in the Infinite Depth-and-Width Limit at Initialization
  Lorenzo Noci · Chuning Li · Mufan Li · Bobby He · Thomas Hofmann · Chris Maddison · Dan Roy
- 2023 Poster: MeGraph: Capturing Long-Range Interactions by Alternating Local and Hierarchical Aggregation on Multi-Scaled Graph Hierarchy
  Honghua Dong · Jiawei Xu · Yu Yang · Rui Zhao · Shiwen Wu · Chun Yuan · Xiu Li · Chris Maddison · Lei Han
- 2021: Invited Talk 6
  Chris Maddison
- 2021 Poster: Reward is enough for convex MDPs
  Tom Zahavy · Brendan O'Donoghue · Guillaume Desjardins · Satinder Singh
- 2021 Poster: Practical Large-Scale Linear Programming using Primal-Dual Hybrid Gradient
  David Applegate · Mateo Diaz · Oliver Hinder · Haihao Lu · Miles Lubin · Brendan O'Donoghue · Warren Schudy
- 2021 Poster: Variational Bayesian Optimistic Sampling
  Brendan O'Donoghue · Tor Lattimore
- 2021 Poster: Variational Bayesian Reinforcement Learning with Regret Bounds
  Brendan O'Donoghue
- 2021 Poster: Lossy Compression for Lossless Prediction
  Yann Dubois · Benjamin Bloem-Reddy · Karen Ullrich · Chris Maddison
- 2021 Poster: Learning Generalized Gumbel-max Causal Mechanisms
  Guy Lorberbom · Daniel D. Johnson · Chris Maddison · Daniel Tarlow · Tamir Hazan
- 2020 Poster: Gradient Estimation with Stochastic Softmax Tricks
  Max Paulus · Dami Choi · Danny Tarlow · Andreas Krause · Chris Maddison
- 2020 Oral: Gradient Estimation with Stochastic Softmax Tricks
  Max Paulus · Dami Choi · Danny Tarlow · Andreas Krause · Chris Maddison
- 2020 Poster: Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces
  Guy Lorberbom · Chris Maddison · Nicolas Heess · Tamir Hazan · Danny Tarlow
- 2019 Poster: Continuous Hierarchical Representations with Poincaré Variational Auto-Encoders
  Emile Mathieu · Charline Le Lan · Chris Maddison · Ryota Tomioka · Yee Whye Teh
- 2017 Poster: REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models
  George Tucker · Andriy Mnih · Chris J Maddison · John Lawson · Jascha Sohl-Dickstein
- 2017 Oral: REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models
  George Tucker · Andriy Mnih · Chris J Maddison · John Lawson · Jascha Sohl-Dickstein
- 2017 Poster: Filtering Variational Objectives
  Chris Maddison · John Lawson · George Tucker · Nicolas Heess · Mohammad Norouzi · Andriy Mnih · Arnaud Doucet · Yee Teh
- 2014 Poster: A* Sampling
  Chris Maddison · Danny Tarlow · Tom Minka
- 2014 Oral: A* Sampling
  Chris Maddison · Danny Tarlow · Tom Minka
- 2013 Poster: Annealing between distributions by averaging moments
  Roger Grosse · Chris Maddison · Russ Salakhutdinov
- 2013 Oral: Annealing between distributions by averaging moments
  Roger Grosse · Chris Maddison · Russ Salakhutdinov