Poster
Gradient-based Hyperparameter Optimization Over Long Horizons
Paul Micaelli · Amos Storkey
Gradient-based hyperparameter optimization has earned widespread popularity in the context of few-shot meta-learning, but remains broadly impractical for tasks with long horizons (many gradient steps) because of memory scaling and gradient degradation issues. A common workaround is to learn hyperparameters online, but this introduces greediness, which comes with a significant performance drop. We propose forward-mode differentiation with sharing (FDS), a simple and efficient algorithm that tackles memory scaling issues with forward-mode differentiation, and gradient degradation issues by sharing hyperparameters that are contiguous in time. We provide theoretical guarantees about the noise reduction properties of our algorithm, and demonstrate its efficiency empirically by differentiating through $\sim 10^4$ gradient steps of unrolled optimization. We consider large hyperparameter search ranges on CIFAR-10, where we significantly outperform greedy gradient-based alternatives while achieving $20\times$ speedups compared to state-of-the-art black-box methods.
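The two ideas named in the abstract, forward-mode differentiation and sharing hyperparameters across contiguous steps, can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes plain gradient descent on a quadratic training loss whose Hessian `A` is known explicitly, treats only the learning rate as a hyperparameter, and uses hypothetical names (`fds_hypergradient`, `alphas`, `Z`) throughout. For a neural network, the product `A @ Z` would presumably be replaced by Hessian-vector products.

```python
import numpy as np

def fds_hypergradient(alphas, w0, A, b, A_val, b_val, steps):
    """Toy forward-mode hypergradient with shared learning rates.

    Assumptions (illustrative only, not taken from the paper):
      training loss    0.5 * w^T A w - b^T w
      validation loss  0.5 * w^T A_val w - b_val^T w
      update           w <- w - alphas[k] * grad, with one learning rate
                       shared by each contiguous block of steps.
    """
    n_groups = len(alphas)
    w = w0.copy()
    # Z[:, j] holds d w_t / d alphas[j]; its size does not grow with `steps`.
    Z = np.zeros((w0.size, n_groups))
    for t in range(steps):
        k = t * n_groups // steps      # index of the block that step t belongs to
        grad = A @ w - b               # training gradient at the current weights
        # Differentiate the update rule with respect to every shared learning rate.
        Z = Z - alphas[k] * (A @ Z)    # A is the (constant) training Hessian
        Z[:, k] -= grad                # direct dependence of this step on alphas[k]
        w = w - alphas[k] * grad
    val_loss = 0.5 * w @ A_val @ w - b_val @ w
    val_grad = A_val @ w - b_val
    return val_loss, Z.T @ val_grad    # chain rule: dL_val/d alphas

# Hypothetical toy problem: 100 steps, 4 shared learning rates.
A = np.diag([1.0, 2.0, 3.0])
b = np.array([1.0, -1.0, 0.5])
A_val = np.diag([1.5, 1.0, 2.0])
b_val = np.array([0.5, 0.0, 1.0])
w0 = np.zeros(3)
alphas = np.full(4, 0.1)

val_loss, hypergrad = fds_hypergradient(alphas, w0, A, b, A_val, b_val, steps=100)
print("validation loss:", val_loss)
print("hypergradient:", hypergrad)
```

In this sketch the only state carried between steps is `w` and `Z`, so memory is independent of the horizon length, and each learning rate accumulates gradient contributions from every step in its block rather than from a single step.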
Author Information
Paul Micaelli (The University of Edinburgh)
PhD Student in Machine Learning
Amos Storkey (University of Edinburgh)
More from the Same Authors
- 2021: Hamiltonian prior to Disentangle Content and Motion in Image Sequences
  Asif Khan · Amos Storkey
- 2022: Parity in predictive performance is neither necessary nor sufficient for fairness
  Justin Engelmann · Miguel Bernabeu · Amos Storkey
- 2022: Deep Class-Conditional Gaussians for Continual Learning
  Thomas Lee · Amos Storkey
- 2022 Poster: Hamiltonian Latent Operators for content and motion disentanglement in image sequences
  Asif Khan · Amos Storkey
- 2020 Poster: Self-Supervised Relational Reasoning for Representation Learning
  Massimiliano Patacchiola · Amos Storkey
- 2020 Spotlight: Self-Supervised Relational Reasoning for Representation Learning
  Massimiliano Patacchiola · Amos Storkey
- 2020 Poster: Bayesian Meta-Learning for the Few-Shot Setting via Deep Kernels
  Massimiliano Patacchiola · Jack Turner · Elliot Crowley · Michael O'Boyle · Amos Storkey
- 2020 Spotlight: Bayesian Meta-Learning for the Few-Shot Setting via Deep Kernels
  Massimiliano Patacchiola · Jack Turner · Elliot Crowley · Michael O'Boyle · Amos Storkey
- 2019 Poster: Zero-shot Knowledge Transfer via Adversarial Belief Matching
  Paul Micaelli · Amos Storkey
- 2019 Spotlight: Zero-shot Knowledge Transfer via Adversarial Belief Matching
  Paul Micaelli · Amos Storkey
- 2019 Poster: Learning to Learn By Self-Critique
  Antreas Antoniou · Amos Storkey
- 2018 Poster: Moonshine: Distilling with Cheap Convolutions
  Elliot Crowley · Gavia Gray · Amos Storkey
- 2015 Poster: Covariance-Controlled Adaptive Langevin Thermostat for Large-Scale Bayesian Sampling
  Xiaocheng Shang · Zhanxing Zhu · Benedict Leimkuhler · Amos Storkey
- 2014 Workshop: NIPS Workshop on Transactional Machine Learning and E-Commerce
  David Parkes · David H Wolpert · Jennifer Wortman Vaughan · Jacob D Abernethy · Amos Storkey · Mark Reid · Ping Jin · Nihar Bhadresh Shah · Mehryar Mohri · Luis E Ortiz · Robin Hanson · Aaron Roth · Satyen Kale · Sebastien Lahaie
- 2012 Poster: Continuous Relaxations for Discrete Hamiltonian Monte Carlo
  Zoubin Ghahramani · Yichuan Zhang · Charles Sutton · Amos Storkey
- 2012 Spotlight: Continuous Relaxations for Discrete Hamiltonian Monte Carlo
  Zoubin Ghahramani · Yichuan Zhang · Charles Sutton · Amos Storkey
- 2012 Poster: The Coloured Noise Expansion and Parameter Estimation of Diffusion Processes
  Simon Lyons · Amos Storkey · Simo Sarkka
- 2011 Poster: Neuronal Adaptation for Sampling-Based Probabilistic Inference in Perceptual Bistability
  David Reichert · Peggy Series · Amos Storkey
- 2011 Spotlight: Neuronal Adaptation for Sampling-Based Probabilistic Inference in Perceptual Bistability
  David Reichert · Peggy Series · Amos Storkey
- 2010 Poster: Hallucinations in Charles Bonnet Syndrome Induced by Homeostasis: a Deep Boltzmann Machine Model
  David Reichert · Peggy Series · Amos Storkey
- 2010 Poster: Sparse Instrumental Variables (SPIV) for Genome-Wide Studies
  Felix V Agakov · Paul McKeigue · Jon Krohn · Amos Storkey
- 2007 Poster: Continuous Time Particle Filtering for fMRI
  Lawrence Murray · Amos Storkey
- 2007 Poster: Modelling motion primitives and their timing in biologically executed movements
  Ben H Williams · Marc Toussaint · Amos Storkey
- 2006 Poster: Learning Structural Equation Models for fMRI
  Amos Storkey · Enrico Simonotto · Heather Whalley · Stephen Lawrie · Lawrence Murray · David McGonigle
- 2006 Poster: Mixture Regression for Covariate Shift
  Amos Storkey · Masashi Sugiyama