Uncertainty estimation in large deep-learning models is a computationally challenging task, where it is difficult to form even a Gaussian approximation to the posterior distribution. In such situations, existing methods usually resort to a diagonal approximation of the covariance matrix despite the fact that these matrices are known to give poor uncertainty estimates. To address this issue, we propose a new stochastic, low-rank, approximate natural-gradient (SLANG) method for variational inference in large deep models. Our method estimates a “diagonal plus low-rank” structure based solely on back-propagated gradients of the network log-likelihood. This requires strictly fewer gradient computations than methods that compute the gradient of the whole variational objective. Empirical evaluations on standard benchmarks confirm that SLANG enables faster and more accurate estimation of uncertainty than mean-field methods, and performs comparably to state-of-the-art methods.
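To illustrate why a "diagonal plus low-rank" structure stays tractable in high dimensions, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes, for illustration only, that the structure takes the form diag(d) + U Uᵀ with a D × L factor U and a positive diagonal d, and it applies the Woodbury identity so that the only dense solve is L × L. The function name `solve_diag_plus_low_rank` and all variable names are hypothetical.

```python
import numpy as np


def solve_diag_plus_low_rank(U, d, v):
    """Return (diag(d) + U @ U.T)^{-1} @ v without forming a D x D matrix.

    U: (D, L) low-rank factor, d: (D,) positive diagonal, v: (D,) vector.
    Hypothetical helper: uses the Woodbury identity, so the only dense
    linear solve is of size L x L.
    """
    Dinv_v = v / d                          # diag(d)^{-1} v
    Dinv_U = U / d[:, None]                 # diag(d)^{-1} U
    K = np.eye(U.shape[1]) + U.T @ Dinv_U   # I_L + U^T diag(d)^{-1} U, shape (L, L)
    correction = Dinv_U @ np.linalg.solve(K, U.T @ Dinv_v)
    return Dinv_v - correction


# Small synthetic check against a dense solve (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
D, L = 50, 3
U = rng.normal(size=(D, L))
d = rng.uniform(0.5, 2.0, size=D)
v = rng.normal(size=D)

fast = solve_diag_plus_low_rank(U, d, v)
dense = np.linalg.solve(np.diag(d) + U @ U.T, v)
max_abs_error = float(np.max(np.abs(fast - dense)))
```

The structured solve does work that grows linearly in D for fixed L, whereas the dense solve grows cubically in D. That gap is the practical appeal of this structure over a full covariance matrix.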
Author Information
Aaron Mishkin (University of British Columbia)
Frederik Kunstner (EPFL)
Didrik Nielsen (DTU Compute)
Mark Schmidt (University of British Columbia)
Mohammad Emtiyaz Khan (RIKEN, Tokyo)
More from the Same Authors
-
2021 : Heavy-tailed noise does not explain the gap between SGD and Adam on Transformers »
Jacques Chen · Frederik Kunstner · Mark Schmidt -
2021 : Faster Quasi-Newton Methods for Linear Composition Problems »
Betty Shea · Mark Schmidt -
2021 : A Closer Look at Gradient Estimators with Reinforcement Learning as Inference »
Jonathan Lavington · Michael Teng · Mark Schmidt · Frank Wood -
2021 : An Empirical Study of Non-Uniform Sampling in Off-Policy Reinforcement Learning for Continuous Control »
Nicholas Ioannidis · Jonathan Lavington · Mark Schmidt -
2022 : Target-based Surrogates for Stochastic Optimization »
Jonathan Lavington · Sharan Vaswani · Reza Babanezhad Harikandeh · Mark Schmidt · Nicolas Le Roux -
2022 : Fast Convergence of Greedy 2-Coordinate Updates for Optimizing with an Equality Constraint »
Amrutha Varshini Ramesh · Aaron Mishkin · Mark Schmidt -
2022 : Fast Convergence of Random Reshuffling under Interpolation and the Polyak-Łojasiewicz Condition »
Chen Fan · Christos Thrampoulidis · Mark Schmidt -
2022 : Practical Structured Riemannian Optimization with Momentum by using Generalized Normal Coordinates »
Wu Lin · Valentin Duruisseaux · Melvin Leok · Frank Nielsen · Mohammad Emtiyaz Khan · Mark Schmidt -
2022 : Diffusion Models for Video Prediction and Infilling »
Tobias Höppe · Arash Mehrjou · Stefan Bauer · Didrik Nielsen · Andrea Dittadi -
2022 : Few-Shot Diffusion Models »
Giorgio Giannone · Didrik Nielsen · Ole Winther -
2022 : Can Calibration Improve Sample Prioritization? »
Ganesh Tata · Gautham Krishna Gudur · Gopinath Chennupati · Mohammad Emtiyaz Khan -
2022 : Invited Keynote 2 »
Mohammad Emtiyaz Khan -
2021 : Panel »
Mohammad Emtiyaz Khan · Atoosa Kasirzadeh · Anna Rogers · Javier González · Suresh Venkatasubramanian · Robert Williamson -
2020 : Closing remarks »
Quanquan Gu · Courtney Paquette · Mark Schmidt · Sebastian Stich · Martin Takac -
2020 : Live Q&A with Michael Friedlander (Zoom) »
Mark Schmidt -
2020 : Intro to Invited Speaker 8 »
Mark Schmidt -
2020 : Contributed talks in Session 3 (Zoom) »
Mark Schmidt · Zhan Gao · Wenjie Li · Preetum Nakkiran · Denny Wu · Chengrun Yang -
2020 : Live Q&A with Rachel Ward (Zoom) »
Mark Schmidt -
2020 : Live Q&A with Ashia Wilson (Zoom) »
Mark Schmidt -
2020 : Welcome remarks to Session 3 »
Mark Schmidt -
2020 Workshop: OPT2020: Optimization for Machine Learning »
Courtney Paquette · Mark Schmidt · Sebastian Stich · Quanquan Gu · Martin Takac -
2020 : Welcome event (gather.town) »
Quanquan Gu · Courtney Paquette · Mark Schmidt · Sebastian Stich · Martin Takac -
2020 Poster: Regret Bounds without Lipschitz Continuity: Online Learning with Relative-Lipschitz Losses »
Yihan Zhou · Victor Sanches Portella · Mark Schmidt · Nicholas Harvey -
2020 Poster: Continual Deep Learning by Functional Regularisation of Memorable Past »
Pingbo Pan · Siddharth Swaroop · Alexander Immer · Runa Eschenhagen · Richard Turner · Mohammad Emtiyaz Khan -
2020 Oral: Continual Deep Learning by Functional Regularisation of Memorable Past »
Pingbo Pan · Siddharth Swaroop · Alexander Immer · Runa Eschenhagen · Richard Turner · Mohammad Emtiyaz Khan -
2019 : Poster session »
Sebastian Farquhar · Erik Daxberger · Andreas Look · Matt Benatan · Ruiyi Zhang · Marton Havasi · Fredrik Gustafsson · James A Brofos · Nabeel Seedat · Micha Livne · Ivan Ustyuzhaninov · Adam Cobb · Felix D McGregor · Patrick McClure · Tim R. Davidson · Gaurush Hiranandani · Sanjeev Arora · Masha Itkina · Didrik Nielsen · William Harvey · Matias Valdenegro-Toro · Stefano Peluchetti · Riccardo Moriconi · Tianyu Cui · Vaclav Smidl · Taylan Cemgil · Jack Fitzsimons · He Zhao · Mariana Vargas Vieyra · Apratim Bhattacharyya · Rahul Sharma · Geoffroy Dubourg-Felonneau · Jonathan Warrell · Slava Voloshynovskiy · Mihaela Rosca · Jiaming Song · Andrew Ross · Homa Fashandi · Ruiqi Gao · Hooshmand Shokri Razaghi · Joshua Chang · Zhenzhong Xiao · Vanessa Boehm · Giorgio Giannone · Ranganath Krishnan · Joe Davison · Arsenii Ashukha · Jeremiah Liu · Sicong (Sheldon) Huang · Evgenii Nikishin · Sunho Park · Nilesh Ahuja · Mahesh Subedar · Artyom Gadetsky · Jhosimar Arias Figueroa · Tim G. J. Rudner · Waseem Aslam · Adrián Csiszárik · John Moberg · Ali Hebbal · Kathrin Grosse · Pekka Marttinen · Bang An · Hlynur Jónsson · Samuel Kessler · Abhishek Kumar · Mikhail Figurnov · Omesh Tickoo · Steindor Saemundsson · Ari Heljakka · Dániel Varga · Niklas Heim · Simone Rossi · Max Laves · Waseem Gharbieh · Nicholas Roberts · Luis Armando Pérez Rey · Matthew Willetts · Prithvijit Chakrabarty · Sumedh Ghaisas · Carl Shneider · Wray Buntine · Kamil Adamczewski · Xavier Gitiaux · Suwen Lin · Hao Fu · Gunnar Rätsch · Aidan Gomez · Erik Bodin · Dinh Phung · Lennart Svensson · Juliano Tusi Amaral Laganá Pinto · Milad Alizadeh · Jianzhun Du · Kevin Murphy · Beatrix Benkő · Shashaank Vattikuti · Jonathan Gordon · Christopher Kanan · Sontje Ihler · Darin Graham · Michael Teng · Louis Kirsch · Tomas Pevny · Taras Holotyak -
2019 Poster: Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates »
Sharan Vaswani · Aaron Mishkin · Issam Laradji · Mark Schmidt · Gauthier Gidel · Simon Lacoste-Julien -
2019 Poster: Limitations of the empirical Fisher approximation for natural gradient descent »
Frederik Kunstner · Philipp Hennig · Lukas Balles -
2016 : Fast Patch-based Style Transfer of Arbitrary Style »
Tian Qi Chen · Mark Schmidt -
2015 Poster: Stop Wasting My Gradients: Practical SVRG »
Reza Babanezhad Harikandeh · Mohamed Osama Ahmed · Alim Virani · Mark Schmidt · Jakub Konečný · Scott Sallinen