In this work, we improve upon the stepwise analysis of noisy iterative learning algorithms initiated by Pensia, Jog, and Loh (2018) and recently extended by Bu, Zou, and Veeravalli (2019). Our main contributions are significantly improved mutual information bounds for Stochastic Gradient Langevin Dynamics via data-dependent estimates. Our approach is based on the variational characterization of mutual information and the use of data-dependent priors that forecast the mini-batch gradient based on a subset of the training samples. The approach is broadly applicable within the information-theoretic framework of Russo and Zou (2015) and Xu and Raginsky (2017), and the resulting bound can be tied to a measure of flatness of the empirical risk surface. Compared with other bounds that depend on the squared norms of gradients, empirical investigations show that the terms in our bounds are orders of magnitude smaller.
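For readers unfamiliar with the framework the abstract refers to, the two standard ingredients can be written out as follows. This is a sketch of well-known background facts, not the paper's own theorem; the symbols S, W, n, sigma, and Q are notation assumed here and do not appear in the abstract.

```latex
% Background facts only. Notation is assumed for illustration:
% S = (Z_1, \dots, Z_n) is the training sample, W the learned weights.

% Generalization bound in the style of Xu and Raginsky (2017),
% assuming the loss is \sigma-subgaussian for every fixed w:
\[
  \bigl| \mathbb{E}[\text{generalization error}] \bigr|
  \;\le\; \sqrt{\frac{2\sigma^{2}}{n}\, I(W; S)} .
\]

% Variational characterization of mutual information:
% the infimum over distributions Q on the weights is attained at Q = P_W.
\[
  I(W; S)
  \;=\; \inf_{Q}\; \mathbb{E}_{S}\bigl[ \mathrm{KL}\bigl( P_{W \mid S} \,\|\, Q \bigr) \bigr] .
\]

% Consequently, any fixed "prior" Q yields an upper bound:
\[
  I(W; S)
  \;\le\; \mathbb{E}_{S}\bigl[ \mathrm{KL}\bigl( P_{W \mid S} \,\|\, Q \bigr) \bigr]
  \quad \text{for every } Q .
\]
```

The second display is why the choice of prior matters: a prior closer to the conditional law of the weights gives a smaller upper bound. The data-dependent priors described in the abstract let the prior depend on a subset of the training samples; the exact form of the resulting bound is given in the paper and is not reproduced here.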
Author Information
Jeffrey Negrea (University of Toronto)
Mahdi Haghifam (University of Toronto)
Gintare Karolina Dziugaite (Element AI)
Ashish Khisti (University of Toronto)
Daniel Roy (University of Toronto & Vector Institute)
More from the Same Authors
-
2021 Spotlight: Towards a Unified Information-Theoretic Framework for Generalization »
Mahdi Haghifam · Gintare Karolina Dziugaite · Shay Moran · Dan Roy -
2021 : Relaxing the I.I.D. Assumption: Adaptively Minimax Optimal Regret via Root-Entropic Regularization »
Blair Bilodeau · Jeffrey Negrea · Dan Roy -
2021 : Cross-Domain Lossy Compression as Optimal Transport with an Entropy Bottleneck »
Huan Liu · George Zhang · Jun Chen · Ashish Khisti -
2021 : Stochastic Pruning: Fine-Tuning, and PAC-Bayes bound optimization »
Soufiane Hayou · Bobby He · Gintare Karolina Dziugaite -
2021 : The Dynamics of Functional Diversity throughout Neural Network Training »
Lee Zamparo · Marc-Etienne Brunet · Thomas George · Sepideh Kharaghani · Gintare Karolina Dziugaite -
2021 : Your Dataset is a Multiset and You Should Compress it Like One »
Daniel Severo · James Townsend · Ashish Khisti · Alireza Makhzani · Karen Ullrich -
2022 : Unmasking the Lottery Ticket Hypothesis: Efficient Adaptive Pruning for Finding Winning Tickets »
Mansheej Paul · Feng Chen · Brett Larsen · Jonathan Frankle · Surya Ganguli · Gintare Karolina Dziugaite -
2022 : The Effect of Data Dimensionality on Neural Network Prunability »
Zachary Ankner · Alex Renda · Gintare Karolina Dziugaite · Jonathan Frankle · Tian Jin -
2022 Poster: Lottery Tickets on a Data Diet: Finding Initializations with Sparse Trainable Networks »
Mansheej Paul · Brett Larsen · Surya Ganguli · Jonathan Frankle · Gintare Karolina Dziugaite -
2022 Poster: Pruning’s Effect on Generalization Through the Lens of Training and Regularization »
Tian Jin · Michael Carbin · Dan Roy · Jonathan Frankle · Gintare Karolina Dziugaite -
2021 Poster: Universal Rate-Distortion-Perception Representations for Lossy Compression »
George Zhang · Jingjing Qian · Jun Chen · Ashish Khisti -
2021 Poster: The future is log-Gaussian: ResNets and their infinite-depth-and-width limit at initialization »
Mufan Li · Mihai Nica · Dan Roy -
2021 Poster: Minimax Optimal Quantile and Semi-Adversarial Regret via Root-Logarithmic Regularizers »
Jeffrey Negrea · Blair Bilodeau · Nicolò Campolongo · Francesco Orabona · Dan Roy -
2021 Poster: Deep Learning on a Data Diet: Finding Important Examples Early in Training »
Mansheej Paul · Surya Ganguli · Gintare Karolina Dziugaite -
2021 Poster: Towards a Unified Information-Theoretic Framework for Generalization »
Mahdi Haghifam · Gintare Karolina Dziugaite · Shay Moran · Dan Roy -
2021 Poster: Variational Model Inversion Attacks »
Kuan-Chieh Wang · Yan Fu · Ke Li · Ashish Khisti · Richard Zemel · Alireza Makhzani -
2020 : Keynote 5: Gintare Karolina Dziugaite »
Gintare Karolina Dziugaite -
2020 Poster: Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel »
Stanislav Fort · Gintare Karolina Dziugaite · Mansheej Paul · Sepideh Kharaghani · Daniel Roy · Surya Ganguli -
2020 Poster: Coded Sequential Matrix Multiplication For Straggler Mitigation »
Nikhil Krishnan Muralee Krishnan · Seyederfan Hosseini · Ashish Khisti -
2020 Poster: Adaptive Gradient Quantization for Data-Parallel SGD »
Fartash Faghri · Iman Tabrizian · Ilia Markov · Dan Alistarh · Daniel Roy · Ali Ramezani-Kebrya -
2020 Poster: Sharpened Generalization Bounds based on Conditional Mutual Information and an Application to Noisy, Iterative Algorithms »
Mahdi Haghifam · Jeffrey Negrea · Ashish Khisti · Daniel Roy · Gintare Karolina Dziugaite -
2020 Poster: In search of robust measures of generalization »
Gintare Karolina Dziugaite · Alexandre Drouin · Brady Neal · Nitarshan Rajkumar · Ethan Caballero · Linbo Wang · Ioannis Mitliagkas · Daniel Roy -
2019 : Lunch break & Poster session »
Breandan Considine · Michael Innes · Du Phan · Dougal Maclaurin · Robin Manhaeve · Alexey Radul · Shashi Gowda · Ekansh Sharma · Eli Sennesh · Maxim Kochurov · Gordon Plotkin · Thomas Wiecki · Navjot Kukreja · Chung-chieh Shan · Matthew Johnson · Dan Belov · Neeraj Pradhan · Wannes Meert · Angelika Kimmig · Luc De Raedt · Brian Patton · Matthew Hoffman · Rif A. Saurous · Daniel Roy · Eli Bingham · Martin Jankowiak · Colin Carroll · Junpeng Lao · Liam Paull · Martin Abadi · Angel Rojas Jimenez · JP Chen -
2019 : Lunch Break and Posters »
Xingyou Song · Elad Hoffer · Wei-Cheng Chang · Jeremy Cohen · Jyoti Islam · Yaniv Blumenfeld · Andreas Madsen · Jonathan Frankle · Sebastian Goldt · Satrajit Chatterjee · Abhishek Panigrahi · Alex Renda · Brian Bartoldson · Israel Birhane · Aristide Baratin · Niladri Chatterji · Roman Novak · Jessica Forde · YiDing Jiang · Yilun Du · Linara Adilova · Michael Kamp · Berry Weinstein · Itay Hubara · Tal Ben-Nun · Torsten Hoefler · Daniel Soudry · Hsiang-Fu Yu · Kai Zhong · Yiming Yang · Inderjit Dhillon · Jaime Carbonell · Yanqing Zhang · Dar Gilboa · Johannes Brandstetter · Alexander R Johansen · Gintare Karolina Dziugaite · Raghav Somani · Ari Morcos · Freddie Kalaitzis · Hanie Sedghi · Lechao Xiao · John Zech · Muqiao Yang · Simran Kaur · Qianli Ma · Yao-Hung Hubert Tsai · Ruslan Salakhutdinov · Sho Yaida · Zachary Lipton · Daniel Roy · Michael Carbin · Florent Krzakala · Lenka Zdeborová · Guy Gur-Ari · Ethan Dyer · Dilip Krishnan · Hossein Mobahi · Samy Bengio · Behnam Neyshabur · Praneeth Netrapalli · Kris Sankaran · Julien Cornebise · Yoshua Bengio · Vincent Michalski · Samira Ebrahimi Kahou · Md Rifat Arefin · Jiri Hron · Jaehoon Lee · Jascha Sohl-Dickstein · Samuel Schoenholz · David Schwab · Dongyu Li · Sang Keun Choe · Henning Petzka · Ashish Verma · Zhichao Lin · Cristian Sminchisescu -
2019 Workshop: Machine Learning with Guarantees »
Ben London · Gintare Karolina Dziugaite · Daniel Roy · Thorsten Joachims · Aleksander Madry · John Shawe-Taylor -
2019 Poster: Fast-rate PAC-Bayes Generalization Bounds via Shifted Rademacher Processes »
Jun Yang · Shengyang Sun · Daniel Roy -
2018 Poster: Data-dependent PAC-Bayes priors via differential privacy »
Gintare Karolina Dziugaite · Daniel Roy -
2017 : Daniel Roy - Deep Neural Networks: From Flat Minima to Numerically Nonvacuous Generalization Bounds via PAC-Bayes »
Daniel Roy -
2016 Poster: Measuring the reliability of MCMC inference with bidirectional Monte Carlo »
Roger Grosse · Siddharth Ancha · Daniel Roy -
2014 Workshop: 3rd NIPS Workshop on Probabilistic Programming »
Daniel Roy · Josh Tenenbaum · Thomas Dietterich · Stuart J Russell · Yi Wu · Ulrik R Beierholm · Alp Kucukelbir · Zenna Tavares · Yura Perov · Daniel Lee · Brian Ruttenberg · Sameer Singh · Michael Hughes · Marco Gaboardi · Alexey Radul · Vikash Mansinghka · Frank Wood · Sebastian Riedel · Prakash Panangaden -
2014 Poster: Gibbs-type Indian Buffet Processes »
Creighton Heaukulani · Daniel Roy -
2014 Poster: Mondrian Forests: Efficient Online Random Forests »
Balaji Lakshminarayanan · Daniel Roy · Yee Whye Teh -
2013 Session: Session Chair »
Daniel Roy -
2013 Session: Tutorial Session B »
Daniel Roy -
2012 Workshop: Probabilistic Programming: Foundations and Applications (2 day) »
Vikash Mansinghka · Daniel Roy · Noah Goodman -
2012 Poster: Random function priors for exchangeable graphs and arrays »
James R Lloyd · Daniel Roy · Peter Orbanz · Zoubin Ghahramani -
2011 Poster: Complexity of Inference in Latent Dirichlet Allocation »
David Sontag · Daniel Roy -
2011 Spotlight: Complexity of Inference in Latent Dirichlet Allocation »
David Sontag · Daniel Roy -
2008 Workshop: Probabilistic Programming: Universal Languages, Systems and Applications »
Daniel Roy · John Winn · David A McAllester · Vikash Mansinghka · Josh Tenenbaum -
2008 Oral: The Mondrian Process »
Daniel Roy · Yee Whye Teh -
2008 Poster: The Mondrian Process »
Daniel Roy · Yee Whye Teh -
2007 Poster: Bayesian Agglomerative Clustering with Coalescents »
Yee Whye Teh · Hal Daumé III · Daniel Roy -
2007 Oral: Bayesian Agglomerative Clustering with Coalescents »
Yee Whye Teh · Hal Daumé III · Daniel Roy -
2006 Poster: Learning annotated hierarchies from relational data »
Daniel Roy · Charles Kemp · Vikash Mansinghka · Josh Tenenbaum -
2006 Talk: Learning annotated hierarchies from relational data »
Daniel Roy · Charles Kemp · Vikash Mansinghka · Josh Tenenbaum