A Stochastic Newton Algorithm for Distributed Convex Optimization
We propose and analyze a stochastic Newton algorithm for homogeneous distributed stochastic convex optimization, where each machine can compute stochastic gradients of the same population objective, as well as stochastic Hessian-vector products (products of an independent unbiased estimator of the Hessian of the population objective with arbitrary vectors), with many such stochastic computations performed between rounds of communication. We show that our method can reduce the number and frequency of required communication rounds compared to existing methods, without hurting performance, by proving convergence guarantees for quasi-self-concordant objectives (e.g., logistic regression), alongside empirical evidence.
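To make the oracle model concrete, here is a minimal sketch of one such stochastic Newton step for logistic regression, in which the Newton direction is obtained by conjugate gradient (CG) using only stochastic Hessian-vector products on fresh minibatches. This is an illustration under stated assumptions, not the paper's exact procedure: the function names, the {0,1} label encoding, the minibatch oracle, and the plain CG inner solver are all assumptions introduced here.

```python
# Sketch only: a stochastic Newton step via Hessian-vector products,
# assuming logistic regression with labels y in {0,1}. Not the paper's
# exact algorithm; all names and the plain CG solver are illustrative.
import numpy as np

def stochastic_grad(w, X, y, batch):
    """Stochastic gradient of the logistic loss on a sampled minibatch."""
    Xb, yb = X[batch], y[batch]
    p = 1.0 / (1.0 + np.exp(-Xb @ w))      # predicted probabilities
    return Xb.T @ (p - yb) / len(batch)

def stochastic_hvp(w, v, X, batch):
    """Stochastic Hessian-vector product H(w) @ v on an independent minibatch."""
    Xb = X[batch]
    p = 1.0 / (1.0 + np.exp(-Xb @ w))
    d = p * (1.0 - p)                      # logistic curvature weights
    return Xb.T @ (d * (Xb @ v)) / len(batch)

def newton_step(w, X, y, rng, batch_size=256, cg_iters=20, tol=1e-10):
    """Approximate the Newton direction d solving H(w) d = -g(w) by CG,
    drawing a fresh minibatch for every Hessian-vector product, then
    return the updated iterate."""
    n = len(y)
    g = stochastic_grad(w, X, y, rng.choice(n, batch_size, replace=False))
    d = np.zeros_like(w)
    r = -g                                 # residual of H d = -g at d = 0
    p = r.copy()
    rs = r @ r
    for _ in range(cg_iters):
        Hp = stochastic_hvp(w, p, X, rng.choice(n, batch_size, replace=False))
        alpha = rs / (p @ Hp)
        d += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        if rs_new < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return w + d
```

Under these assumptions, each machine would run many such steps locally (e.g., `w = newton_step(w, X, y, np.random.default_rng(0))` in a loop) between communication rounds; the essential property is that the Hessian of the objective is accessed only through unbiased Hessian-vector products computed on independent samples.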
Author Information
Brian Bullins (Princeton University)
Kumar Kshitij Patel (Toyota Technological Institute at Chicago)
Ohad Shamir (Weizmann Institute of Science)
Nathan Srebro (Toyota Technological Institute at Chicago)
Blake Woodworth (Inria)
More from the Same Authors
- 2021 Spotlight: Random Shuffling Beats SGD Only After Many Epochs on Ill-Conditioned Problems
  Itay Safran · Ohad Shamir
- 2021 Spotlight: On the Power of Differentiable Learning versus PAC and SQ Learning
  Emmanuel Abbe · Pritish Kamath · Eran Malach · Colin Sandon · Nathan Srebro
- 2021: Exponential Family Model-Based Reinforcement Learning via Score Matching
  Gene Li · Junbo Li · Nathan Srebro · Zhaoran Wang · Zhuoran Yang
- 2022: On the Complexity of Finding Small Subgradients in Nonsmooth Optimization
  Guy Kornowski · Ohad Shamir
- 2023: An Algorithm with Optimal Dimension-Dependence for Zero-Order Nonsmooth Nonconvex Stochastic Optimization
  Guy Kornowski · Ohad Shamir
- 2023 Poster: Initialization-Dependent Sample Complexity of Linear Predictors and Neural Networks
  Roey Magen · Ohad Shamir
- 2023 Poster: From Tempered to Benign Overfitting in ReLU Neural Networks
  Guy Kornowski · Gilad Yehudai · Ohad Shamir
- 2023 Poster: Accelerated Zeroth-order Method for Non-Smooth Stochastic Convex Optimization Problem with Infinite Variance
  Nikita Kornilov · Ohad Shamir · Aleksandr Lobanov · Darina Dvinskikh · Alexander Gasnikov · Innokentiy Shibaev · Eduard Gorbunov · Samuel Horváth
- 2022 Poster: On Margin Maximization in Linear and ReLU Networks
  Gal Vardi · Ohad Shamir · Nati Srebro
- 2022 Poster: The Sample Complexity of One-Hidden-Layer Neural Networks
  Gal Vardi · Ohad Shamir · Nati Srebro
- 2022 Poster: Reconstructing Training Data From Trained Neural Networks
  Niv Haim · Gal Vardi · Gilad Yehudai · Ohad Shamir · Michal Irani
- 2022 Poster: Gradient Methods Provably Converge to Non-Robust Networks
  Gal Vardi · Gilad Yehudai · Ohad Shamir
- 2021 Poster: On the Power of Differentiable Learning versus PAC and SQ Learning
  Emmanuel Abbe · Pritish Kamath · Eran Malach · Colin Sandon · Nathan Srebro
- 2021 Oral: Uniform Convergence of Interpolators: Gaussian Width, Norm Bounds and Benign Overfitting
  Frederic Koehler · Lijia Zhou · Danica J. Sutherland · Nathan Srebro
- 2021 Poster: Uniform Convergence of Interpolators: Gaussian Width, Norm Bounds and Benign Overfitting
  Frederic Koehler · Lijia Zhou · Danica J. Sutherland · Nathan Srebro
- 2021 Poster: Unifying Width-Reduced Methods for Quasi-Self-Concordant Optimization
  Deeksha Adil · Brian Bullins · Sushant Sachdeva
- 2021 Poster: Representation Costs of Linear Neural Networks: Analysis and Design
  Zhen Dai · Mina Karzand · Nathan Srebro
- 2021 Poster: Learning a Single Neuron with Bias Using Gradient Descent
  Gal Vardi · Gilad Yehudai · Ohad Shamir
- 2021 Poster: Oracle Complexity in Nonsmooth Nonconvex Optimization
  Guy Kornowski · Ohad Shamir
- 2021 Poster: An Even More Optimal Stochastic Optimization Algorithm: Minibatching and Interpolation Learning
  Blake Woodworth · Nathan Srebro
- 2021 Oral: Oracle Complexity in Nonsmooth Nonconvex Optimization
  Guy Kornowski · Ohad Shamir
- 2021 Poster: Random Shuffling Beats SGD Only After Many Epochs on Ill-Conditioned Problems
  Itay Safran · Ohad Shamir
- 2020: Poster Session 1 (gather.town)
  Laurent Condat · Tiffany Vlaar · Ohad Shamir · Mohammadi Zaki · Zhize Li · Guan-Horng Liu · Samuel Horváth · Mher Safaryan · Yoni Choukroun · Kumar Shridhar · Nabil Kahale · Jikai Jin · Pratik Kumar Jawanpuria · Gaurav Kumar Yadav · Kazuki Koyama · Junyoung Kim · Xiao Li · Saugata Purkayastha · Adil Salim · Dighanchal Banerjee · Peter Richtarik · Lakshman Mahto · Tian Ye · Bamdev Mishra · Huikang Liu · Jiajie Zhu
- 2020: Contributed talks in Session 1 (Zoom)
  Sebastian Stich · Laurent Condat · Zhize Li · Ohad Shamir · Tiffany Vlaar · Mohammadi Zaki
- 2020: Contributed Video: Can We Find Near-Approximately-Stationary Points of Nonsmooth Nonconvex Functions?
  Ohad Shamir
- 2020 Poster: Neural Networks with Small Weights and Depth-Separation Barriers
  Gal Vardi · Ohad Shamir
- 2020 Poster: Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy
  Edward Moroshko · Blake Woodworth · Suriya Gunasekar · Jason Lee · Nati Srebro · Daniel Soudry
- 2020 Poster: Minibatch vs Local SGD for Heterogeneous Distributed Learning
  Blake Woodworth · Kumar Kshitij Patel · Nati Srebro
- 2020 Spotlight: Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy
  Edward Moroshko · Blake Woodworth · Suriya Gunasekar · Jason Lee · Nati Srebro · Daniel Soudry
- 2019 Poster: On the Power and Limitations of Random Features for Understanding Neural Networks
  Gilad Yehudai · Ohad Shamir
- 2018 Poster: Graph Oracle Models, Lower Bounds, and Gaps for Parallel Stochastic Optimization
  Blake Woodworth · Jialei Wang · Adam Smith · Brendan McMahan · Nati Srebro
- 2018 Spotlight: Graph Oracle Models, Lower Bounds, and Gaps for Parallel Stochastic Optimization
  Blake Woodworth · Jialei Wang · Adam Smith · Brendan McMahan · Nati Srebro
- 2018 Poster: Are ResNets Provably Better than Linear Predictors?
  Ohad Shamir
- 2018 Poster: The Everlasting Database: Statistical Validity at a Fair Price
  Blake Woodworth · Vitaly Feldman · Saharon Rosset · Nati Srebro
- 2018 Poster: Global Non-convex Optimization with Discretized Diffusions
  Murat Erdogdu · Lester Mackey · Ohad Shamir
- 2017 Poster: Implicit Regularization in Matrix Factorization
  Suriya Gunasekar · Blake Woodworth · Srinadh Bhojanapalli · Behnam Neyshabur · Nati Srebro
- 2017 Spotlight: Implicit Regularization in Matrix Factorization
  Suriya Gunasekar · Blake Woodworth · Srinadh Bhojanapalli · Behnam Neyshabur · Nati Srebro
- 2016 Poster: Tight Complexity Bounds for Optimizing Composite Objectives
  Blake Woodworth · Nati Srebro
- 2016 Poster: Dimension-Free Iteration Complexity of Finite Sum Optimization Problems
  Yossi Arjevani · Ohad Shamir
- 2016 Poster: Without-Replacement Sampling for Stochastic Gradient Methods
  Ohad Shamir
- 2016 Oral: Without-Replacement Sampling for Stochastic Gradient Methods
  Ohad Shamir
- 2015 Poster: Communication Complexity of Distributed Convex Learning and Optimization
  Yossi Arjevani · Ohad Shamir
- 2014 Poster: Fundamental Limits of Online and Distributed Algorithms for Statistical Learning and Estimation
  Ohad Shamir
- 2014 Poster: On the Computational Efficiency of Training Neural Networks
  Roi Livni · Shai Shalev-Shwartz · Ohad Shamir
- 2013 Poster: Online Learning with Switching Costs and Other Adaptive Adversaries
  Nicolò Cesa-Bianchi · Ofer Dekel · Ohad Shamir