Despite their high computation and communication costs, Newton-type methods remain an appealing option for distributed training due to their robustness against ill-conditioned convex problems. In this work, we study {\em communication compression} and {\em aggregation mechanisms} for curvature information in order to reduce these costs while preserving theoretically superior local convergence guarantees. We prove that the recently developed class of {\em three point compressors (3PC)} of Richtárik et al. [2022] for gradient communication can be generalized to Hessian communication as well. This result opens up a wide variety of communication strategies, such as {\em contractive compression} and {\em lazy aggregation}, for compressing prohibitively costly curvature information. Moreover, we discover several new 3PC mechanisms, such as {\em adaptive thresholding} and {\em Bernoulli aggregation}, which require reduced communication and only occasional Hessian computations. Furthermore, we extend and analyze our approach to cover bidirectional communication compression and partial device participation, catering to the practical considerations of federated learning applications. For all our methods, we derive fast {\em condition-number-independent} local linear and/or superlinear convergence rates. Finally, with extensive numerical evaluations on convex optimization problems, we illustrate that our designed schemes achieve state-of-the-art communication complexity compared to several key baselines using second-order information.
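The two communication strategies named in the abstract can be illustrated with a short sketch. The Python snippet below is a minimal illustration, not the paper's implementation: it shows an EF21-style contractive-compression update (with Top-K as the compressor) and a LAG-style lazy-aggregation rule, both applied to a device's Hessian matrix. The function names and the parameters k and zeta are illustrative choices, not quantities from the paper.

    import numpy as np

    def top_k(M, k):
        # Contractive compressor: keep the k largest-magnitude entries of M.
        flat = M.ravel()
        out = np.zeros_like(flat)
        idx = np.argpartition(np.abs(flat), -k)[-k:]
        out[idx] = flat[idx]
        return out.reshape(M.shape)

    def ef21_hessian_update(H_est, H_new, k):
        # EF21-style 3PC step: only the compressed residual is transmitted;
        # server and device apply the same update, keeping their estimates in sync.
        message = top_k(H_new - H_est, k)
        return H_est + message

    def lazy_hessian_update(H_est, H_new, zeta):
        # LAG-style 3PC step: skip communication when the Hessian has barely
        # changed since the last transmitted estimate; otherwise send it anew.
        if np.linalg.norm(H_new - H_est) <= zeta * np.linalg.norm(H_est):
            return H_est  # nothing is sent this round
        return H_new      # full Hessian transmitted

The shared structure, an update rule mapping the previous estimate and the fresh Hessian to a new synchronized estimate, is what allows the 3PC framework to cover both aggressive compression and communication skipping under a single analysis.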
Author Information
Rustem Islamov (Institut Polytechnique de Paris)
Xun Qian (KAUST)
Slavomír Hanzely (KAUST)
Mher Safaryan (KAUST)
Peter Richtarik (KAUST)
More from the Same Authors
- 2021: Better Linear Rates for SGD with Data Shuffling
  Grigory Malinovsky · Alibek Sailanbayev · Peter Richtarik
- 2021: Shifted Compression Framework: Generalizations and Improvements
  Egor Shulgin · Peter Richtarik
- 2021: EF21 with Bells & Whistles: Practical Algorithmic Extensions of Modern Error Feedback
  Peter Richtarik · Igor Sokolov · Ilyas Fatkhullin · Eduard Gorbunov · Zhize Li
- 2021: On Server-Side Stepsizes in Federated Optimization: Theory Explaining the Heuristics
  Grigory Malinovsky · Konstantin Mishchenko · Peter Richtarik
- 2021: FedMix: A Simple and Communication-Efficient Alternative to Local Methods in Federated Learning
  Elnur Gasanov · Ahmed Khaled Ragab Bayoumi · Samuel Horváth · Peter Richtarik
- 2022 Poster: Theoretically Better and Numerically Faster Distributed Optimization with Smoothness-Aware Quantization Techniques
  Bokun Wang · Mher Safaryan · Peter Richtarik
- 2022: RandProx: Primal-Dual Optimization Algorithms with Randomized Proximal Updates
  Laurent Condat · Peter Richtarik
- 2022: Certified Robustness in Federated Learning
  Motasem Alfarra · Juan Perez · Egor Shulgin · Peter Richtarik · Bernard Ghanem
- 2022 Spotlight: Accelerated Primal-Dual Gradient Method for Smooth and Convex-Concave Saddle-Point Problems with Bilinear Coupling
  Dmitry Kovalev · Alexander Gasnikov · Peter Richtarik
- 2022 Spotlight: Communication Acceleration of Local Gradient Methods via an Accelerated Primal-Dual Algorithm with an Inexact Prox
  Abdurakhmon Sadiev · Dmitry Kovalev · Peter Richtarik
- 2022 Spotlight: Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees
  Aleksandr Beznosikov · Peter Richtarik · Michael Diskin · Max Ryabinin · Alexander Gasnikov
- 2022 Spotlight: Optimal Algorithms for Decentralized Stochastic Variational Inequalities
  Dmitry Kovalev · Aleksandr Beznosikov · Abdurakhmon Sadiev · Michael Persiianov · Peter Richtarik · Alexander Gasnikov
- 2022 Spotlight: Lightning Talks 4A-1
  Jiawei Huang · Su Jia · Abdurakhmon Sadiev · Ruomin Huang · Yuanyu Wan · Denizalp Goktas · Jiechao Guan · Andrew Li · Wei-Wei Tu · Li Zhao · Amy Greenwald · Jiawei Huang · Dmitry Kovalev · Yong Liu · Wenjie Liu · Peter Richtarik · Lijun Zhang · Zhiwu Lu · R Ravi · Tao Qin · Wei Chen · Hu Ding · Nan Jiang · Tie-Yan Liu
- 2022 Workshop: Federated Learning: Recent Advances and New Challenges
  Shiqiang Wang · Nathalie Baracaldo · Olivia Choudhury · Gauri Joshi · Peter Richtarik · Praneeth Vepakomma · Han Yu
- 2022 Poster: Variance Reduced ProxSkip: Algorithm, Theory and Application to Federated Learning
  Grigory Malinovsky · Kai Yi · Peter Richtarik
- 2022 Poster: Communication Acceleration of Local Gradient Methods via an Accelerated Primal-Dual Algorithm with an Inexact Prox
  Abdurakhmon Sadiev · Dmitry Kovalev · Peter Richtarik
- 2022 Poster: A Damped Newton Method Achieves Global $\mathcal O \left(\frac{1}{k^2}\right)$ and Local Quadratic Convergence Rate
  Slavomír Hanzely · Dmitry Kamzolov · Dmitry Pasechnyuk · Alexander Gasnikov · Peter Richtarik · Martin Takac
- 2022 Poster: BEER: Fast $O(1/T)$ Rate for Decentralized Nonconvex Optimization with Communication Compression
  Haoyu Zhao · Boyue Li · Zhize Li · Peter Richtarik · Yuejie Chi
- 2022 Poster: EF-BV: A Unified Theory of Error Feedback and Variance Reduction Mechanisms for Biased and Unbiased Compression in Distributed Optimization
  Laurent Condat · Kai Yi · Peter Richtarik
- 2022 Poster: Optimal Algorithms for Decentralized Stochastic Variational Inequalities
  Dmitry Kovalev · Aleksandr Beznosikov · Abdurakhmon Sadiev · Michael Persiianov · Peter Richtarik · Alexander Gasnikov
- 2022 Poster: Accelerated Primal-Dual Gradient Method for Smooth and Convex-Concave Saddle-Point Problems with Bilinear Coupling
  Dmitry Kovalev · Alexander Gasnikov · Peter Richtarik
- 2022 Poster: Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees
  Aleksandr Beznosikov · Peter Richtarik · Michael Diskin · Max Ryabinin · Alexander Gasnikov
- 2021: Q&A with Professor Peter Richtarik
  Peter Richtarik
- 2021: Keynote Talk: Permutation Compressors for Provably Faster Distributed Nonconvex Optimization (Peter Richtarik)
  Peter Richtarik
- 2021 Poster: Smoothness Matrices Beat Smoothness Constants: Better Communication Compression Techniques for Distributed Optimization
  Mher Safaryan · Filip Hanzely · Peter Richtarik
- 2021 Poster: EF21: A New, Simpler, Theoretically Better, and Practically Faster Error Feedback
  Peter Richtarik · Igor Sokolov · Ilyas Fatkhullin
- 2021 Poster: Error Compensated Distributed SGD Can Be Accelerated
  Xun Qian · Peter Richtarik · Tong Zhang
- 2021 Poster: CANITA: Faster Rates for Distributed Convex Optimization with Communication Compression
  Zhize Li · Peter Richtarik
- 2021 Poster: Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex Decentralized Optimization Over Time-Varying Networks
  Dmitry Kovalev · Elnur Gasanov · Alexander Gasnikov · Peter Richtarik
- 2021 Oral: EF21: A New, Simpler, Theoretically Better, and Practically Faster Error Feedback
  Peter Richtarik · Igor Sokolov · Ilyas Fatkhullin
- 2020: Poster Session 1 (gather.town)
  Laurent Condat · Tiffany Vlaar · Ohad Shamir · Mohammadi Zaki · Zhize Li · Guan-Horng Liu · Samuel Horváth · Mher Safaryan · Yoni Choukroun · Kumar Shridhar · Nabil Kahale · Jikai Jin · Pratik Kumar Jawanpuria · Gaurav Kumar Yadav · Kazuki Koyama · Junyoung Kim · Xiao Li · Saugata Purkayastha · Adil Salim · Dighanchal Banerjee · Peter Richtarik · Lakshman Mahto · Tian Ye · Bamdev Mishra · Huikang Liu · Jiajie Zhu
- 2020 Poster: Primal Dual Interpretation of the Proximal Stochastic Gradient Langevin Algorithm
  Adil Salim · Peter Richtarik
- 2020 Poster: Linearly Converging Error Compensated SGD
  Eduard Gorbunov · Dmitry Kovalev · Dmitry Makarenko · Peter Richtarik
- 2020 Poster: Random Reshuffling: Simple Analysis with Vast Improvements
  Konstantin Mishchenko · Ahmed Khaled Ragab Bayoumi · Peter Richtarik
- 2020 Spotlight: Linearly Converging Error Compensated SGD
  Eduard Gorbunov · Dmitry Kovalev · Dmitry Makarenko · Peter Richtarik
- 2020 Session: Orals & Spotlights Track 21: Optimization
  Peter Richtarik · Marco Cuturi
- 2020 Poster: Lower Bounds and Optimal Algorithms for Personalized Federated Learning
  Filip Hanzely · Slavomír Hanzely · Samuel Horváth · Peter Richtarik
- 2020 Poster: Optimal and Practical Algorithms for Smooth and Strongly Convex Decentralized Optimization
  Dmitry Kovalev · Adil Salim · Peter Richtarik
- 2019 Poster: RSN: Randomized Subspace Newton
  Robert Gower · Dmitry Kovalev · Felix Lieder · Peter Richtarik
- 2019 Poster: Stochastic Proximal Langevin Algorithm: Potential Splitting and Nonasymptotic Rates
  Adil Salim · Dmitry Kovalev · Peter Richtarik
- 2019 Spotlight: Stochastic Proximal Langevin Algorithm: Potential Splitting and Nonasymptotic Rates
  Adil Salim · Dmitry Kovalev · Peter Richtarik
- 2018 Poster: Stochastic Spectral and Conjugate Descent Methods
  Dmitry Kovalev · Peter Richtarik · Eduard Gorbunov · Elnur Gasanov
- 2018 Poster: Accelerated Stochastic Matrix Inversion: General Theory and Speeding up BFGS Rules for Faster Second-Order Optimization
  Robert Gower · Filip Hanzely · Peter Richtarik · Sebastian Stich
- 2018 Poster: SEGA: Variance Reduction via Gradient Sketching
  Filip Hanzely · Konstantin Mishchenko · Peter Richtarik
- 2015 Poster: Quartz: Randomized Dual Coordinate Ascent with Arbitrary Sampling
  Zheng Qu · Peter Richtarik · Tong Zhang