Timezone: »
PSPS: Preconditioned Stochastic Polyak Step-size method for badly scaled data
Farshed Abdukhakimov · Chulu Xiang · Dmitry Kamzolov · Robert Gower · Martin Takac
The family of Stochastic Gradient Methods with Polyak Step-size offers an update rule that alleviates the need of fine-tuning the learning rate of an optimizer. Recent work (Robert M Gower, Mathieu Blondel, Nidham Gazagnadou, and Fabian Pedregosa: Cutting some slack for SGD with adaptive polyak stepsizes) has been proposed to introduce a slack variable, which makes these methods applicable outside of the interpolation regime. In this paper, we combine preconditioning and slack in an updated optimization algorithm to show its performance on badly scaled and/or ill-conditioned datasets. We use Hutchinson's method to obtain an estimate of a Hessian which is used as the preconditioner.
Author Information
Farshed Abdukhakimov (MBZUAI)
Chulu Xiang (MBZUAI)
Dmitry Kamzolov (Mohamed bin Zayed University of Artificial Intelligence)
Robert Gower (Flatiron Institute)
Martin Takac (Mohamed bin Zayed University of Artificial Intelligence (MBZUAI))
More from the Same Authors
-
2021 : Random-reshuffled SARAH does not need a full gradient computations »
Aleksandr Beznosikov · Martin Takac -
2021 : Stochastic Polyak Stepsize with a Moving Target »
Robert Gower · Aaron Defazio · Mike Rabbat -
2022 : A Stochastic Prox-Linear Method for CVaR Minimization »
Si Yi Meng · Vasileios Charisopoulos · Robert Gower -
2022 : Effects of momentum scaling for SGD »
Dmitry A. Pasechnyuk · Alexander Gasnikov · Martin Takac -
2022 : Using quadratic equations for overparametrized models »
Shuang Li · William Swartworth · Martin Takac · Deanna Needell · Robert Gower -
2022 : FLECS-CGD: A Federated Learning Second-Order Framework via Compression and Sketching with Compressed Gradient Differences »
Artem Agafonov · Brahim Erraji · Martin Takac -
2022 : Cubic Regularized Quasi-Newton Methods »
Dmitry Kamzolov · Klea Ziu · Artem Agafonov · Martin Takac -
2023 Poster: Provable convergence guarantees for black-box variational inference »
Justin Domke · Robert Gower · Guillaume Garrigos -
2023 Poster: Similarity, Compression and Local Steps: Three Pillars of Efficient Communications for Distributed Variational Inequalities »
Aleksandr Beznosikov · Martin Takac · Alexander Gasnikov -
2023 Poster: Byzantine-Tolerant Methods for Distributed Variational Inequalities »
Nazarii Tupitsa · Eduard Gorbunov · Abdulla Jasem Almansoori · Yanlin Wu · Martin Takac · Karthik Nandakumar · Samuel Horváth -
2023 Poster: Variational Inference with Gaussian Score Matching »
Chirag Modi · Robert Gower · Charles Margossian · Yuling Yao · David Blei · Lawrence Saul -
2022 : PSPS: Preconditioned Stochastic Polyak Step-size method for badly scaled data »
Farshed Abdukhakimov -
2022 Workshop: Order up! The Benefits of Higher-Order Optimization in Machine Learning »
Albert Berahas · Jelena Diakonikolas · Jarad Forristal · Brandon Reese · Martin Takac · Yan Xu -
2022 Poster: A Damped Newton Method Achieves Global $\mathcal O \left(\frac{1}{k^2}\right)$ and Local Quadratic Convergence Rate »
Slavomír Hanzely · Dmitry Kamzolov · Dmitry Pasechnyuk · Alexander Gasnikov · Peter Richtarik · Martin Takac -
2021 : Poster Session 2 (gather.town) »
Wenjie Li · Akhilesh Soni · Jinwuk Seok · Jianhao Ma · Jeffery Kline · Mathieu Tuli · Miaolan Xie · Robert Gower · Quanqi Hu · Matteo Cacciola · Yuanlu Bai · Boyue Li · Wenhao Zhan · Shentong Mo · Junhyung Lyle Kim · Sajad Fathi Hafshejani · Chris Junchi Li · Zhishuai Guo · Harshvardhan Harshvardhan · Neha Wadia · Tatjana Chavdarova · Difan Zou · Zixiang Chen · Aman Gupta · Jacques Chen · Betty Shea · Benoit Dherin · Aleksandr Beznosikov -
2021 Workshop: OPT 2021: Optimization for Machine Learning »
Courtney Paquette · Quanquan Gu · Oliver Hinder · Katya Scheinberg · Sebastian Stich · Martin Takac -
2020 : Closing remarks »
Quanquan Gu · Courtney Paquette · Mark Schmidt · Sebastian Stich · Martin Takac -
2020 : Poster Session 2 (gather.town) »
Sharan Vaswani · Nicolas Loizou · Wenjie Li · Preetum Nakkiran · Zhan Gao · Sina Baghal · Jingfeng Wu · Roozbeh Yousefzadeh · Jinyi Wang · Jing Wang · Cong Xie · Anastasia Borovykh · Stanislaw Jastrzebski · Soham Dan · Yiliang Zhang · Mark Tuddenham · Sarath Pattathil · Ievgen Redko · Jeremy Cohen · Yasaman Esfandiari · Zhanhong Jiang · Mostafa ElAraby · Chulhee Yun · Michael Psenka · Robert Gower · Xiaoyu Wang -
2020 : Live Q&A with Suvrit Sra (Zoom) »
Martin Takac -
2020 : Intro to Invited Speaker 5 »
Martin Takac -
2020 : Contributed talks in Session 2 (Zoom) »
Martin Takac · Samuel Horváth · Guan-Horng Liu · Nicolas Loizou · Sharan Vaswani -
2020 : Live Q&A with Donald Goldfarb (Zoom) »
Martin Takac -
2020 : Live Q&A with Andreas Krause (Zoom) »
Martin Takac -
2020 : Welcome remarks to Session 2 »
Martin Takac -
2020 Workshop: OPT2020: Optimization for Machine Learning »
Courtney Paquette · Mark Schmidt · Sebastian Stich · Quanquan Gu · Martin Takac -
2020 : Welcome event (gather.town) »
Quanquan Gu · Courtney Paquette · Mark Schmidt · Sebastian Stich · Martin Takac -
2019 : Poster Session »
Eduard Gorbunov · Alexandre d'Aspremont · Lingxiao Wang · Liwei Wang · Boris Ginsburg · Alessio Quaglino · Camille Castera · Saurabh Adya · Diego Granziol · Rudrajit Das · Raghu Bollapragada · Fabian Pedregosa · Martin Takac · Majid Jahani · Sai Praneeth Karimireddy · Hilal Asi · Balint Daroczy · Leonard Adolphs · Aditya Rawal · Nicolas Brandt · Minhan Li · Giuseppe Ughi · Orlando Romero · Ivan Skorokhodov · Damien Scieur · Kiwook Bae · Konstantin Mishchenko · Rohan Anil · Vatsal Sharan · Aditya Balu · Chao Chen · Zhewei Yao · Tolga Ergen · Paul Grigas · Chris Junchi Li · Jimmy Ba · Stephen J Roberts · Sharan Vaswani · Armin Eftekhari · Chhavi Sharma -
2019 Poster: RSN: Randomized Subspace Newton »
Robert Gower · Dmitry Kovalev · Felix Lieder · Peter Richtarik -
2019 Poster: Towards closing the gap between the theory and practice of SVRG »
Othmane Sebbouh · Nidham Gazagnadou · Samy Jelassi · Francis Bach · Robert Gower -
2018 Poster: Reinforcement Learning for Solving the Vehicle Routing Problem »
MohammadReza Nazari · Afshin Oroojlooy · Lawrence Snyder · Martin Takac -
2016 Poster: A Multi-Batch L-BFGS Method for Machine Learning »
Albert Berahas · Jorge Nocedal · Martin Takac -
2014 Poster: Communication-Efficient Distributed Dual Coordinate Ascent »
Martin Jaggi · Virginia Smith · Martin Takac · Jonathan Terhorst · Sanjay Krishnan · Thomas Hofmann · Michael Jordan