Timezone: »
Recently the SP (Stochastic Polyak step size) method has emerged as a competitive adaptive method for setting the step sizes of SGD. SP can be interpreted as a method specialized to interpolated models, since it solves the \emph{interpolation equations}. SP solves these equation by using local linearizations of the model. We take a step further and develop a method for solving the interpolation equations that uses the local second-order approximation of the model. Our resulting method SP2 uses Hessian-vector products to speed-up the convergence of SP. Furthermore, and rather uniquely among second-order methods, the design of SP2 in no way relies on positive definite Hessian matrices or convexity of the objective function. We show SP2 is very competitive on matrix completion, non-convex test problems and logistic regression. We also provide a convergence theory on sums-of-quadratics.
Author Information
Shuang Li (University of California, Los Angeles)
William Swartworth (University of California Los Angeles)
Martin Takac (Mohamed bin Zayed University of Artificial Intelligence (MBZUAI))
Deanna Needell (UCLA)
Robert Gower (Flatiron Institute)
More from the Same Authors
-
2021 : Random-reshuffled SARAH does not need a full gradient computations »
Aleksandr Beznosikov · Martin Takac -
2021 : Stochastic Polyak Stepsize with a Moving Target »
Robert Gower · Aaron Defazio · Mike Rabbat -
2022 : A Stochastic Prox-Linear Method for CVaR Minimization »
Si Yi Meng · Vasileios Charisopoulos · Robert Gower -
2022 : Effects of momentum scaling for SGD »
Dmitry A. Pasechnyuk · Alexander Gasnikov · Martin Takac -
2022 : FLECS-CGD: A Federated Learning Second-Order Framework via Compression and Sketching with Compressed Gradient Differences »
Artem Agafonov · Brahim Erraji · Martin Takac -
2022 : Cubic Regularized Quasi-Newton Methods »
Dmitry Kamzolov · Klea Ziu · Artem Agafonov · Martin Takac -
2022 : PSPS: Preconditioned Stochastic Polyak Step-size method for badly scaled data »
Farshed Abdukhakimov · Chulu Xiang · Dmitry Kamzolov · Robert Gower · Martin Takac -
2022 Workshop: Order up! The Benefits of Higher-Order Optimization in Machine Learning »
Albert Berahas · Jelena Diakonikolas · Jarad Forristal · Brandon Reese · Martin Takac · Yan Xu -
2022 Poster: A Damped Newton Method Achieves Global $\mathcal O \left(\frac{1}{k^2}\right)$ and Local Quadratic Convergence Rate »
Slavomír Hanzely · Dmitry Kamzolov · Dmitry Pasechnyuk · Alexander Gasnikov · Peter Richtarik · Martin Takac -
2022 Poster: Online Nonnegative CP-dictionary Learning for Markovian Data »
Hanbaek Lyu · Christopher Strohmeier · Deanna Needell -
2021 : Poster Session 2 (gather.town) »
Wenjie Li · Akhilesh Soni · Jinwuk Seok · Jianhao Ma · Jeffery Kline · Mathieu Tuli · Miaolan Xie · Robert Gower · Quanqi Hu · Matteo Cacciola · Yuanlu Bai · Boyue Li · Wenhao Zhan · Shentong Mo · Junhyung Lyle Kim · Sajad Fathi Hafshejani · Chris Junchi Li · Zhishuai Guo · Harshvardhan Harshvardhan · Neha Wadia · Tatjana Chavdarova · Difan Zou · Zixiang Chen · Aman Gupta · Jacques Chen · Betty Shea · Benoit Dherin · Aleksandr Beznosikov -
2021 Workshop: OPT 2021: Optimization for Machine Learning »
Courtney Paquette · Quanquan Gu · Oliver Hinder · Katya Scheinberg · Sebastian Stich · Martin Takac -
2020 : Closing remarks »
Quanquan Gu · Courtney Paquette · Mark Schmidt · Sebastian Stich · Martin Takac -
2020 : Invited speaker: Online nonnegative matrix factorization for Markovian and other real data, Deanna Needell and Hanbaek Lyu »
Hanbake Lyu · Deanna Needell -
2020 : Poster Session 2 (gather.town) »
Sharan Vaswani · Nicolas Loizou · Wenjie Li · Preetum Nakkiran · Zhan Gao · Sina Baghal · Jingfeng Wu · Roozbeh Yousefzadeh · Jinyi Wang · Jing Wang · Cong Xie · Anastasia Borovykh · Stanislaw Jastrzebski · Soham Dan · Yiliang Zhang · Mark Tuddenham · Sarath Pattathil · Ievgen Redko · Jeremy Cohen · Yasaman Esfandiari · Zhanhong Jiang · Mostafa ElAraby · Chulhee Yun · Michael Psenka · Robert Gower · Xiaoyu Wang -
2020 : Live Q&A with Suvrit Sra (Zoom) »
Martin Takac -
2020 : Intro to Invited Speaker 5 »
Martin Takac -
2020 : Contributed talks in Session 2 (Zoom) »
Martin Takac · Samuel Horváth · Guan-Horng Liu · Nicolas Loizou · Sharan Vaswani -
2020 : Live Q&A with Donald Goldfarb (Zoom) »
Martin Takac -
2020 : Live Q&A with Andreas Krause (Zoom) »
Martin Takac -
2020 : Welcome remarks to Session 2 »
Martin Takac -
2020 Workshop: OPT2020: Optimization for Machine Learning »
Courtney Paquette · Mark Schmidt · Sebastian Stich · Quanquan Gu · Martin Takac -
2020 : Welcome event (gather.town) »
Quanquan Gu · Courtney Paquette · Mark Schmidt · Sebastian Stich · Martin Takac -
2019 : Poster Session »
Eduard Gorbunov · Alexandre d'Aspremont · Lingxiao Wang · Liwei Wang · Boris Ginsburg · Alessio Quaglino · Camille Castera · Saurabh Adya · Diego Granziol · Rudrajit Das · Raghu Bollapragada · Fabian Pedregosa · Martin Takac · Majid Jahani · Sai Praneeth Karimireddy · Hilal Asi · Balint Daroczy · Leonard Adolphs · Aditya Rawal · Nicolas Brandt · Minhan Li · Giuseppe Ughi · Orlando Romero · Ivan Skorokhodov · Damien Scieur · Kiwook Bae · Konstantin Mishchenko · Rohan Anil · Vatsal Sharan · Aditya Balu · Chao Chen · Zhewei Yao · Tolga Ergen · Paul Grigas · Chris Junchi Li · Jimmy Ba · Stephen J Roberts · Sharan Vaswani · Armin Eftekhari · Chhavi Sharma -
2019 : Opening Remarks »
Reinhard Heckel · Paul Hand · Alex Dimakis · Joan Bruna · Deanna Needell · Richard Baraniuk -
2019 Workshop: Solving inverse problems with deep networks: New architectures, theoretical foundations, and applications »
Reinhard Heckel · Paul Hand · Richard Baraniuk · Joan Bruna · Alex Dimakis · Deanna Needell -
2019 Poster: RSN: Randomized Subspace Newton »
Robert Gower · Dmitry Kovalev · Felix Lieder · Peter Richtarik -
2019 Poster: Towards closing the gap between the theory and practice of SVRG »
Othmane Sebbouh · Nidham Gazagnadou · Samy Jelassi · Francis Bach · Robert Gower -
2018 Poster: Reinforcement Learning for Solving the Vehicle Routing Problem »
MohammadReza Nazari · Afshin Oroojlooy · Lawrence Snyder · Martin Takac -
2016 Poster: A Multi-Batch L-BFGS Method for Machine Learning »
Albert Berahas · Jorge Nocedal · Martin Takac -
2014 Poster: Communication-Efficient Distributed Dual Coordinate Ascent »
Martin Jaggi · Virginia Smith · Martin Takac · Jonathan Terhorst · Sanjay Krishnan · Thomas Hofmann · Michael Jordan