Workshop
Beyond first order methods in machine learning systems
Anastasios Kyrillidis · Albert Berahas · Fred Roosta · Michael Mahoney
Optimization lies at the heart of many exciting developments in machine learning, statistics, and signal processing. As models become more complex and datasets grow larger, developing efficient, reliable methods with provable guarantees is one of the primary goals in these fields.
In the last few decades, much effort has been devoted to the development of first-order methods. These methods enjoy a low per-iteration cost, achieve optimal complexity, are easy to implement, and have proven effective for most machine learning applications. First-order methods, however, have significant limitations: (1) they require fine hyper-parameter tuning, (2) they do not incorporate curvature information, and thus are sensitive to ill-conditioning, and (3) they are often unable to fully exploit the power of distributed computing architectures.
Higher-order methods, such as Newton, quasi-Newton, and adaptive gradient descent methods, are extensively used in many scientific and engineering domains. At least in theory, these methods possess several nice features: they exploit local curvature information to mitigate the effects of ill-conditioning, they avoid or diminish the need for hyper-parameter tuning, and they have enough concurrency to take advantage of distributed computing environments. Researchers have even developed stochastic versions of higher-order methods, which achieve speed and scalability by incorporating curvature information in an economical and judicious manner. Nevertheless, higher-order methods are often "undervalued."
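The effect of curvature information on ill-conditioning can be seen on a toy quadratic. The sketch below (an illustration, not drawn from any workshop material) compares plain gradient descent with a Newton step on f(x) = ½ xᵀAx: gradient descent's step size is capped by the largest eigenvalue of A, so progress along flat directions is slow, while the Newton step rescales by the inverse Hessian and lands on the minimizer immediately.

```python
import numpy as np

# Ill-conditioned quadratic: f(x) = 0.5 * x^T A x, minimizer x* = 0.
# The ratio of largest to smallest eigenvalue (here 1000) is the
# condition number that governs gradient descent's convergence rate.
A = np.diag([1.0, 1000.0])

def gradient_descent(x0, steps, lr):
    x = x0.copy()
    for _ in range(steps):
        x -= lr * (A @ x)        # gradient of f is A x
    return x

def newton(x0, steps):
    x = x0.copy()
    A_inv = np.linalg.inv(A)     # Hessian of f is the constant matrix A
    for _ in range(steps):
        x -= A_inv @ (A @ x)     # Newton step: rescale gradient by curvature
    return x

x0 = np.array([1.0, 1.0])
lr = 1.0 / 1000.0                # step size limited by the largest eigenvalue
x_gd = gradient_descent(x0, 50, lr)   # still far from 0 along the flat direction
x_nt = newton(x0, 1)                  # exact minimizer of a quadratic in one step
```

After 50 gradient steps the flat coordinate has only shrunk by a factor of 0.999 per step, while a single Newton step reaches the minimizer exactly; this is the ill-conditioning gap that second-order and adaptive methods aim to close at scale.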
This workshop will attempt to shed light on this statement. Topics of interest include, but are not limited to, second-order methods, adaptive gradient descent methods, regularization techniques, and techniques based on higher-order derivatives.
Schedule
| Time (Fri) | Session | Speakers |
| --- | --- | --- |
| 8:00 a.m. - 8:30 a.m. | Opening remarks | Anastasios Kyrillidis · Albert Berahas · Fred Roosta · Michael Mahoney |
| 8:30 a.m. - 9:15 a.m. | Plenary talk: Economical use of second-order information in training machine learning models | Donald Goldfarb |
| 9:00 a.m. - 9:45 a.m. | Spotlight talks from paper submissions | Diego Granziol · Fabian Pedregosa · Hilal Asi |
| 9:45 a.m. - 10:30 a.m. | Poster session (40 presenters) | Eduard Gorbunov · Alexandre d'Aspremont · Lingxiao Wang · Liwei Wang · Boris Ginsburg · Alessio Quaglino · Camille Castera · Saurabh Adya · Diego Granziol · Rudrajit Das · Raghu Bollapragada · Fabian Pedregosa · Martin Takac · Majid Jahani · Sai Praneeth Karimireddy · Hilal Asi · Balint Daroczy · Leonard Adolphs · Aditya Rawal · Nicolas Brandt · Minhan Li · Giuseppe Ughi · Orlando Romero · Ivan Skorokhodov · Damien Scieur · Kiwook Bae · Konstantin Mishchenko · Rohan Anil · Vatsal Sharan · Aditya Balu · Chao Chen · Zhewei Yao · Tolga Ergen · Paul Grigas · Chris Junchi Li · Jimmy Ba · Stephen J Roberts · Sharan Vaswani · Armin Eftekhari · Chhavi Sharma |
| 10:30 a.m. - 11:15 a.m. | Plenary talk: Adaptive gradient methods: efficient implementation and generalization | Elad Hazan |
| 11:15 a.m. - 12:00 p.m. | Spotlight talks from paper submissions | Damien Scieur · Konstantin Mishchenko · Rohan Anil |
| 12:00 p.m. - 2:00 p.m. | Lunch break | |
| 2:00 p.m. - 2:45 p.m. | Plenary talk: K-FAC: Extensions, improvements, and applications | James Martens |
| 2:45 p.m. - 3:30 p.m. | Spotlight talks from paper submissions | Paul Grigas · Zhewei Yao · Aurelien Lucchi · Si Yi Meng |
| 3:30 p.m. - 4:15 p.m. | Poster session (same as above) | |
| 4:15 p.m. - 5:00 p.m. | Plenary talk: Analysis of line-search methods for various gradient approximation schemes for noisy derivative-free optimization | Katya Scheinberg |
| 5:00 p.m. - 5:45 p.m. | Plenary talk: Second-order methods for nonconvex optimization with complexity guarantees | Stephen Wright |
| 5:45 p.m. - 6:00 p.m. | Final remarks | Anastasios Kyrillidis · Albert Berahas · Fred Roosta · Michael Mahoney |