Order up! The Benefits of Higher-Order Optimization in Machine Learning
Albert Berahas · Jelena Diakonikolas · Jarad Forristal · Brandon Reese · Martin Takac · Yan Xu

Fri Dec 02 06:15 AM -- 03:00 PM (PST) @ Room 275 - 277
Event URL: https://order-up-ml.github.io/

Optimization is a cornerstone of nearly all modern machine learning (ML) and deep learning (DL). Simple first-order gradient-based methods dominate the field for convincing reasons: low computational cost, simplicity of implementation, and strong empirical results.

Yet second- and higher-order methods are rarely used in DL, despite having many strengths of their own: faster per-iteration convergence, frequent explicit regularization of the step size, and better parallelization than SGD. Additionally, many scientific fields use second-order optimization with great success.
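To illustrate the per-iteration convergence advantage, here is a minimal sketch (not from the workshop materials) of a Newton step, which rescales the gradient by the inverse Hessian; on a quadratic objective it reaches the minimizer in a single iteration, whereas plain gradient descent takes many:

```python
import numpy as np

# Minimize f(x) = 0.5 * x^T A x - b^T x, a toy quadratic objective.
# Its gradient is A x - b and its Hessian is the constant matrix A.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])

def grad(x):
    return A @ x - b

def hess(x):
    return A

x = np.zeros(2)
# Newton step: solve H p = -g rather than stepping along -g directly.
p = np.linalg.solve(hess(x), -grad(x))
x = x + p

x_star = np.linalg.solve(A, b)  # exact minimizer, for comparison
print(np.allclose(x, x_star))   # one Newton step suffices on a quadratic
```

On non-quadratic problems Newton's method needs several iterations (and safeguards such as line search or trust regions), but locally it converges quadratically, which is the strength the paragraph above refers to.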

A driving factor for this is the large difference in development effort. By the time higher-order methods became tractable for DL, first-order methods such as SGD and its main variants (SGD with momentum, Adam, …) already had many years of maturity and mass adoption.

The purpose of this workshop is to address this gap, to create an environment where higher-order methods are fairly considered and compared against one another, and to foster healthy discussion with the end goal of mainstream acceptance of higher-order methods in ML and DL.

Author Information

Albert Berahas (University of Michigan - Ann Arbor)
Jelena Diakonikolas (University of Wisconsin-Madison)
Jarad Forristal (University of Texas at Austin)
Brandon Reese (SAS Institute Inc.)
Martin Takac (Mohamed bin Zayed University of Artificial Intelligence (MBZUAI))
Yan Xu (SAS Institute Inc.)
