Poster
in
Workshop: OPT 2023: Optimization for Machine Learning

Greedy Newton: Newton's Method with Exact Line Search

Betty Shea · Mark Schmidt

Project Page [ Poster] [ OpenReview]

Abstract

A defining characteristic of Newton's method is local superlinear convergence. In successively smaller neighborhoods around a strict minimizer, the objective becomes increasingly quadratic, the Newton direction gets closer to optimal, and the method accelerates to the solution. Outside this neighborhood, however, Newton's method converges slowly or even diverges. The issue of non-convergence is usually dealt with by (a) modifications to the Hessian for positive definiteness, and (b) using damped Newton steps. But these approaches change the nature of Newton's method. The former obscures second-order information and changes the update direction arbitrarily. The latter, normally implemented as an Armijo-like line search with an initial stepsize of one, is inexact and could unnecessarily restrict stepsizes to lie between zero and one. In this paper, we analyze Newton's method under an exact line search, which we call "Greedy Newton" (GN). Many problem structures allow for efficient and precise line searches but, as far as we know, a superlinear convergence proof does not exist for this setting. We show that GN not only retains the local superlinear convergence rate, but could also improve the global rate. We experimentally show that GN may work better than damped Newton by allowing stepsizes to deviate significantly from one. Our experiments also suggest that GN combined with Hessian modifications is a powerful optimization method that works in both convex and non-convex settings.

Chat is not available.