NeurIPS Poster Gradient-Free Methods for Deterministic and Stochastic Nonsmooth Nonconvex Optimization

Tianyi Lin · Zeyu Zheng · Michael Jordan

Hall J (level 1) #541

Keywords: [ gradient-free methods ] [ finite-time convergence guarantee ] [ smoothing ] [ Goldstein subdifferential ] [ nonsmooth nonconvex optimization ]

Abstract: Nonsmooth nonconvex optimization problems arise broadly in machine learning and business decision making, yet two core challenges impede the development of efficient solution methods with finite-time convergence guarantees: the lack of a computationally tractable optimality criterion and the lack of computationally powerful oracles. The contributions of this paper are two-fold. First, we establish the relationship between the celebrated Goldstein subdifferential (Goldstein, 1977) and uniform smoothing, thereby providing the basis and intuition for the design of gradient-free methods that guarantee finite-time convergence to a set of Goldstein stationary points. Second, we propose the gradient-free method (GFM) and stochastic GFM (SGFM) for solving a class of nonsmooth nonconvex optimization problems and prove that both of them can return a $(\delta,\epsilon)$-Goldstein stationary point of a Lipschitz function $f$ at an expected convergence rate of $O(d^{3/2}\delta^{-1}\epsilon^{-4})$, where $d$ is the problem dimension. Two-phase versions of GFM and SGFM are also proposed and proven to achieve improved large-deviation results. Finally, we demonstrate the effectiveness of 2-SGFM on training ReLU neural networks with the MNIST dataset.
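To make the link between uniform smoothing and a gradient-free update concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard smoothed surrogate $f_\delta(x) = \mathbb{E}_{u}[f(x + \delta u)]$ with $u$ uniform on the unit ball, and the usual two-point estimator $\frac{d}{2\delta}\,(f(x+\delta w) - f(x-\delta w))\,w$ with $w$ uniform on the unit sphere, which is an unbiased estimate of $\nabla f_\delta(x)$ for Lipschitz $f$. The function names, the fixed step size, and the choice to return all iterates are illustrative assumptions; the abstract does not specify them.

```python
import math
import random


def sample_unit_sphere(d, rng):
    """Draw a point uniformly from the unit sphere in R^d (normalized Gaussian)."""
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(v * v for v in g))
        if norm > 0.0:
            return [v / norm for v in g]


def two_point_estimate(f, x, delta, rng):
    """Two-point estimate of the gradient of the uniformly smoothed f at x.

    Uses only two function evaluations, f(x + delta*w) and f(x - delta*w),
    so no gradient or subgradient oracle is required.
    """
    d = len(x)
    w = sample_unit_sphere(d, rng)
    x_plus = [xi + delta * wi for xi, wi in zip(x, w)]
    x_minus = [xi - delta * wi for xi, wi in zip(x, w)]
    scale = d * (f(x_plus) - f(x_minus)) / (2.0 * delta)
    return [scale * wi for wi in w]


def gradient_free_descent(f, x0, delta, step_size, num_iters, seed=0):
    """Hypothetical driver: plain descent steps along the two-point estimate.

    Returns the full list of iterates; how the output point is selected
    (for example, a uniformly random iterate) is left to the caller.
    """
    rng = random.Random(seed)
    x = list(x0)
    iterates = [list(x)]
    for _ in range(num_iters):
        g = two_point_estimate(f, x, delta, rng)
        x = [xi - step_size * gi for xi, gi in zip(x, g)]
        iterates.append(list(x))
    return iterates


def nonsmooth_nonconvex(x):
    """Toy Lipschitz objective that is neither smooth nor convex."""
    return abs(abs(x[0]) - 1.0) + abs(x[1])


if __name__ == "__main__":
    path = gradient_free_descent(
        nonsmooth_nonconvex, x0=[3.0, 2.0], delta=0.1, step_size=0.01, num_iters=2000
    )
    print("final iterate:", path[-1])
    print("final value:", nonsmooth_nonconvex(path[-1]))
```

The smoothing radius `delta` plays the role of $\delta$ in the $(\delta,\epsilon)$-Goldstein stationarity criterion: a smaller radius tracks $f$ more closely but enlarges the $\delta^{-1}$ factor in the stated rate.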
