NeurIPS Poster First Order Stochastic Optimization with Oblivious Noise

Poster

First Order Stochastic Optimization with Oblivious Noise

Ilias Diakonikolas · Sushrut Karmalkar · Jong Ho Park · Christos Tzamos

Great Hall & Hall B1+B2 (level 1) #1825

[ Abstract ]

[ Paper] [ Poster] [ OpenReview]

Abstract: We initiate the study of stochastic optimization with oblivious noise, broadly generalizing the standard heavy-tailed noise setup.In our setting, in addition to random observation noise, the stochastic gradient may be subject to independent \emph{oblivious noise}, which may not have bounded moments and is not necessarily centered. Specifically, we assume access to a noisy oracle for the stochastic gradient of

f

$f$ at

x

$x$ , which returns a vector

\nabla f (γ, x) + ξ

$\nabla f(\gamma, x) + \xi$ , where

γ

$\gamma$ is the bounded variance observation noise and

ξ

$\xi$ is the oblivious noise that is independent of

γ

$\gamma$ and

x

$x$ . The only assumption we make on the oblivious noise

ξ

$\xi$ is that

Pr [ξ = 0] \geq α

$\Pr[\xi = 0] \ge \alpha$ , for some

α \in (0, 1)

$\alpha \in (0, 1)$ .In this setting, it is not information-theoretically possible to recover a single solution close to the target when the fraction of inliers

α

$\alpha$ is less than

1 / 2

$1/2$ . Our main result is an efficient {\em list-decodable} learner that recovers a small list of candidates at least one of which is close to the true solution. On the other hand, if

α = 1 - ϵ

$\alpha = 1-\epsilon$ , where

0 < ϵ < 1 / 2

$0< \epsilon < 1/2$ is sufficiently smallconstant, the algorithm recovers a single solution.Along the way, we develop a rejection-sampling-based algorithm to perform noisy location estimation, which may be of independent interest.

Chat is not available.