Skip to yearly menu bar Skip to main content


Stochastic Gradient Methods for Distributionally Robust Optimization with f-divergences

Hongseok Namkoong · John Duchi

Area 5+6+7+8 #73

Keywords: [ Stochastic Methods ] [ (Other) Optimization ] [ Convex Optimization ] [ Large Scale Learning and Big Data ] [ Bandit Algorithms ]


We develop efficient solution methods for a robust empirical risk minimization problem designed to give calibrated confidence intervals on performance and provide optimal tradeoffs between bias and variance. Our methods apply to distributionally robust optimization problems proposed by Ben-Tal et al., which put more weight on observations inducing high loss via a worst-case approach over a non-parametric uncertainty set on the underlying data distribution. Our algorithm solves the resulting minimax problems with nearly the same computational cost of stochastic gradient descent through the use of several carefully designed data structures. For a sample of size n, the per-iteration cost of our method scales as O(log n), which allows us to give optimality certificates that distributionally robust optimization provides at little extra cost compared to empirical risk minimization and stochastic gradient methods.

Live content is unavailable. Log in and register to view live content