Workshop

OPT 2016: Optimization for Machine Learning

Suvrit Sra · Francis Bach · Sashank J. Reddi · Niao He

Abstract

As the ninth in its series, OPT 2016 builds on the remarkable precedent established by a highly successful series of workshops, OPT 2008--OPT 2015, which have been instrumental in bringing the OPT and ML communities closer together.

The previous OPT workshops enjoyed packed to overpacked attendance. This huge interest is no surprise: optimization is the second-largest topic at NIPS and is indeed foundational for the wider ML community.

Looking back over the past decade, a strong trend is apparent: the intersection of OPT and ML has grown monotonically, to the point that several cutting-edge advances in optimization now arise from the ML community. The distinctive feature of optimization within ML is its departure from textbook approaches, in particular its different set of goals driven by “big data,” where both models and practical implementation are crucial.

This intimate relation between OPT and ML is the core theme of our workshop. We wish to use OPT 2016 as a platform to foster discussion, discovery, and dissemination of the state of the art in optimization as relevant to machine learning and, beyond that, to identify new directions and challenges that will drive future research.

How OPT differs from other related workshops:

Compared to the other optimization-focused workshops that we are aware of, the distinguishing features of OPT are: (a) it provides a unique bridge between the ML community and the wider optimization community; (b) it places theoretical work on an equal footing with practical efficiency; and (c) it caters to a wide body of NIPS attendees, experts and beginners alike (some OPT talks are always of a more “tutorial” nature).

Extended abstract

The OPT workshops have previously covered a variety of topics, including frameworks for convex programs (D. Bertsekas), the intersection of ML and optimization, classification (S. Wright), stochastic gradient and its tradeoffs (L. Bottou, N. Srebro), structured sparsity (L. Vandenberghe), randomized methods for convex optimization (A. Nemirovski), complexity theory of convex optimization (Y. Nesterov), distributed optimization (S. Boyd), asynchronous stochastic gradient (B. Recht), algebraic techniques (P. Parrilo), nonconvex optimization (A. Lewis), sums-of-squares techniques (J. Lasserre), deep learning tricks (Y. Bengio), stochastic convex optimization (G. Lan), and new views on interior point (E. Hazan).

Several ideas propounded in OPT have by now become important research topics in ML and optimization, especially in the areas of randomized algorithms, stochastic gradient methods, and variance-reduced stochastic gradient methods. An edited book, "Optimization for Machine Learning" (S. Sra, S. Nowozin, and S. Wright; MIT Press, 2011), grew out of the first three OPT workshops and contains high-quality contributions from many of the speakers and attendees. There have been sustained requests for the next edition of such a volume.

Much of the recent focus has been on large-scale first-order convex optimization algorithms for machine learning, from both a theoretical and a methodological point of view. Covered topics have included stochastic gradient algorithms, (accelerated) proximal algorithms, decomposition and coordinate descent algorithms, and parallel and distributed optimization. Theoretical and practical advances in these methods remain a topic of core interest to the workshop. Recent years have also seen interesting advances in non-convex optimization, such as a growing body of results on alternating minimization and tensor factorization.

We also do not wish to ignore settings that are not particularly large scale, where one does have time to wield substantial computational resources. In such settings, high-accuracy solutions and a deep understanding of the lessons contained in the data are needed. Examples valuable to ML researchers include the exploration of genetic and environmental data to identify risk factors for disease, or problems where the amount of observed data is not huge but the mathematical model is complex. Consequently, we encourage work on optimization methods on manifolds, ML problems with differential geometric antecedents, and problems using advanced algebraic techniques or computational topology, for instance.

At this point, we would like to emphasize again that OPT 2016 is one of the few optimization+ML workshops that lie at the intersection of theory and practice: the actual efficiency of algorithms in practice and their theoretical analysis are given equal value.