Three Operator Splitting with Subgradients, Stochastic Gradients, and Adaptive Learning Rates
Alp Yurtsever · Alex Gu · Suvrit Sra

Wed Dec 08 04:30 PM -- 06:00 PM (PST) @ Virtual
Three Operator Splitting (TOS) (Davis & Yin, 2017) can minimize the sum of multiple convex functions effectively when an efficient gradient oracle or proximal operator is available for each term. This requirement often fails in machine learning applications: (i) instead of full gradients, only stochastic gradients may be available; and (ii) instead of proximal operators, using subgradients to handle complex penalty functions may be more efficient and realistic. Motivated by these concerns, we analyze three potentially valuable extensions of TOS. The first two permit using subgradients and stochastic gradients, and are shown to ensure an $\mathcal{O}(1/\sqrt{t})$ convergence rate. The third extension, AdapTOS, endows TOS with adaptive step sizes. For the important setting of optimizing a convex loss over the intersection of convex sets, AdapTOS attains universal convergence rates, i.e., the rate adapts to the unknown smoothness degree of the objective. We compare our proposed methods with competing methods on various applications.
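
To make the baseline concrete, the following is a minimal sketch of the standard TOS iteration of Davis & Yin, not of the subgradient, stochastic, or AdapTOS extensions analyzed in the paper. The problem instance is an assumption chosen for illustration: a least-squares loss $\frac{1}{2}\|x - b\|^2$ minimized over the intersection of the box $[0,1]^n$ and the hyperplane $\{x : \sum_i x_i = 1\}$, where each proximal operator reduces to a projection. The function names, the fixed step size, and the iteration count are all hypothetical choices.

```python
import numpy as np

def project_box(v):
    # Projection onto the box [0, 1]^n (prox of its indicator function).
    return np.clip(v, 0.0, 1.0)

def project_hyperplane(v):
    # Projection onto {x : sum(x) = 1} (prox of its indicator function).
    return v - (v.sum() - 1.0) / v.size

def three_operator_splitting(b, step=1.0, iters=200):
    # Minimizes 0.5 * ||x - b||^2 over the box intersected with the hyperplane.
    # The smooth term has a 1-Lipschitz gradient, so step must lie in (0, 2).
    z = np.zeros_like(b, dtype=float)
    for _ in range(iters):
        x_g = project_hyperplane(z)                      # prox step on the first set
        grad = x_g - b                                   # gradient of the smooth loss
        x_f = project_box(2.0 * x_g - z - step * grad)   # prox step on the second set
        z = z + (x_f - x_g)                              # update with relaxation 1
    return x_f

b = np.array([0.5, 0.2, -0.4])
x = three_operator_splitting(b)
print(x)
```

For this choice of `b`, the two sets intersect in the probability simplex, so the iterate should approach the simplex projection of `b`, which is (0.65, 0.35, 0).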

Author Information

Alp Yurtsever (Umeå University)
Alex Gu (MIT)
Suvrit Sra (MIT)

Suvrit Sra is a research faculty member at the Laboratory for Information and Decision Systems (LIDS) at the Massachusetts Institute of Technology (MIT). He obtained his PhD in Computer Science from the University of Texas at Austin in 2007. Before moving to MIT, he was a Senior Research Scientist at the Max Planck Institute for Intelligent Systems in Tübingen, Germany. He also held visiting faculty positions at UC Berkeley (EECS) and Carnegie Mellon University (Machine Learning Department) during 2013-2014. His research is dedicated to bridging a number of mathematical areas, such as metric geometry, matrix analysis, convex analysis, probability theory, and optimization, with machine learning; more broadly, his work involves algorithmically grounded topics within engineering and science. He has been a co-chair of the OPT2008-2015 NIPS workshops on "Optimization for Machine Learning" and has also edited a volume of the same name (MIT Press, 2011).

More from the Same Authors