As the ninth in its series, OPT 2016 builds on the remarkable precedent established by the highly successful series of workshops OPT 2008-OPT 2015, which have been instrumental in bringing the OPT and ML communities closer together.
The previous OPT workshops enjoyed packed to overpacked attendance. This huge interest is no surprise: optimization is the 2nd largest topic at NIPS and is indeed foundational for the wider ML community.
Looking back over the past decade, a strong trend is apparent: the intersection of OPT and ML has grown monotonically, to the point that several cutting-edge advances in optimization now arise from the ML community. The distinctive feature of optimization within ML is its departure from textbook approaches, in particular, by having a different set of goals driven by “big-data,” where both models and practical implementation are crucial.
This intimate relation between OPT and ML is the core theme of our workshop. We wish to use OPT2016 as a platform to foster discussion, discovery, and dissemination of the state-of-the-art in optimization as relevant to machine learning and, beyond that, as a platform to identify new directions and challenges that will drive future research.
How OPT differs from other related workshops:
Compared to the other optimization-focused workshops that we are aware of, the distinguishing features of OPT are: (a) it provides a unique bridge between the ML community and the wider optimization community; (b) it encourages theoretical work on an equal footing with practical efficiency; and (c) it caters to a wide body of NIPS attendees, experts and beginners alike (some OPT talks are always of a more “tutorial” nature).
Extended abstract
The OPT workshops have previously covered a variety of topics, such as frameworks for convex programs (D. Bertsekas), the intersection of ML and optimization, classification (S. Wright), stochastic gradient and its tradeoffs (L. Bottou, N. Srebro), structured sparsity (Vandenberghe), randomized methods for convex optimization (A. Nemirovski), complexity theory of convex optimization (Y. Nesterov), distributed optimization (S. Boyd), asynchronous stochastic gradient (B. Recht), algebraic techniques (P. Parrilo), nonconvex optimization (A. Lewis), sums-of-squares techniques (J. Lasserre), deep learning tricks (Y. Bengio), stochastic convex optimization (G. Lan), new views on interior point (E. Hazan), among others.
Several ideas propounded in OPT have by now become important research topics in ML and optimization, especially in the field of randomized algorithms, stochastic gradient and variance-reduced stochastic gradient methods. An edited book "Optimization for Machine Learning" (S. Sra, S. Nowozin, and S. Wright; MIT Press, 2011) grew out of the first three OPT workshops and contains high-quality contributions from many of the speakers and attendees; there have been sustained requests for the next edition of such a volume.
Much of the recent focus has been on large-scale first-order convex optimization algorithms for machine learning, both from a theoretical and methodological point of view. Covered topics included stochastic gradient algorithms, (accelerated) proximal algorithms, decomposition and coordinate descent algorithms, and parallel and distributed optimization. Theoretical and practical advances in these methods remain a topic of core interest to the workshop. Recent years have also seen interesting advances in nonconvex optimization, such as a growing body of results on alternating minimization, tensor factorization, etc.
We also do not wish to ignore the not-particularly-large-scale setting, where one does have time to wield substantial computational resources. In this setting, high-accuracy solutions and deep understanding of the lessons contained in the data are needed. Examples valuable to MLers may be exploration of genetic and environmental data to identify risk factors for disease; or problems dealing with setups where the amount of observed data is not huge, but the mathematical model is complex. Consequently, we encourage optimization methods on manifolds, ML problems with differential geometric antecedents, those using advanced algebraic techniques, and computational topology, for instance.
At this point, we would like to emphasize again that OPT2016 is one of the few optimization+ML workshops that lie at the intersection of theory and practice: the actual efficiency of algorithms in practice and their theoretical analysis are given equal value.
Fri 11:15 p.m. - 11:30 p.m.

Opening Remarks

Fri 11:30 p.m. - 12:15 a.m.

Invited Talk: Online Optimization, Smoothing, and Worst-case Competitive Ratio (Maryam Fazel, University of Washington)
(Talk)
In Online Optimization, the data in an optimization problem is revealed over time and at each step a decision variable needs to be set without knowing the future data. This setup covers online resource allocation, from classical inventory problems to the Adwords problem popular in online advertising. In this talk, we prove bounds on the competitive ratio of two primal-dual algorithms for a broad class of online convex optimization problems. We give a sufficient condition on the objective function that guarantees a constant worst-case competitive ratio for monotone functions. We show how smoothing the objective can improve the competitive ratio of these algorithms, and for separable functions, we show that the optimal smoothing can be derived by solving a convex optimization problem. This result allows us to directly optimize the competitive ratio bound over a class of smoothing functions, and hence design effective smoothing customized for a given cost function.
Sat 12:15 a.m. - 12:30 a.m.

Spotlight: Markov Chain Lifting and Distributed ADMM
(Spotlight)
The time to converge to the steady state of a finite Markov chain can be greatly reduced by a lifting operation, which creates a new Markov chain on an expanded state space. For a class of quadratic objectives, we show an analogous behavior whereby a distributed ADMM algorithm can be seen as a lifting of Gradient Descent. This provides a deep insight into its faster convergence rate under optimal parameter tuning. We conjecture that this gain is always present, in contrast to lifting a Markov chain, where sometimes only a marginal speedup is obtained.
Sat 12:30 a.m. - 1:30 a.m.

Poster Session 1
(Break)

Sat 1:30 a.m. - 2:00 a.m.

Coffee Break 1
(Poster Session)

Sat 2:00 a.m. - 2:45 a.m.

Invited Talk: Kernel-based Methods for Bandit Convex Optimization (Sébastien Bubeck, Microsoft Research)
(Talk)
A lot of progress has been made in recent years on extending classical multi-armed bandit strategies to very large sets of actions. A particularly challenging setting is the one where the action set is continuous and the underlying cost function is convex; this is the so-called bandit convex optimization (BCO) problem. I will tell the story of BCO and explain some of the new ideas that we recently developed to solve it. I will focus on three new ideas from our recent work http://arxiv.org/abs/1607.03084 with Yin Tat Lee and Ronen Eldan: (i) a new connection between kernel methods and the popular multiplicative weights strategy; (ii) a new connection between kernel methods and one of Erdos’ favorite mathematical objects, the Bernoulli convolution; and (iii) a new adaptive (and increasing!) learning rate for multiplicative weights. These ideas could be of broader interest in learning and algorithm design.
Sat 2:45 a.m. - 3:00 a.m.

Spotlight: Frank-Wolfe Algorithms for Saddle Point Problems
(Spotlight)
We extend the Frank-Wolfe (FW) optimization algorithm to solve constrained smooth convex-concave saddle point (SP) problems. Remarkably, the method only requires access to linear minimization oracles. Leveraging recent advances in FW optimization, we provide the first proof of convergence of a FW-type saddle point solver over polytopes, thereby partially answering a 30-year-old conjecture. We verify our convergence rates empirically and observe that by using a heuristic step-size, we can get empirical convergence under more general conditions, paving the way for future theoretical work.
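To make the role of a linear minimization oracle concrete, here is a minimal sketch of the classical Frank-Wolfe method for plain minimization over the probability simplex. It is not the saddle-point variant described in the abstract; the objective, the target vector b, the step size 2/(k+2), and all function names are assumptions chosen for illustration.

```python
def lmo_simplex(grad):
    """Linear minimization oracle over the probability simplex.

    Returns the vertex e_i that minimizes <grad, s> over the simplex.
    """
    i = min(range(len(grad)), key=lambda j: grad[j])
    s = [0.0] * len(grad)
    s[i] = 1.0
    return s


def frank_wolfe(grad_f, x0, num_iters=2000):
    """Classical Frank-Wolfe with the standard step size 2 / (k + 2)."""
    x = list(x0)
    for k in range(num_iters):
        s = lmo_simplex(grad_f(x))   # only an LMO call, no projection
        gamma = 2.0 / (k + 2.0)
        # Convex combination keeps the iterate inside the simplex.
        x = [(1.0 - gamma) * xi + gamma * si for xi, si in zip(x, s)]
    return x


# Illustrative objective (an assumption): f(x) = ||x - b||^2 with b in the simplex.
b = [0.2, 0.5, 0.3]


def grad_f(x):
    return [2.0 * (xi - bi) for xi, bi in zip(x, b)]


x_fw = frank_wolfe(grad_f, [1.0, 0.0, 0.0])
gap = sum((xi - bi) ** 2 for xi, bi in zip(x_fw, b))
```

Every iterate is a convex combination of simplex vertices, so feasibility is maintained without ever computing a projection; this is the property that makes oracle-only methods attractive over polytopes.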
Sat 3:00 a.m. - 5:00 a.m.

Lunch Break
(Break)

Sat 5:00 a.m. - 5:45 a.m.

Invited Talk: Semidefinite Programs with a Dash of Smoothness: Why and When the Low-Rank Approach Works (Nicolas Boumal, Princeton University)
(Talk)
Semidefinite programs (SDPs) can be solved in polynomial time by interior point methods, but scalability can be an issue. To address this shortcoming, over a decade ago, Burer and Monteiro proposed to solve SDPs with few equality constraints via low-rank, nonconvex surrogates. Remarkably, for some applications, local optimization methods seem to converge to global optima of these nonconvex surrogates reliably. In this presentation, we show that the Burer-Monteiro formulation of SDPs in a certain class almost never has any spurious local optima, that is: the nonconvexity of the low-rank formulation is benign (even saddles are strict). This class of SDPs covers applications such as max-cut, community detection in the stochastic block model, robust PCA, phase retrieval and synchronization of rotations. The crucial assumption we make is that the low-rank problem lives on a manifold. Then, theory and algorithms from optimization on manifolds can be used. Optimization on manifolds is about minimizing a cost function over a smooth manifold, such as spheres, low-rank matrices, orthonormal frames, rotations, etc. We will present the basic framework as well as parts of the more general convergence theory, including recent complexity results. (Toolbox: http://www.manopt.org) Select parts are joint work with P.-A. Absil, A. Bandeira, C. Cartis and V. Voroninski.
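As a rough illustration of the low-rank idea for max-cut, the sketch below replaces the SDP variable X (positive semidefinite with unit diagonal) by X = Y Yᵀ, where each row of Y is a unit vector in R^p, and runs gradient steps followed by renormalization of each row. This is a toy under stated assumptions, not the speaker's method or the Manopt toolbox: the 4-cycle graph, rank p = 3, step size, iteration count, and all names are hypothetical choices.

```python
import math
import random


def normalize(v):
    """Map a nonzero vector back onto the unit sphere."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def burer_monteiro_maxcut(edges, n, p=3, step=0.1, num_iters=500, seed=0):
    """Low-rank max-cut relaxation: rows of Y are unit vectors in R^p.

    Minimizes sum over edges of <y_i, y_j> by a gradient step on each row,
    followed by renormalization so that diag(Y Y^T) = 1 keeps holding.
    """
    rng = random.Random(seed)
    Y = [normalize([rng.gauss(0.0, 1.0) for _ in range(p)]) for _ in range(n)]
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    for _ in range(num_iters):
        new_Y = []
        for i in range(n):
            # Euclidean gradient of the objective with respect to row i.
            grad = [sum(Y[j][c] for j in neighbors[i]) for c in range(p)]
            new_Y.append(normalize([Y[i][c] - step * grad[c] for c in range(p)]))
        Y = new_Y
    # Relaxed cut value: each edge contributes (1 - <y_i, y_j>) / 2.
    value = sum(0.5 * (1.0 - sum(a * b for a, b in zip(Y[i], Y[j])))
                for i, j in edges)
    return Y, value


# Toy instance (an assumption): a 4-cycle, whose maximum cut contains all 4 edges.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
Y_lowrank, relaxed_cut = burer_monteiro_maxcut(cycle_edges, n=4)
```

The unit-norm constraint on each row is what places the problem on a smooth manifold (a product of spheres); the renormalization step plays the role of a retraction in this toy version.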
Sat 5:45 a.m. - 6:00 a.m.

Spotlight: QuickeNing: A Generic Quasi-Newton Algorithm for Faster Gradient-Based Optimization
(Spotlight)
We propose a technique to accelerate gradient-based optimization algorithms by giving them the ability to exploit L-BFGS heuristics. Our scheme is (i) generic and can be applied to a large class of first-order algorithms; (ii) it is compatible with composite objectives, meaning that it may provide exactly sparse solutions when a sparsity-inducing regularization is involved; (iii) it admits a linear convergence rate for strongly convex problems; (iv) it is easy to use and it does not require any line search. Our work is inspired in part by the Catalyst meta-algorithm, which accelerates gradient-based techniques in the sense of Nesterov; here, we adopt a different strategy based on L-BFGS rules to learn and exploit the local curvature. In most practical cases, we observe significant improvements over Catalyst for solving large-scale high-dimensional machine learning problems.
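For readers unfamiliar with the L-BFGS rules mentioned above, the following is a minimal sketch of the standard two-loop recursion, which applies an implicit inverse-Hessian estimate built from recent displacement and gradient-difference pairs. It illustrates generic L-BFGS only and is not the QuickeNing scheme; the function names and the small quadratic example are assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def lbfgs_apply(grad, s_list, y_list):
    """Apply the implicit L-BFGS inverse-Hessian estimate to grad.

    s_list[k] = x_{k+1} - x_k and y_list[k] = grad_{k+1} - grad_k, oldest
    first, with y.s > 0 for every pair. The quasi-Newton search direction
    is the negative of the returned vector.
    """
    q = list(grad)
    rhos = [1.0 / dot(y, s) for s, y in zip(s_list, y_list)]
    alphas = []
    # First loop: newest pair to oldest pair.
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        alpha = rho * dot(s, q)
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
        alphas.append(alpha)
    # Initial scaling taken from the most recent pair.
    gamma = dot(s_list[-1], y_list[-1]) / dot(y_list[-1], y_list[-1])
    r = [gamma * qi for qi in q]
    # Second loop: oldest pair to newest pair.
    for s, y, rho, alpha in zip(s_list, y_list, rhos, reversed(alphas)):
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return r


# Illustrative pairs (an assumption) from f(x) = 0.5 * (x1^2 + 10 * x2^2),
# whose gradient differences are y = (s1, 10 * s2).
s_pairs = [[1.0, 0.5], [-0.5, 0.2]]
y_pairs = [[1.0, 5.0], [-0.5, 2.0]]
recovered = lbfgs_apply(y_pairs[-1], s_pairs, y_pairs)
```

Applying the estimate to the most recent gradient difference returns the most recent displacement, which is the secant condition that lets the method learn local curvature from first-order information alone.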
Sat 6:00 a.m. - 6:30 a.m.

Coffee Break 2
(Poster Session)

Sat 6:30 a.m. - 7:30 a.m.

Poster Session 2
(Poster Session)

Sat 7:30 a.m. - 8:15 a.m.

Invited Talk: Oracle Complexity of Second-Order Methods for Finite-Sum Problems (Ohad Shamir, Weizmann Institute of Science)
(Talk)
Finite-sum optimization problems are ubiquitous in machine learning, and are commonly solved using first-order methods which rely on gradient computations. Recently, there has been growing interest in second-order methods, which rely on both gradients and Hessians. In principle, second-order methods can require far fewer iterations than first-order methods, and hold the promise of more efficient algorithms. Although computing and manipulating Hessians is prohibitive for high-dimensional problems in general, the Hessians of individual functions in finite-sum problems can often be efficiently computed, e.g. because they possess a low-rank structure. Can second-order information indeed be used to solve such problems more efficiently? In this talk, I'll provide evidence that the answer, perhaps surprisingly, is negative, at least in terms of worst-case guarantees. However, I'll also discuss what additional assumptions and algorithmic approaches might potentially circumvent this negative result. Joint work with Yossi Arjevani.
Sat 8:15 a.m. - 8:30 a.m.

Spotlight: Reliably Learning the ReLU in Polynomial Time
(Spotlight)
We give the first dimension-efficient algorithms for learning Rectified Linear Units (ReLUs), which are functions of the form max(0, w·x) with w a unit vector (2-norm equal to 1). Our algorithm works in the challenging Reliable Agnostic learning model of Kalai, Kanade, and Mansour, where the learner is given access to a distribution D on labeled examples but the labeling may be arbitrary. We construct a hypothesis that simultaneously minimizes the false-positive rate and the l_p loss (for p=1 or p>=2) of inputs given positive labels by D. It runs in polynomial time (in n) with respect to any distribution on S^{n-1} (the unit sphere in n dimensions) and for any error parameter \epsilon = \Omega(1/\log n). These results are in contrast to known efficient algorithms for reliably learning linear threshold functions, where \epsilon must be \Omega(1) and strong assumptions are required on the marginal distribution. We can compose our results to obtain the first set of efficient algorithms for learning constant-depth networks of ReLUs. Our techniques combine kernel methods and polynomial approximations with a
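The function class in this abstract can be written down in a few lines. The sketch below only evaluates a ReLU with a unit-norm weight vector and the l_p loss on a single example; it does not implement the learning algorithm, and the function names and sample vectors are assumptions.

```python
import math


def relu_unit(w, x):
    """Evaluate max(0, w.x) after rescaling w to unit 2-norm."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    w_unit = [wi / norm for wi in w]
    return max(0.0, sum(wi * xi for wi, xi in zip(w_unit, x)))


def lp_loss(prediction, label, p):
    """l_p loss |prediction - label|^p on one labeled example."""
    return abs(prediction - label) ** p


# Inputs on the unit sphere S^1 (illustrative values).
active = relu_unit([3.0, 4.0], [1.0, 0.0])     # w/||w|| = (0.6, 0.8), so w.x = 0.6
inactive = relu_unit([3.0, 4.0], [-1.0, 0.0])  # w.x = -0.6 is clipped to 0
squared_error = lp_loss(active, 1.0, 2)        # (0.6 - 1.0)^2
```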
Author Information
Suvrit Sra (MIT)
Suvrit Sra is a faculty member within the EECS department at MIT, where he is also a core faculty member of IDSS, LIDS, the MIT-ML Group, and the statistics and data science center. His research spans topics in optimization, matrix theory, differential geometry, and probability theory, which he connects with machine learning; a key focus of his research is the theme "Optimization for Machine Learning" (http://optml.org).
Francis Bach (INRIA - Ecole Normale Superieure)
Sashank J. Reddi (Carnegie Mellon University)
Niao He (UIUC)
More from the Same Authors

2021 Spotlight: Batch Normalization Orthogonalizes Representations in Deep Random Networks »
Hadi Daneshmand · Amir Joudaki · Francis Bach 
2022 Poster: A Nonasymptotic Analysis of Nonparametric TemporalDifference Learning »
Eloïse Berthier · Ziad Kobeissi · Francis Bach 
2022 Spotlight: Lightning Talks 1A4 »
Siwei Wang · Jing Liu · Nianqiao Ju · Shiqian Li · Eloïse Berthier · Muhammad Faaiz Taufiq · Arsene Fansi Tchango · Chen Liang · Chulin Xie · Jordan Awan · JeanFrancois Ton · Ziad Kobeissi · Wenguan Wang · Xinwang Liu · Kewen Wu · Rishab Goel · Jiaxu Miao · Suyuan Liu · Julien Martel · Ruobin Gong · Francis Bach · Chi Zhang · Rob Cornish · Sanmi Koyejo · Zhi Wen · Yee Whye Teh · Yi Yang · Jiaqi Jin · Bo Li · Yixin Zhu · Vinayak Rao · Wenxuan Tu · Gaetan Marceau Caron · Arnaud Doucet · Xinzhong Zhu · Joumana Ghosn · En Zhu 
2022 Spotlight: A Nonasymptotic Analysis of Nonparametric TemporalDifference Learning »
Eloïse Berthier · Ziad Kobeissi · Francis Bach 
2022 Poster: Variational inference via Wasserstein gradient flows »
Marc Lambert · Sinho Chewi · Francis Bach · Silvère Bonnabel · Philippe Rigollet 
2022 Poster: CCCP is FrankWolfe in disguise »
Alp Yurtsever · Suvrit Sra 
2022 Poster: Efficient Sampling on Riemannian Manifolds via Langevin MCMC »
Xiang Cheng · Jingzhao Zhang · Suvrit Sra 
2022 Poster: Asynchronous SGD Beats Minibatch SGD Under Arbitrary Delays »
Konstantin Mishchenko · Francis Bach · Mathieu Even · Blake Woodworth 
2022 Poster: On the Theoretical Properties of Noise Correlation in Stochastic Optimization »
Aurelien Lucchi · Frank Proske · Antonio Orvieto · Francis Bach · Hans Kersting 
2022 Poster: Fast Stochastic Composite Minimization and an Accelerated FrankWolfe Algorithm under Parallelization »
Benjamin DuboisTaine · Francis Bach · Quentin Berthet · Adrien Taylor 
2022 Poster: Active Labeling: Streaming Stochastic Gradients »
Vivien Cabannes · Francis Bach · Vianney Perchet · Alessandro Rudi 
2021 Test Of Time: Online Learning for Latent Dirichlet Allocation »
Matthew Hoffman · Francis Bach · David Blei 
2021 Poster: Can contrastive learning avoid shortcut solutions? »
Joshua Robinson · Li Sun · Ke Yu · Kayhan Batmanghelich · Stefanie Jegelka · Suvrit Sra 
2021 Poster: Overcoming the curse of dimensionality with Laplacian regularization in semisupervised learning »
Vivien Cabannes · Loucas PillaudVivien · Francis Bach · Alessandro Rudi 
2021 Poster: Three Operator Splitting with Subgradients, Stochastic Gradients, and Adaptive Learning Rates »
Alp Yurtsever · Alex Gu · Suvrit Sra 
2021 Oral: Continuized Accelerations of Deterministic and Stochastic Gradient Descents, and of Gossip Algorithms »
Mathieu Even · Raphaël Berthier · Francis Bach · Nicolas Flammarion · Hadrien Hendrikx · Pierre Gaillard · Laurent Massoulié · Adrien Taylor 
2021 Poster: Batch Normalization Orthogonalizes Representations in Deep Random Networks »
Hadi Daneshmand · Amir Joudaki · Francis Bach 
2021 Poster: Continuized Accelerations of Deterministic and Stochastic Gradient Descents, and of Gossip Algorithms »
Mathieu Even · Raphaël Berthier · Francis Bach · Nicolas Flammarion · Hadrien Hendrikx · Pierre Gaillard · Laurent Massoulié · Adrien Taylor 
2020 : Invited speaker: SGD without replacement: optimal rate analysis and more, Suvrit Sra »
Suvrit Sra 
2020 : Francis Bach  Where is Machine Learning Going? »
Francis Bach 
2020 Poster: SGD with shuffling: optimal rates without component convexity and large epoch requirements »
Kwangjun Ahn · Chulhee Yun · Suvrit Sra 
2020 Spotlight: SGD with shuffling: optimal rates without component convexity and large epoch requirements »
Kwangjun Ahn · Chulhee Yun · Suvrit Sra 
2020 Poster: Tight Nonparametric Convergence Rates for Stochastic Gradient Descent under the Noiseless Linear Model »
Raphaël Berthier · Francis Bach · Pierre Gaillard 
2020 Poster: Learning with Differentiable Pertubed Optimizers »
Quentin Berthet · Mathieu Blondel · Olivier Teboul · Marco Cuturi · JeanPhilippe Vert · Francis Bach 
2020 Poster: Batch normalization provably avoids ranks collapse for randomly initialised deep networks »
Hadi Daneshmand · Jonas Kohler · Francis Bach · Thomas Hofmann · Aurelien Lucchi 
2020 Poster: Nonparametric Models for Nonnegative Functions »
Ulysse MarteauFerey · Francis Bach · Alessandro Rudi 
2020 Spotlight: Nonparametric Models for Nonnegative Functions »
Ulysse MarteauFerey · Francis Bach · Alessandro Rudi 
2020 Session: Orals & Spotlights Track 30: Optimization/Theory »
Yuxin Chen · Francis Bach 
2020 Poster: DualFree Stochastic Decentralized Optimization with Variance Reduction »
Hadrien Hendrikx · Francis Bach · Laurent Massoulié 
2020 Poster: Why are Adaptive Methods Good for Attention Models? »
Jingzhao Zhang · Sai Praneeth Karimireddy · Andreas Veit · Seungyeon Kim · Sashank Reddi · Sanjiv Kumar · Suvrit Sra 
2020 Poster: Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes »
Yi Tian · Jian Qian · Suvrit Sra 
2020 Spotlight: Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes »
Yi Tian · Jian Qian · Suvrit Sra 
2019 : Closing Remarks »
Bo Dai · Niao He · Nicolas Le Roux · Lihong Li · Dale Schuurmans · Martha White 
2019 : Poster and Coffee Break 2 »
Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren AmdahlCulleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · MariusConstantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · PierreLuc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · XiaoWen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · ChenYu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall 
2019 : Poster Spotlight 1 »
David Brandfonbrener · Joan Bruna · Tom Zahavy · Haim Kaplan · Yishay Mansour · Nikos Karampatziakis · John Langford · Paul Mineiro · Donghwan Lee · Niao He 
2019 Workshop: Bridging Game Theory and Deep Learning »
Ioannis Mitliagkas · Gauthier Gidel · Niao He · Reyhane Askari Hemmat · N H · Nika Haghtalab · Simon LacosteJulien 
2019 Workshop: The Optimization Foundations of Reinforcement Learning »
Bo Dai · Niao He · Nicolas Le Roux · Lihong Li · Dale Schuurmans · Martha White 
2019 : Opening Remarks »
Bo Dai · Niao He · Nicolas Le Roux · Lihong Li · Dale Schuurmans · Martha White 
2019 Poster: Flexible Modeling of Diversity with Strongly LogConcave Distributions »
Joshua Robinson · Suvrit Sra · Stefanie Jegelka 
2019 Poster: Are deep ResNets provably better than linear predictors? »
Chulhee Yun · Suvrit Sra · Ali Jadbabaie 
2019 Poster: Exponential Family Estimation via Adversarial Dynamics Embedding »
Bo Dai · Zhen Liu · Hanjun Dai · Niao He · Arthur Gretton · Le Song · Dale Schuurmans 
2019 Poster: Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity »
Chulhee Yun · Suvrit Sra · Ali Jadbabaie 
2019 Spotlight: Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity »
Chulhee Yun · Suvrit Sra · Ali Jadbabaie 
2019 Poster: Fast Decomposable Submodular Function Minimization using Constrained Total Variation »
Senanayak Sesh Kumar Karri · Francis Bach · Thomas Pock 
2019 Poster: Towards closing the gap between the theory and practice of SVRG »
Othmane Sebbouh · Nidham Gazagnadou · Samy Jelassi · Francis Bach · Robert Gower 
2019 Poster: An Accelerated Decentralized Stochastic Proximal Algorithm for Finite Sums »
Hadrien Hendrikx · Francis Bach · Laurent Massoulié 
2019 Poster: On Lazy Training in Differentiable Programming »
Lénaïc Chizat · Edouard Oyallon · Francis Bach 
2019 Poster: Implicit Regularization of Discrete Gradient Dynamics in Linear Neural Networks »
Gauthier Gidel · Francis Bach · Simon LacosteJulien 
2019 Poster: Massively scalable Sinkhorn distances via the Nyström method »
Jason Altschuler · Francis Bach · Alessandro Rudi · Jonathan NilesWeed 
2019 Poster: Localized Structured Prediction »
Carlo Ciliberto · Francis Bach · Alessandro Rudi 
2019 Poster: UniXGrad: A Universal, Adaptive Algorithm with Optimal Guarantees for Constrained Optimization »
Ali Kavis · Kfir Y. Levy · Francis Bach · Volkan Cevher 
2019 Poster: Learning Positive Functions with Pseudo Mirror Descent »
Yingxiang Yang · Haoxiang Wang · Negar Kiyavash · Niao He 
2019 Spotlight: UniXGrad: A Universal, Adaptive Algorithm with Optimal Guarantees for Constrained Optimization »
Ali Kavis · Kfir Y. Levy · Francis Bach · Volkan Cevher 
2019 Spotlight: Learning Positive Functions with Pseudo Mirror Descent »
Yingxiang Yang · Haoxiang Wang · Negar Kiyavash · Niao He 
2019 Poster: Partially Encrypted Deep Learning using Functional Encryption »
Théo Ryffel · David Pointcheval · Francis Bach · Edouard DufourSans · Romain Gay 
2019 Poster: Globally Convergent Newton Methods for Illconditioned Generalized Selfconcordant Losses »
Ulysse MarteauFerey · Francis Bach · Alessandro Rudi 
2018 : Smooth Games in Machine Learning Beyond GANs »
Niao He 
2018 Poster: Optimal Algorithms for NonSmooth Distributed Optimization in Networks »
Kevin Scaman · Francis Bach · Sebastien Bubeck · Laurent Massoulié · Yin Tat Lee 
2018 Poster: Direct RungeKutta Discretization Achieves Acceleration »
Jingzhao Zhang · Aryan Mokhtari · Suvrit Sra · Ali Jadbabaie 
2018 Poster: Statistical Optimality of Stochastic Gradient Descent on Hard Learning Problems through Multiple Passes »
Loucas PillaudVivien · Alessandro Rudi · Francis Bach 
2018 Spotlight: Direct RungeKutta Discretization Achieves Acceleration »
Jingzhao Zhang · Aryan Mokhtari · Suvrit Sra · Ali Jadbabaie 
2018 Oral: Optimal Algorithms for NonSmooth Distributed Optimization in Networks »
Kevin Scaman · Francis Bach · Sebastien Bubeck · Laurent Massoulié · Yin Tat Lee 
2018 Poster: Relating Leverage Scores and Density using Regularized Christoffel Functions »
Edouard Pauwels · Francis Bach · JeanPhilippe Vert 
2018 Poster: Efficient Algorithms for Nonconvex Isotonic Regression through Submodular Optimization »
Francis Bach 
2018 Poster: Coupled Variational Bayes via Optimization Embedding »
Bo Dai · Hanjun Dai · Niao He · Weiyang Liu · Zhen Liu · Jianshu Chen · Lin Xiao · Le Song 
2018 Poster: RestKatyusha: Exploiting the Solution's Structure via Scheduled Restart Schemes »
Junqi Tang · Mohammad Golbabaee · Francis Bach · Mike Davies 
2018 Poster: Exponentiated Strongly Rayleigh Distributions »
Zelda Mariet · Suvrit Sra · Stefanie Jegelka 
2018 Poster: Predictive Approximate Bayesian Computation via Saddle Points »
Yingxiang Yang · Bo Dai · Negar Kiyavash · Niao He 
2018 Poster: SING: SymboltoInstrument Neural Generator »
Alexandre Defossez · Neil Zeghidour · Nicolas Usunier · Leon Bottou · Francis Bach 
2018 Poster: Quadratic Decomposable Submodular Function Minimization »
Pan Li · Niao He · Olgica Milenkovic 
2018 Poster: On the Global Convergence of Gradient Descent for Overparameterized Models using Optimal Transport »
Lénaïc Chizat · Francis Bach 
2018 Tutorial: Negative Dependence, Stable Polynomials, and All That »
Suvrit Sra · Stefanie Jegelka 
2017 : Concluding remarks »
Francis Bach · Benjamin Guedj · Pascal Germain 
2017 : Neil Lawrence, Francis Bach and François Laviolette »
Neil Lawrence · Francis Bach · Francois Laviolette 
2017 : Sharp asymptotic and finitesample rates of convergence of empirical measures in Wasserstein distance »
Francis Bach 
2017 : Overture »
Benjamin Guedj · Francis Bach · Pascal Germain 
2017 Workshop: (Almost) 50 shades of Bayesian Learning: PACBayesian trends and insights »
Benjamin Guedj · Pascal Germain · Francis Bach 
2017 Workshop: OPT 2017: Optimization for Machine Learning »
Suvrit Sra · Sashank J. Reddi · Alekh Agarwal · Benjamin Recht 
2017 Poster: On Structured Prediction Theory with Calibrated Convex Surrogate Losses »
Anton Osokin · Francis Bach · Simon LacosteJulien 
2017 Poster: Online Learning for Multivariate Hawkes Processes »
Yingxiang Yang · Jalal Etesami · Niao He · Negar Kiyavash 
2017 Oral: On Structured Prediction Theory with Calibrated Convex Surrogate Losses »
Anton Osokin · Francis Bach · Simon LacosteJulien 
2017 Poster: Elementary Symmetric Polynomials for Optimal Experimental Design »
Zelda Mariet · Suvrit Sra 
2017 Poster: Nonlinear Acceleration of Stochastic Algorithms »
Damien Scieur · Francis Bach · Alexandre d'Aspremont 
2017 Poster: Integration Methods and Optimization Algorithms »
Damien Scieur · Vincent Roulet · Francis Bach · Alexandre d'Aspremont 
2017 Poster: Polynomial time algorithms for dual volume sampling »
Chengtao Li · Stefanie Jegelka · Suvrit Sra 
2016 : Francis Bach. Harder, Better, Faster, Stronger Convergence Rates for LeastSquares Regression. »
Francis Bach 
2016 : Taming nonconvexity via geometry »
Suvrit Sra 
2016 : Submodular Functions: from Discrete to Continuous Domains »
Francis Bach 
2016 Workshop: Learning in High Dimensions with Structure »
Nikhil Rao · Prateek Jain · HsiangFu Yu · Ming Yuan · Francis Bach 
2016 Poster: Fast Mixing Markov Chains for Strongly Rayleigh Measures, DPPs, and Constrained Sampling »
Chengtao Li · Suvrit Sra · Stefanie Jegelka 
2016 Poster: Kronecker Determinantal Point Processes »
Zelda Mariet · Suvrit Sra 
2016 Poster: Variance Reduction in Stochastic Gradient Langevin Dynamics »
Kumar Avinava Dubey · Sashank J. Reddi · Sinead Williamson · Barnabas Poczos · Alexander Smola · Eric Xing 
2016 Poster: Parameter Learning for Logsupermodular Distributions »
Tatiana Shpakova · Francis Bach 
2016 Poster: Regularized Nonlinear Acceleration »
Damien Scieur · Alexandre d'Aspremont · Francis Bach 
2016 Oral: Regularized Nonlinear Acceleration »
Damien Scieur · Alexandre d'Aspremont · Francis Bach 
2016 Poster: Stochastic Variance Reduction Methods for SaddlePoint Problems »
Balamurugan Palaniappan · Francis Bach 
2016 Poster: PACBayesian Theory Meets Bayesian Inference »
Pascal Germain · Francis Bach · Alexandre Lacoste · Simon LacosteJulien 
2016 Poster: Proximal Stochastic Methods for Nonsmooth Nonconvex FiniteSum Optimization »
Sashank J. Reddi · Suvrit Sra · Barnabas Poczos · Alexander Smola 
2016 Poster: Riemannian SVRG: Fast Stochastic Optimization on Riemannian Manifolds »
Hongyi Zhang · Sashank J. Reddi · Suvrit Sra 
2016 Poster: Stochastic Optimization for Largescale Optimal Transport »
Aude Genevay · Marco Cuturi · Gabriel Peyré · Francis Bach 
2016 Tutorial: LargeScale Optimization: Beyond Stochastic Gradient Descent and Convexity »
Suvrit Sra · Francis Bach 
2015 : Structured Sparsity and convex optimization »
Francis Bach 
2015 : Sharp Analysis of Random Feature Expansions »
Francis Bach 
2015 : Convergence Rates of Kernel Quadrature Rules »
Francis Bach 
2015 Workshop: Optimization for Machine Learning (OPT2015) »
Suvrit Sra · Alekh Agarwal · Leon Bottou · Sashank J. Reddi 
2015 Poster: Matrix Manifold Optimization for Gaussian Mixtures »
Reshad Hosseini · Suvrit Sra 
2015 Poster: On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants »
Sashank J. Reddi · Ahmed Hefny · Suvrit Sra · Barnabas Poczos · Alexander Smola 
2015 Poster: Rethinking LDA: Moment Matching for Discrete ICA »
Anastasia Podosinnikova · Francis Bach · Simon LacosteJulien 
2015 Poster: Spectral Norm Regularization of Orthonormal Representations for Graph Transduction »
Rakesh Shivanna · Bibaswan K Chatterjee · Raman Sankaran · Chiranjib Bhattacharyya · Francis Bach 
2014 Workshop: OPT2014: Optimization for Machine Learning »
Zaid Harchaoui · Suvrit Sra · Alekh Agarwal · Martin Jaggi · Miro Dudik · Aaditya Ramdas · Jean Lasserre · Yoshua Bengio · Amir Beck 
2014 Poster: Efficient Structured Matrix Rank Minimization »
Adams Wei Yu · Wanli Ma · Yaoliang Yu · Jaime Carbonell · Suvrit Sra 
2014 Poster: Metric Learning for Temporal Sequence Alignment »
Rémi Lajugie · Damien Garreau · Francis Bach · Sylvain Arlot 
2014 Poster: SAGA: A Fast Incremental Gradient Method With Support for NonStrongly Convex Composite Objectives »
Aaron Defazio · Francis Bach · Simon LacosteJulien 
2013 Workshop: OPT2013: Optimization for Machine Learning »
Suvrit Sra · Alekh Agarwal 
2013 Poster: Geometric optimisation on positive definite matrices for elliptically contoured distributions »
Suvrit Sra · Reshad Hosseini 
2013 Poster: Nonstronglyconvex smooth stochastic approximation with convergence rate O(1/n) »
Francis Bach · Eric Moulines 
2013 Spotlight: Nonstronglyconvex smooth stochastic approximation with convergence rate O(1/n) »
Francis Bach · Eric Moulines 
2013 Session: Oral Session 2 »
Francis Bach 
2013 Poster: Convex Relaxations for Permutation Problems »
Fajwel Fogel · Rodolphe Jenatton · Francis Bach · Alexandre d'Aspremont 
2013 Poster: Reflection methods for userfriendly submodular optimization »
Stefanie Jegelka · Francis Bach · Suvrit Sra 
2013 Session: Tutorial Session B »
Francis Bach 
2012 Workshop: Optimization for Machine Learning »
Suvrit Sra · Alekh Agarwal 
2012 Workshop: Analysis Operator Learning vs. Dictionary Learning: Fraternal Twins in Sparse Modeling »
Martin Kleinsteuber · Francis Bach · Remi Gribonval · John Wright · Simon Hawe 
2012 Poster: Multiple Operator-valued Kernel Learning »
Hachem Kadri · Alain Rakotomamonjy · Francis Bach · Philippe Preux
2012 Poster: A new metric on the manifold of kernel matrices with application to matrix geometric means »
Suvrit Sra 
2012 Poster: A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets »
Nicolas Le Roux · Mark Schmidt · Francis Bach 
2012 Oral: A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets »
Nicolas Le Roux · Mark Schmidt · Francis Bach 
2012 Poster: Scalable nonconvex inexact proximal splitting »
Suvrit Sra 
2011 Workshop: Optimization for Machine Learning »
Suvrit Sra · Stephen Wright · Sebastian Nowozin 
2011 Workshop: Sparse Representation and Low-rank Approximation »
Ameet S Talwalkar · Lester W Mackey · Mehryar Mohri · Michael W Mahoney · Francis Bach · Mike Davies · Remi Gribonval · Guillaume R Obozinski 
2011 Poster: Convergence Rates of Inexact Proximal-Gradient Methods for Convex Optimization »
Mark Schmidt · Nicolas Le Roux · Francis Bach 
2011 Oral: Convergence Rates of Inexact Proximal-Gradient Methods for Convex Optimization »
Mark Schmidt · Nicolas Le Roux · Francis Bach 
2011 Poster: Non-Asymptotic Analysis of Stochastic Approximation Algorithms for Machine Learning »
Francis Bach · Eric Moulines 
2011 Poster: Trace Lasso: a trace norm regularization for correlated designs »
Edouard Grave · Guillaume R Obozinski · Francis Bach 
2011 Spotlight: Non-Asymptotic Analysis of Stochastic Approximation Algorithms for Machine Learning »
Francis Bach · Eric Moulines 
2011 Poster: Shaping Level Sets with Submodular Functions »
Francis Bach 
2010 Workshop: New Directions in Multiple Kernel Learning »
Marius Kloft · Ulrich Rueckert · Cheng Soon Ong · Alain Rakotomamonjy · Soeren Sonnenburg · Francis Bach 
2010 Workshop: Numerical Mathematics Challenges in Machine Learning »
Matthias Seeger · Suvrit Sra 
2010 Workshop: Optimization for Machine Learning »
Suvrit Sra · Sebastian Nowozin · Stephen Wright 
2010 Spotlight: Online Learning for Latent Dirichlet Allocation »
Matthew D. Hoffman · David Blei · Francis Bach 
2010 Poster: Efficient Optimization for Discriminative Latent Class Models »
Armand Joulin · Francis Bach · Jean A Ponce 
2010 Poster: Online Learning for Latent Dirichlet Allocation »
Matthew D. Hoffman · David Blei · Francis Bach 
2010 Oral: Structured sparsity-inducing norms through submodular functions »
Francis Bach 
2010 Poster: Structured sparsity-inducing norms through submodular functions »
Francis Bach 
2010 Poster: Network Flow Algorithms for Structured Sparsity »
Julien Mairal · Rodolphe Jenatton · Guillaume R Obozinski · Francis Bach 
2009 Workshop: Optimization for Machine Learning »
Sebastian Nowozin · Suvrit Sra · S.V.N. Vishwanathan · Stephen Wright
2009 Workshop: Understanding Multiple Kernel Learning Methods »
Brian McFee · Gert Lanckriet · Francis Bach · Nati Srebro 
2009 Poster: Data-driven calibration of linear estimators with minimal penalties »
Sylvain Arlot · Francis Bach 
2009 Poster: Asymptotically Optimal Regularization in Smooth Parametric Models »
Percy Liang · Francis Bach · Guillaume Bouchard · Michael Jordan 
2009 Tutorial: Sparse Methods for Machine Learning: Theory and Algorithms »
Francis Bach 
2008 Workshop: Optimization for Machine Learning »
Suvrit Sra · Sebastian Nowozin · S.V.N. Vishwanathan
2008 Poster: Clustered Multi-Task Learning: A Convex Formulation »
Laurent Jacob · Francis Bach · Jean-Philippe Vert
2008 Poster: Sparse probabilistic projections »
Cedric Archambeau · Francis Bach 
2008 Spotlight: Sparse probabilistic projections »
Cedric Archambeau · Francis Bach 
2008 Spotlight: Clustered Multi-Task Learning: A Convex Formulation »
Laurent Jacob · Francis Bach · Jean-Philippe Vert
2008 Poster: Exploring Large Feature Spaces with Hierarchical Multiple Kernel Learning »
Francis Bach 
2008 Poster: Kernel Change-point Analysis »
Zaid Harchaoui · Francis Bach · Eric Moulines 
2008 Poster: SDL: Supervised Dictionary Learning »
Julien Mairal · Francis Bach · Jean A Ponce · Guillermo Sapiro · Andrew Zisserman 
2007 Poster: Testing for Homogeneity with Kernel Fisher Discriminant Analysis »
Zaid Harchaoui · Francis Bach · Eric Moulines
2007 Poster: DIFFRAC: a discriminative and flexible framework for clustering »
Francis Bach · Zaid Harchaoui 
2007 Session: Session 2: Probabilistic Optimization »
Francis Bach 
2006 Poster: Active learning for misspecified generalized linear models »
Francis Bach