Optimization lies at the heart of ML algorithms. Sometimes classical textbook algorithms suffice, but the majority of problems require tailored methods based on a deeper understanding of the ML requirements. ML applications and researchers are driving some of the most cutting-edge developments in optimization today. The intimate relation of optimization with ML is the key motivation for our workshop, which aims to foster discussion, discovery, and dissemination of the state of the art in optimization as relevant to machine learning.
Much recent interest has focused on stochastic methods, which can be used in an online setting and in settings where data sets are extremely large and high accuracy is not required. Many aspects of stochastic gradient methods remain to be explored, for example: algorithmic variants, customization to data set structure, convergence analysis, sampling techniques, software, choice of regularization and tradeoff parameters, and distributed and parallel computation. The need for an up-to-date analysis of algorithms for nonconvex problems remains an important practical issue, one that becomes even more pronounced as ML tackles increasingly complex mathematical models.
Finally, we do not wish to ignore the setting that is not particularly large scale, where one does have time to wield substantial computational resources. In this setting, high-accuracy solutions and a deep understanding of the lessons contained in the data are needed. Examples valuable to ML practitioners include the exploration of genetic and environmental data to identify risk factors for disease, or problems where the amount of observed data is not huge but the mathematical model is complex.
Author Information
Suvrit Sra (MIT)
Suvrit Sra is a faculty member within the EECS department at MIT, where he is also a core faculty member of IDSS, LIDS, the MIT-ML Group, and the statistics and data science center. His research spans topics in optimization, matrix theory, differential geometry, and probability theory, which he connects with machine learning; a key focus of his research is the theme "Optimization for Machine Learning" (http://opt-ml.org).
Alekh Agarwal (Microsoft Research)
More from the Same Authors
-
2022 : Provable Benefits of Representational Transfer in Reinforcement Learning »
Alekh Agarwal · Yuda Song · Kaiwen Wang · Mengdi Wang · Wen Sun · Xuezhou Zhang -
2022 Poster: CCCP is Frank-Wolfe in disguise »
Alp Yurtsever · Suvrit Sra -
2022 Poster: Efficient Sampling on Riemannian Manifolds via Langevin MCMC »
Xiang Cheng · Jingzhao Zhang · Suvrit Sra -
2022 Poster: On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL »
Jinglin Chen · Aditya Modi · Akshay Krishnamurthy · Nan Jiang · Alekh Agarwal -
2022 Poster: Model-based RL with Optimistic Posterior Sampling: Structural Conditions and Sample Complexity »
Alekh Agarwal · Tong Zhang -
2021 Poster: Bellman-consistent Pessimism for Offline Reinforcement Learning »
Tengyang Xie · Ching-An Cheng · Nan Jiang · Paul Mineiro · Alekh Agarwal -
2021 Poster: Can contrastive learning avoid shortcut solutions? »
Joshua Robinson · Li Sun · Ke Yu · Kayhan Batmanghelich · Stefanie Jegelka · Suvrit Sra -
2021 Poster: Three Operator Splitting with Subgradients, Stochastic Gradients, and Adaptive Learning Rates »
Alp Yurtsever · Alex Gu · Suvrit Sra -
2021 Oral: Bellman-consistent Pessimism for Offline Reinforcement Learning »
Tengyang Xie · Ching-An Cheng · Nan Jiang · Paul Mineiro · Alekh Agarwal -
2020 : Invited speaker: SGD without replacement: optimal rate analysis and more, Suvrit Sra »
Suvrit Sra -
2020 Poster: SGD with shuffling: optimal rates without component convexity and large epoch requirements »
Kwangjun Ahn · Chulhee Yun · Suvrit Sra -
2020 Spotlight: SGD with shuffling: optimal rates without component convexity and large epoch requirements »
Kwangjun Ahn · Chulhee Yun · Suvrit Sra -
2020 Poster: Why are Adaptive Methods Good for Attention Models? »
Jingzhao Zhang · Sai Praneeth Karimireddy · Andreas Veit · Seungyeon Kim · Sashank Reddi · Sanjiv Kumar · Suvrit Sra -
2020 Poster: Policy Improvement via Imitation of Multiple Oracles »
Ching-An Cheng · Andrey Kolobov · Alekh Agarwal -
2020 Spotlight: Policy Improvement via Imitation of Multiple Oracles »
Ching-An Cheng · Andrey Kolobov · Alekh Agarwal -
2020 Poster: FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs »
Alekh Agarwal · Sham Kakade · Akshay Krishnamurthy · Wen Sun -
2020 Poster: Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes »
Yi Tian · Jian Qian · Suvrit Sra -
2020 Poster: PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning »
Alekh Agarwal · Mikael Henaff · Sham Kakade · Wen Sun -
2020 Spotlight: Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes »
Yi Tian · Jian Qian · Suvrit Sra -
2020 Oral: FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs »
Alekh Agarwal · Sham Kakade · Akshay Krishnamurthy · Wen Sun -
2020 Poster: Safe Reinforcement Learning via Curriculum Induction »
Matteo Turchetta · Andrey Kolobov · Shital Shah · Andreas Krause · Alekh Agarwal -
2020 Poster: Provably Good Batch Reinforcement Learning Without Great Exploration »
Yao Liu · Adith Swaminathan · Alekh Agarwal · Emma Brunskill -
2020 Spotlight: Safe Reinforcement Learning via Curriculum Induction »
Matteo Turchetta · Andrey Kolobov · Shital Shah · Andreas Krause · Alekh Agarwal -
2019 Poster: Flexible Modeling of Diversity with Strongly Log-Concave Distributions »
Joshua Robinson · Suvrit Sra · Stefanie Jegelka -
2019 Poster: Are deep ResNets provably better than linear predictors? »
Chulhee Yun · Suvrit Sra · Ali Jadbabaie -
2019 Poster: Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting »
Aditya Grover · Jiaming Song · Ashish Kapoor · Kenneth Tran · Alekh Agarwal · Eric Horvitz · Stefano Ermon -
2019 Poster: Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity »
Chulhee Yun · Suvrit Sra · Ali Jadbabaie -
2019 Spotlight: Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity »
Chulhee Yun · Suvrit Sra · Ali Jadbabaie -
2018 Poster: Direct Runge-Kutta Discretization Achieves Acceleration »
Jingzhao Zhang · Aryan Mokhtari · Suvrit Sra · Ali Jadbabaie -
2018 Spotlight: Direct Runge-Kutta Discretization Achieves Acceleration »
Jingzhao Zhang · Aryan Mokhtari · Suvrit Sra · Ali Jadbabaie -
2018 Poster: On Oracle-Efficient PAC RL with Rich Observations »
Christoph Dann · Nan Jiang · Akshay Krishnamurthy · Alekh Agarwal · John Langford · Robert Schapire -
2018 Poster: Exponentiated Strongly Rayleigh Distributions »
Zelda Mariet · Suvrit Sra · Stefanie Jegelka -
2018 Spotlight: On Oracle-Efficient PAC RL with Rich Observations »
Christoph Dann · Nan Jiang · Akshay Krishnamurthy · Alekh Agarwal · John Langford · Robert Schapire -
2018 Tutorial: Negative Dependence, Stable Polynomials, and All That »
Suvrit Sra · Stefanie Jegelka -
2017 Workshop: OPT 2017: Optimization for Machine Learning »
Suvrit Sra · Sashank J. Reddi · Alekh Agarwal · Benjamin Recht -
2017 Poster: Off-policy evaluation for slate recommendation »
Adith Swaminathan · Akshay Krishnamurthy · Alekh Agarwal · Miro Dudik · John Langford · Damien Jose · Imed Zitouni -
2017 Oral: Off-policy evaluation for slate recommendation »
Adith Swaminathan · Akshay Krishnamurthy · Alekh Agarwal · Miro Dudik · John Langford · Damien Jose · Imed Zitouni -
2017 Poster: Elementary Symmetric Polynomials for Optimal Experimental Design »
Zelda Mariet · Suvrit Sra -
2017 Poster: Polynomial time algorithms for dual volume sampling »
Chengtao Li · Stefanie Jegelka · Suvrit Sra -
2016 Workshop: OPT 2016: Optimization for Machine Learning »
Suvrit Sra · Francis Bach · Sashank J. Reddi · Niao He -
2016 : Taming non-convexity via geometry »
Suvrit Sra -
2016 Poster: Fast Mixing Markov Chains for Strongly Rayleigh Measures, DPPs, and Constrained Sampling »
Chengtao Li · Suvrit Sra · Stefanie Jegelka -
2016 Poster: Kronecker Determinantal Point Processes »
Zelda Mariet · Suvrit Sra -
2016 Demonstration: Project Malmo - Minecraft for AI Research »
Katja Hofmann · Matthew A Johnson · Fernando Diaz · Alekh Agarwal · Tim Hutton · David Bignell · Evelyne Viegas -
2016 Poster: Efficient Second Order Online Learning by Sketching »
Haipeng Luo · Alekh Agarwal · Nicolò Cesa-Bianchi · John Langford -
2016 Poster: Contextual semibandits via supervised learning oracles »
Akshay Krishnamurthy · Alekh Agarwal · Miro Dudik -
2016 Poster: PAC Reinforcement Learning with Rich Observations »
Akshay Krishnamurthy · Alekh Agarwal · John Langford -
2016 Poster: Proximal Stochastic Methods for Nonsmooth Nonconvex Finite-Sum Optimization »
Sashank J. Reddi · Suvrit Sra · Barnabas Poczos · Alexander Smola -
2016 Poster: Riemannian SVRG: Fast Stochastic Optimization on Riemannian Manifolds »
Hongyi Zhang · Sashank J. Reddi · Suvrit Sra -
2016 Tutorial: Large-Scale Optimization: Beyond Stochastic Gradient Descent and Convexity »
Suvrit Sra · Francis Bach -
2015 Workshop: Optimization for Machine Learning (OPT2015) »
Suvrit Sra · Alekh Agarwal · Leon Bottou · Sashank J. Reddi -
2015 Poster: Efficient and Parsimonious Agnostic Active Learning »
Tzu-Kuo Huang · Alekh Agarwal · Daniel Hsu · John Langford · Robert Schapire -
2015 Spotlight: Efficient and Parsimonious Agnostic Active Learning »
Tzu-Kuo Huang · Alekh Agarwal · Daniel Hsu · John Langford · Robert Schapire -
2015 Poster: Matrix Manifold Optimization for Gaussian Mixtures »
Reshad Hosseini · Suvrit Sra -
2015 Poster: On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants »
Sashank J. Reddi · Ahmed Hefny · Suvrit Sra · Barnabas Poczos · Alexander Smola -
2015 Poster: Fast Convergence of Regularized Learning in Games »
Vasilis Syrgkanis · Alekh Agarwal · Haipeng Luo · Robert Schapire -
2015 Oral: Fast Convergence of Regularized Learning in Games »
Vasilis Syrgkanis · Alekh Agarwal · Haipeng Luo · Robert Schapire -
2014 Workshop: OPT2014: Optimization for Machine Learning »
Zaid Harchaoui · Suvrit Sra · Alekh Agarwal · Martin Jaggi · Miro Dudik · Aaditya Ramdas · Jean Lasserre · Yoshua Bengio · Amir Beck -
2014 Poster: Efficient Structured Matrix Rank Minimization »
Adams Wei Yu · Wanli Ma · Yaoliang Yu · Jaime Carbonell · Suvrit Sra -
2014 Poster: Scalable Non-linear Learning with Adaptive Polynomial Expansions »
Alekh Agarwal · Alina Beygelzimer · Daniel Hsu · John Langford · Matus J Telgarsky -
2013 Workshop: Learning Faster From Easy Data »
Peter Grünwald · Wouter M Koolen · Sasha Rakhlin · Nati Srebro · Alekh Agarwal · Karthik Sridharan · Tim van Erven · Sebastien Bubeck -
2013 Workshop: OPT2013: Optimization for Machine Learning »
Suvrit Sra · Alekh Agarwal -
2013 Poster: Geometric optimisation on positive definite matrices for elliptically contoured distributions »
Suvrit Sra · Reshad Hosseini -
2013 Poster: Reflection methods for user-friendly submodular optimization »
Stefanie Jegelka · Francis Bach · Suvrit Sra -
2012 Poster: A new metric on the manifold of kernel matrices with application to matrix geometric means »
Suvrit Sra -
2012 Poster: Stochastic optimization and sparse statistical recovery: Optimal algorithms for high dimensions »
Alekh Agarwal · Sahand N Negahban · Martin J Wainwright -
2012 Poster: Scalable nonconvex inexact proximal splitting »
Suvrit Sra -
2011 Workshop: Computational Trade-offs in Statistical Learning »
Alekh Agarwal · Sasha Rakhlin -
2011 Workshop: Optimization for Machine Learning »
Suvrit Sra · Stephen Wright · Sebastian Nowozin -
2011 Poster: Distributed Delayed Stochastic Optimization »
Alekh Agarwal · John Duchi -
2011 Poster: Stochastic convex optimization with bandit feedback »
Alekh Agarwal · Dean P Foster · Daniel Hsu · Sham M Kakade · Sasha Rakhlin -
2010 Workshop: Learning on Cores, Clusters, and Clouds »
Alekh Agarwal · Lawrence Cayton · Ofer Dekel · John Duchi · John Langford -
2010 Workshop: Numerical Mathematics Challenges in Machine Learning »
Matthias Seeger · Suvrit Sra -
2010 Workshop: Optimization for Machine Learning »
Suvrit Sra · Sebastian Nowozin · Stephen Wright -
2010 Spotlight: Distributed Dual Averaging In Networks »
John Duchi · Alekh Agarwal · Martin J Wainwright -
2010 Poster: Distributed Dual Averaging In Networks »
John Duchi · Alekh Agarwal · Martin J Wainwright -
2010 Oral: Fast global convergence rates of gradient methods for high-dimensional statistical recovery »
Alekh Agarwal · Sahand N Negahban · Martin J Wainwright -
2010 Poster: Fast global convergence rates of gradient methods for high-dimensional statistical recovery »
Alekh Agarwal · Sahand N Negahban · Martin J Wainwright -
2009 Workshop: Optimization for Machine Learning »
Sebastian Nowozin · Suvrit Sra · S.V.N Vishwanathan · Stephen Wright -
2009 Poster: Information-theoretic lower bounds on the oracle complexity of convex optimization »
Alekh Agarwal · Peter Bartlett · Pradeep Ravikumar · Martin J Wainwright -
2009 Spotlight: Information-theoretic lower bounds on the oracle complexity of convex optimization »
Alekh Agarwal · Peter Bartlett · Pradeep Ravikumar · Martin J Wainwright -
2008 Workshop: Optimization for Machine Learning »
Suvrit Sra · Sebastian Nowozin · Vishwanathan S V N -
2007 Poster: An Analysis of Inference with the Universum »
Fabian H Sinz · Olivier Chapelle · Alekh Agarwal · Bernhard Schölkopf -
2007 Spotlight: An Analysis of Inference with the Universum »
Fabian H Sinz · Olivier Chapelle · Alekh Agarwal · Bernhard Schölkopf