Since its early days, the field of Machine Learning has focused on developing computationally tractable algorithms with good learning guarantees. The vast literature on statistical learning theory has led to a good understanding of how the predictive performance of different algorithms improves as a function of the number of training samples. By the same token, the well-developed theories of optimization and sampling methods have yielded efficient computational techniques at the core of most modern learning methods. These separate developments mean that, given an algorithm, we have a sound understanding of its statistical and computational behavior in isolation. However, there has been little joint study of the computational and statistical complexities of learning, and as a consequence, little is known about the interaction and trade-offs between statistical accuracy and computational complexity. Indeed, a systematic joint treatment can answer some very interesting questions: What is the best attainable statistical error given a finite computational budget? What is the best learning method to use under different computational constraints and desired statistical yardsticks? Do simple methods outperform complex ones in computationally impoverished scenarios?
At its core, the PAC framework aims to study learning through the lens of computation. Its thrust, however, is on separating polynomial-time algorithms from computationally intractable ones. Yet all polynomial-time computations are hardly equivalent: the difference between a linear and a quadratic dependence on problem parameters can have a profound effect on the applicability of an algorithm. Understanding the trade-offs between statistical accuracy and computational demands at this finer granularity is of paramount importance.
The need for such a theory is more compelling now than ever before, since we routinely face training corpora with billions of examples and, often, an even larger number of parameters to be estimated. The emergence of the web and Mechanical Turk as sources of training data often stretches learning algorithms to the point where the bottleneck is no longer the number of examples but the amount of computation available to process them. A theory for choosing, in a principled manner, among a multitude of learning methods based on the properties of the training examples as well as the available computational resources would be of clear interest. An equivalent way to pose the problem is to design algorithms that take a computational constraint as input and learn the best hypothesis they can from the available budget and data.
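One way to make this budgeted view concrete is to phrase a learner as a procedure that accepts an explicit budget of elementary operations. The following is a minimal sketch of that idea; the `BudgetedLearner` interface and the SGD instantiation are illustrative assumptions, not an algorithm from the literature discussed here:

```python
import numpy as np

class BudgetedLearner:
    """Hypothetical interface: fit the best hypothesis possible
    within an explicit budget of elementary operations."""

    def fit(self, X, y, budget):
        raise NotImplementedError

class BudgetedSGDClassifier(BudgetedLearner):
    """Linear classifier trained by SGD on the hinge loss; the
    budget caps the total number of single-example updates."""

    def __init__(self, lr=0.01):
        self.lr = lr

    def fit(self, X, y, budget):
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(budget):          # one update per unit of budget
            i = np.random.randint(n)     # draw a random training example
            if y[i] * X[i].dot(w) < 1:   # hinge-loss subgradient step
                w += self.lr * y[i] * X[i]
        self.w = w
        return self

    def predict(self, X):
        return np.sign(X.dot(self.w))
```

Under this interface, the questions above become questions about the curve traced out as `budget` grows for a fixed data distribution.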
Several lines of work address different facets of the above problem. Researchers working on massive datasets in the CS theory community study streaming methods, which impose constraints on both the computational and storage requirements of an algorithm. Online learning presents one particular way of dealing with a computational budget: processing as many samples as possible within the available computation.
More directly relevant work has appeared in the machine learning community in the last few years. Bottou and Bousquet (2008) compare the amount of computation that several routinely used optimization algorithms need to attain a given statistical error. Shalev-Shwartz and Srebro (2009) show how stochastic gradient descent applied to SVM optimization can exhibit an inverse dependence of runtime on the number of training samples in the regime of large datasets. In more recent work, Shalev-Shwartz and co-authors have also used cryptographic conjectures to establish the computational hardness of certain learning problems. On the algorithmic front, coarse-to-fine learning provides a framework for systematically incorporating computational considerations, using computational cost as a regularization term in the objective of the learning method. Other budgeted algorithms, such as budgeted SVMs and budgeted perceptrons, admit hard budget constraints on the running time and storage of the algorithm.
The goals of our workshop are:
* To draw the attention of machine learning researchers to this rich and emerging area of problems and to establish a community of researchers that are interested in understanding these trade-offs.
* To define a number of common problems in this area and to encourage future research that is comparable and compatible.
* To expose the learning community to relevant work in fields such as CS theory and convex optimization.
We will call for papers on the following topics:
* Fundamental statistical limits with bounded computation
* Trade-offs between statistical accuracy and computational costs
* Algorithms to learn under budget constraints
* Budget constraints on other resources (like bounded memory)
* Computationally aware approaches such as coarse-to-fine learning
Author Information
Alekh Agarwal (Microsoft Research)
Sasha Rakhlin (University of Pennsylvania)