Resource-Efficient Machine Learning

Workshop

Resource-Efficient Machine Learning

Yevgeny Seldin · Yasin Abbasi Yadkori · Yacov Crammer · Ralf Herbrich · Peter Bartlett

Harvey's Emerald Bay 3

[ Abstract ] Workshop Website

Resource efficiency is key for making ideas practical. It is crucial in many tasks, ranging from large-scale learning ("big data'') to small-scale mobile devices. Understanding resource efficiency is also important for understanding of biological systems, from individual cells to complex learning systems, such as the human brain. The goal of this workshop is to improve our fundamental theoretical understanding and link between various applications of learning under constraints on the resources, such as computation, observations, communication, and memory. While the founding fathers of machine learning were mainly concerned with characterizing the sample complexity of learning (the observations resource) [VC74] it now gets realized that fundamental understanding of other resource requirements, such as computation, communication, and memory is equally important for further progress [BB11].

The problem of resource-efficient learning is multidimensional and we already see some parts of this puzzle being assembled. One question is the interplay between the requirements on different resources. Can we use more of one resource to save on a different resource? For example, the dependence between computation and observations requirements was studied in [SSS08,SSST12,SSB12]. Another question is online learning under various budget constraints [AKKS12,BKS13,CKS04,DSSS05,CBG06]. One example that Badanidiyuru et al. provide is dynamic pricing with limited supply, where we have a limited number of items to sell and on each successful sale transaction we lose one item. A related question of online learning under constraints on information acquisition was studied in [SBCA13], where the constraints could be computational (information acquisition required computation) or monetary. Yet another direction is adaptation of algorithms to the complexity of operation environment. Such adaptation can allow resource consumption to reflect the hardness of a situation being faced. An example of such adaptation in multiarmed bandits with side information was given in [SAL+11]. Another way of adaptation is interpolation between stochastic and adversarial environments. At the moment there are two prevailing formalisms for modeling the environment, stochastic and adversarial (also known as the average case'' andthe worst case''). But in reality the environment is often neither stochastic, nor adversarial, but something in between. It is, therefore, crucial to understand the intermediate regime. First steps in this direction were done in [BS12]. And, of course, one of the flagman problems nowadays is ``big data'', where the constraint shifts from the number of observations to computation. We strongly believe that there are deep connections between problems at various scales and with various resource constraints and there are basic principles of learning under resource constraints that are yet to be discovered. We invite researchers to share their practical challenges and theoretical insights on this problem.

One additional important direction is design of resource-dependent performance measures. In the past, algorithms were compared in terms of predictive accuracy (classification errors, AUC, F-measures, NDCG, etc.), yet there is a need to evaluate them with additional metrics related to resources, such as memory, CPU time, and even power. For example, reward per computation operation. This theme will also be discussed at the workshop.

References:
[AKKS12] Kareem Amin, Michael Kearns, Peter Key and Anton Schwaighofer. Budget Optimization for Sponsored Search: Censored Learning in MDPs. UAI 2012.
[BB11] Leon Bottou and Olivier Bousquet. The trade-offs of large scale learning. In Suvrit Sra, Sebastian Nowozin, and Stephen J. Wright, editors, Optimization for Machine Learning. MIT Press, 2011.
[BKS13] Ashwinkumar Badanidiyuru, Robert Kleinberg and Aleksandrs Slivkins. Bandits with Knapsacks. FOCS, 2013.
[BS12] Sebastien Bubeck and Aleksandrs Slivkins. The best of both worlds: stochastic and adversarial bandits. COLT, 2012.
[CBG06] Nicolò Cesa-Bianchi and Claudio Gentile. Tracking the best hyperplane with a simple budget perceptron. COLT 2006.
[CKS04] Koby Crammer, Jaz Kandola and Yoram Singer. Online Classification on a Budget. NIPS 2003.
[DSSS05] Ofer Dekel, Shai Shalev-shwartz and Yoram Singer. The Forgetron: A kernel-based perceptron on a fixed budget. NIPS 2004.
[SAL+11] Yevgeny Seldin, Peter Auer, François Laviolette, John Shawe-Taylor, and Ronald Ortner. PAC-Bayesian Analysis of Contextual Bandits. NIPS, 2011.
[SBCA13] Yevgeny Seldin, Peter Bartlett, Koby Crammer, and Yasin Abbasi-Yadkori. Prediction with Limited Advice and Multiarmed Bandits with Paid Observations. 2013.
[SSB12] Shai Shalev-Shwartz and Aharon Birnbaum. Learning halfspaces with the zero-one loss: Time-accuracy trade-offs. NIPS, 2012.
[SSS08] Shai Shalev-Shwartz and Nathan Srebro. SVM Optimization: Inverse Dependence on Training Set Size. ICML, 2008.
[SSST12] Shai Shalev-Shwartz, Ohad Shamir, and Eran Tromer. Using more data to speed-up training time. AISTATS, 2012.
[VC74] Vladimir N. Vapnik and Alexey Ya. Chervonenkis. Theory of pattern recognition. Nauka, Moscow (in Russian), 1974. German translation: W.N.Wapnik, A.Ya.Tschervonenkis (1979), Theorie der Zeichenerkennug, Akademia, Berlin.

Live content is unavailable. Log in and register to view live content