Workshops
Manik Varma · John Langford

[ Harrah's Fallen+Marla ]

Extreme classification, where one needs to deal with multi-class and multi-label problems involving a very large number of categories, has opened up a new research frontier in machine learning. Many challenging applications, such as photo and video annotation and web page categorization, can benefit from being formulated as supervised learning tasks with millions, or even billions, of categories. Extreme classification can also give a fresh perspective on core learning problems such as ranking and recommendation by reformulating them as multi-class/label tasks where each item to be ranked or recommended is a separate category.

Extreme classification raises a number of interesting research questions including those related to:

* Large scale learning and distributed and parallel training
* Efficient sub-linear prediction and prediction on a test-time budget (see the sketch after this list)
* Crowd sourcing and other efficient techniques for harvesting training data
* Dealing with training set biases and label noise
* Fine-grained classification
* Tackling label polysemy, synonymy and correlations
* Structured output prediction and multi-task learning
* Learning from highly imbalanced data
* Learning from very few data points per category
* Learning from missing and incorrect labels
* Feature extraction, feature sharing, lazy feature evaluation, etc.
* Performance evaluation
* Statistical analysis and …
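As a concrete illustration of the sub-linear prediction bullet above, here is a minimal, hypothetical sketch (not any particular published method) of tree-structured prediction: a balanced label tree routes an input down one root-to-leaf path, so classifying among K categories costs O(log K) dot products rather than the O(K) of one-vs-rest scoring. The routers here are random stand-ins for classifiers that would be trained in a real system.

```python
import numpy as np

class LabelTreeNode:
    """Internal nodes route left/right with a linear classifier;
    leaves store a single label id. Routers are random placeholders
    for classifiers that would be trained in a real system."""
    def __init__(self, labels, dim, rng):
        if len(labels) == 1:
            self.label, self.w = labels[0], None
            return
        self.label = None
        self.w = rng.standard_normal(dim)
        mid = len(labels) // 2
        self.left = LabelTreeNode(labels[:mid], dim, rng)
        self.right = LabelTreeNode(labels[mid:], dim, rng)

    def predict(self, x):
        # One root-to-leaf descent: O(log K) dot products,
        # versus O(K) for scoring every category independently.
        node = self
        while node.label is None:
            node = node.left if node.w @ x <= 0 else node.right
        return node.label

rng = np.random.default_rng(0)
K, dim = 2**12, 50          # 4096 categories, depth-12 tree
tree = LabelTreeNode(list(range(K)), dim, rng)
print(tree.predict(rng.standard_normal(dim)))
```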

Georg Langs · Brian Murphy · Kai-min K Chang · Paolo Avesani · James Haxby · Nikolaus Kriegeskorte · Susan Whitfield-Gabrieli · Irina Rish · Guillermo Cecchi · Raif Rustamov · Marius Kloft · Jonathan Young · Sina Ghiassian · Michael Coen

[ Harvey's Sierra ]

Aim of the workshop

We propose a two-day workshop on machine learning approaches in neuroscience and neuroimaging, with a specific extension to behavioral experiments and psychology. We believe that machine learning has a prominent role in shaping how questions in neuroscience are framed, and that the machine-learning mindset is now entering modern psychology and behavioral studies. It is equally important that practical applications in these fields motivate a rapidly evolving line of research in the machine learning community. In this context, many controversies and open questions exist.

The goal of the workshop is to pinpoint the most pressing issues and common challenges across the fields, and to sketch future directions and open questions in the light of novel methodology. The workshop aims to offer a forum that joins the machine learning, neuroscience, and psychology communities, and to facilitate formulating and discussing the issues at their interface.

Motivated by two previous workshops, MLINI '11 and MLINI '12, we will center this workshop around invited talks and two panel discussions. Triggered by these discussions, this year we plan to adapt the workshop topics to a less traditional scope that investigates the role of machine learning in neuroimaging of …

Thomas Walsh · Alborz Geramifard · Marc Deisenroth · Jonathan How · Jan Peters

[ Harvey's Emerald Bay 1 ]

Closed-loop control of systems based on sensor readings in uncertain domains is a hallmark of research in the Control, Artificial Intelligence, and Neuroscience communities. Various sensorimotor frameworks have been effective at controlling physical and biological systems, from flying airplanes to moving artificial limbs, but many techniques rely on accurate models or other concrete domain knowledge to derive useful policies. In systems where such specifications are not available, the task of generating usable models, or even deriving controllers directly from data, often falls within the purview of machine learning algorithms.

Advances in machine learning, including non-parametric Bayesian modeling/inference and reinforcement learning, have increased the range, accuracy, and speed of deriving models and policies from data. However, incorporating modern machine learning techniques into real-world sensorimotor control systems can still be challenging due to the learner's underlying assumptions, the need to model uncertainty, and the scale of such problems. More specifically, many advanced machine learning algorithms rely either on strong distributional assumptions or on random access to all possible data points, neither of which may be guaranteed when used with a specific control algorithm on a physical or biological system. In addition, planners need to consider, and learners need to indicate, uncertainty in the …

Suvrit Sra · Alekh Agarwal

[ Harrah's Tahoe C ]

OPT2013: Optimization for Machine Learning

As the sixth workshop in its series, OPT 2013 stands on the significant precedent established by OPT 2008 through OPT 2012, all of which were very well-received NIPS workshops.

The previous OPT workshops enjoyed packed (to overpacked) attendance, and this enthusiastic reception underscores the strong interest, relevance, and importance that optimization enjoys in the ML community.

This interest has grown remarkably year over year, which is no surprise, since optimization lies at the heart of most ML algorithms. Although classical textbook algorithms sometimes suffice, the majority of ML problems require methods tailored to a deeper understanding of the learning task. Indeed, ML applications and researchers are driving some of the most cutting-edge developments in optimization today. This intimate relationship between optimization and ML is the key motivation for our workshop, which aims to foster discussion, discovery, and dissemination of the state of the art in optimization as relevant to machine learning.
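As a small, generic example of the kind of ML-tailored optimizer the paragraph alludes to (our illustration, not a workshop contribution): stochastic gradient descent on L2-regularized logistic regression, which replaces expensive full-batch steps with one cheap gradient per example.

```python
import numpy as np

def sgd_logistic(X, y, lam=1e-3, epochs=5, eta0=1.0):
    """SGD for L2-regularized logistic regression with labels in {-1, +1}.
    One stochastic gradient per example, with the standard 1/t-style
    decaying step size for strongly convex objectives."""
    n, d = X.shape
    w, t = np.zeros(d), 0
    for _ in range(epochs):
        for i in np.random.permutation(n):
            t += 1
            eta = eta0 / (1.0 + lam * eta0 * t)
            margin = y[i] * (X[i] @ w)
            grad = lam * w - y[i] * X[i] / (1.0 + np.exp(margin))
            w -= eta * grad
    return w

# toy run on linearly separable data
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10))
y = np.sign(X @ rng.standard_normal(10))
w = sgd_logistic(X, y)
print(np.mean(np.sign(X @ w) == y))   # training accuracy, close to 1.0
```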

Dilan Gorur · Romer Rosales · Olivier Chapelle · Dorota Glowacka

[ Harvey's Emerald Bay 4 ]


---------------------------------------------------
Morning Session
---------------------------------------------------
07:30 - 07:40 Welcome and introduction

07:40 - 08:20 Kilian Weinberger - Feature Hashing for Large Scale Classifier Personalization

08:20 - 08:45 Laurent Charlin - Leveraging user libraries to bootstrap collaborative filtering

08:45 - 09:00 Poster spotlight presentations

09:00 - 09:30 Coffee break and poster session

09:30 - 10:10 Susan Dumais - Personalized Search: Potential and Pitfalls

10:10 - 10:30 Discussion, followed by poster session

---------------------------------------------------
10:30 - 15:30 Lunch + Skiing
---------------------------------------------------

---------------------------------------------------
Afternoon Session
---------------------------------------------------
15:30 - 16:10 Deepak Agarwal - Personalization and Computational Advertising at LinkedIn

16:10 - 16:35 Jason Weston - Nonlinear Latent Factorization by Embedding Multiple User Interests

16:35 - 17:05 Impromptu talks (new discussion topics and ideas encouraged)

17:05 - 17:45 Coffee break + Posters

17:45 - 18:25 Nando de Freitas - Recommendation and personalization: A startup perspective

18:25 - 19:00 Panel Discussion and wrap up

---------------------------------------------------


Personalization has become an important research topic in machine learning, fueled in part by its major significance in e-commerce and other businesses that aim to cater to user-specific preferences. Online products, news, search, media, and advertisement are some of the areas that have depended on some form …

David Lopez-Paz · Quoc V Le · Alexander Smola

[ Harvey's Emerald Bay 5 ]

As we enter the era of “big data”, Machine Learning algorithms that rely on heavy optimization routines rapidly become prohibitive. Perhaps surprisingly, randomization (Raghavan and Motwani, 1995) arises as a computationally cheaper, simpler alternative to optimization that in many cases leads to smaller and faster models with little or no loss in performance. Although randomized algorithms date back to the probabilistic method (Erdős, 1947; Alon & Spencer, 2000), these techniques have only recently started finding their way into Machine Learning. The most notable exceptions are stochastic methods for optimization and Markov chain Monte Carlo methods, both of which have become well established in the past two decades. This workshop aims to accelerate this process by bringing together researchers in this area and exposing them to recent developments.

The target audience is researchers and practitioners looking for scalable, compact, and fast solutions for learning in the large-scale setting.

Specific questions of interest include, but are not limited to:

- Randomized projections: locality sensitive hashing, hash kernels, counter braids, count sketches, optimization.
- Randomized function classes: Fourier features, Random Kitchen Sinks, Nyström methods, Fastfood, random-basis neural networks (see the sketch after this list).
- Sparse reconstructions: compressed sensing, error correcting output codes, reductions of inference problems to binary.
- Compressive …
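To make the randomized-function-classes item concrete, here is a minimal sketch of random Fourier features in the spirit of Rahimi and Recht's Random Kitchen Sinks (our toy illustration; the function and parameter names are ours): after the random map, plain dot products approximate RBF kernel evaluations.

```python
import numpy as np

def random_fourier_features(X, n_features=500, gamma=1.0, seed=0):
    """Map X so that dot products of features approximate the RBF kernel
    k(x, z) = exp(-gamma * ||x - z||^2). Frequencies are drawn from the
    kernel's spectral density, a Gaussian with std sqrt(2 * gamma)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(X.shape[1], n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# sanity check: feature dot products track the exact kernel matrix
rng = np.random.default_rng(1)
X = rng.standard_normal((5, 3))
Z = random_fourier_features(X, n_features=20000)
approx = Z @ Z.T
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
exact = np.exp(-1.0 * sq_dists)
print(np.abs(approx - exact).max())   # shrinks as n_features grows
```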

Yuhong Guo · Dale Schuurmans · Richard Zemel · Samy Bengio · Yoshua Bengio · Li Deng · Dan Roth · Kilian Q Weinberger · Jason Weston · Kihyuk Sohn · Florent Perronnin · Gabriel Synnaeve · Pablo R Strasser · julien audiffren · Carlo Ciliberto · Dan Goldwasser

[ Harrah's Sand Harbor III ]

Modern data analysis is increasingly facing prediction problems that have complex and high dimensional output spaces. For example, document tagging problems regularly consider large (and sometimes hierarchical) sets of output tags; image tagging problems regularly consider tens of thousands of possible output labels; natural language processing tasks have always considered complex output spaces. In such complex and high dimensional output spaces the candidate labels are often too specialized---leading to sparse data for individual labels---or too generalized---leading to complex prediction maps being required. In such cases, it is essential to identify an alternative output representation that can provide latent output categories that abstract overly specialized labels, specialize overly abstract labels, or reveal the latent dependence between labels.

There is a growing body of work on learning output representations, distinct from current work on learning input representations. For example, in machine learning, work on multi-label learning, and particularly output dimensionality reduction in high dimensional label spaces, has begun to address the specialized label problem, while work on output kernel learning has begun to address the abstracted label problem. In computer vision, work on image categorization and tagging has begun to investigate simple forms of latent output representation learning to cope with abstract …

Hilbert J Kappen · Naftali Tishby · Jan Peters · Evangelos Theodorou · David H Wolpert · Pedro Ortega

[ Harrah's Tahoe D ]

How do you make decisions when there are way more possibilities than you can analyze? How do you decide under such information constraints?

Planning and decision-making with information constraints is at the heart of adaptive control, reinforcement learning, robotic path planning, experimental design, active learning, computational neuroscience and games. In most real-world problems, perfect planning is either impossible (computational intractability, lack of information, diminished control) or sometimes even undesirable (distrust, risk sensitivity, level of cooperation of the others). Recent developments have shown that a single method, based on the free energy functional borrowed from thermodynamics, provides a principled way of designing systems with information constraints that parallels Bayesian inference. This single method, known in the literature under various labels such as KL control, path integral control, linearly solvable stochastic control, and information-theoretic bounded rationality, is proving to be very general and powerful as a foundation for a novel class of probabilistic planning problems.
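For readers unfamiliar with the construction, the free-energy functional alluded to above can be written compactly as follows (a standard form; notation ours). The controlled distribution p trades expected cost against divergence from a passive or prior dynamics q, and the minimizer has the same exponential-reweighting form as a Bayesian posterior:

```latex
F[p] \;=\; \mathbb{E}_{p}\!\left[\, C(x) \,\right]
      \;+\; \tfrac{1}{\beta}\,\mathrm{KL}\!\left(p \,\|\, q\right),
\qquad
p^{*}(x) \;=\; \frac{q(x)\, e^{-\beta C(x)}}{\sum_{x'} q(x')\, e^{-\beta C(x')}},
```

where C is the cost and the inverse temperature β > 0 sets the information constraint; β → ∞ recovers deterministic cost minimization, while small β keeps p close to q.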

The goal of this workshop is twofold:

1) Give a comprehensive introduction to planning with information constraints, targeted at a wide audience with a machine learning background. Invited speakers will give an overview of the theoretical results and talk about their experience in applications to control, reinforcement learning, computational neuroscience and robotics. …

Neil D Lawrence · Joaquin Quiñonero-Candela · Tianshi Gao · James Hensman · Zoubin Ghahramani · Max Welling · David Blei · Ralf Herbrich

[ Harvey's Emerald Bay A ]

Processing of web-scale data sets has proven its worth in a range of applications, from ad-click prediction to large recommender systems. In most cases, learning needs to happen in real time, and the latency allowance for predictions is restrictive. Probabilistic predictions are critical in practice in web applications because optimizing the user experience requires computing the expected utilities of mutually exclusive pieces of content. The quality of the knowledge extracted from the available information is restricted by the complexity of the model.

One framework that enables complex modelling of data is probabilistic modelling. However, its applicability to big data is restricted by the difficulties of inference in complex probabilistic models, and by computational constraints.

This workshop will focus on applying probabilistic models to big data. Of interest will be algorithms that allow for inference in probabilistic models for big data such as stochastic variational inference and stochastic Monte Carlo. A particular focus will be on existing applications in big data and future applications that would benefit from such approaches.
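As one concrete example of the algorithms in scope, a single step of stochastic variational inference (in the style of Hoffman et al.; notation ours) updates a global variational parameter from a sampled minibatch rather than a full pass over the data:

```latex
% Sample a minibatch B, form the intermediate global parameter
% \hat\lambda_t(B) as if B were replicated to the full data size N,
% then blend it in with a decaying step size \rho_t.
\lambda_{t+1} \;=\; (1-\rho_t)\,\lambda_t \;+\; \rho_t\,\hat{\lambda}_t(B),
\qquad
\rho_t = (t+\tau)^{-\kappa}, \quad \kappa \in (0.5,\, 1].
```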

This workshop brings together leading academic and industrial researchers in probabilistic modelling and large scale data sets.

Stefanie Jegelka · Andreas Krause · Pradeep Ravikumar · Kazuo Murota · Jeffrey A Bilmes · Yisong Yue · Michael Jordan

[ Harrah's Sand Harbor I ]

Solving optimization problems with ultimately discrete solutions is becoming increasingly important in machine learning. At the core of statistical machine learning is making inferences from data, and when the variables underlying the data are discrete, both inferring the model from data and performing predictions with the estimated model are inherently discrete optimization problems. Many of these optimization problems are notoriously hard. As a result, abundant and steadily increasing amounts of data -- despite being statistically beneficial -- quickly render standard off-the-shelf optimization procedures either intractable, or at the very least impractical.

However, while many problems are hard in the worst case, the problems of practical interest are often much more well-behaved, or are well modeled by assuming properties that make them so. Indeed, many discrete problems in machine learning can possess beneficial structure; such structure has been an important ingredient in many successful (approximate) solution strategies. Examples include the marginal polytope, which is determined by the graph structure of the model, or sparsity that makes it possible to handle high dimensions. Symmetry and exchangeability are further exploitable characteristics. In addition, functional properties such as submodularity, a discrete analog of convexity, are proving to be …
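As a minimal illustration of how submodular structure is exploited (a generic sketch, not tied to any speaker): the classical greedy algorithm for maximizing a monotone submodular function under a cardinality constraint, which carries the (1 - 1/e) approximation guarantee of Nemhauser, Wolsey, and Fisher.

```python
def greedy_submodular(ground, f, k):
    """Greedily build S, |S| <= k, by adding the element with the largest
    marginal gain f(S + e) - f(S); for monotone submodular f this is a
    (1 - 1/e)-approximation to the best size-k set."""
    S = set()
    for _ in range(k):
        gains = {e: f(S | {e}) - f(S) for e in ground - S}
        best = max(gains, key=gains.get)
        if gains[best] <= 0:
            break
        S.add(best)
    return S

# example: max coverage, a canonical submodular objective
sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6}, 3: {1, 6}}
f = lambda S: len(set().union(*(sets[i] for i in S))) if S else 0
print(greedy_submodular(set(sets), f, k=2))   # picks {0, 2}, covering 6 items
```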

Yoshua Bengio · Hugo Larochelle · Russ Salakhutdinov · Tomas Mikolov · Matthew D Zeiler · David Mcallester · Nando de Freitas · Josh Tenenbaum · Jian Zhou · Volodymyr Mnih

[ Harrah's Sand Harbor II ]

Deep Learning algorithms attempt to discover good representations at multiple levels of abstraction. There has been rapid progress in this area in recent years, both in terms of algorithms and of applications, but many challenges remain. The workshop aims to bring together researchers in the field to discuss these challenges and to brainstorm new solutions.

Edo M Airoldi · David S Choi · Aaron Clauset · Khalid El-Arini · Jure Leskovec

[ Harvey's Emerald Bay 3 ]

Modern technology, including the World Wide Web, telecommunication devices and services, and large-scale data storage, has completely transformed the scale and concept of data in the sciences. Modern data sets are often enormous in size, detail, and heterogeneity, and are often best represented as highly annotated sequences of graphs. Although much progress has been made on developing rigorous tools for analyzing and modeling some types of large, complex, real-world networks, much work still remains and a principled, coherent framework remains elusive, in part because the analysis of networks is a growing and highly cross-disciplinary field.

This workshop aims to bring together a diverse and cross-disciplinary set of researchers in order to both describe recent advances and to discuss future directions for developing new network methods in statistics and machine learning. By network methods, we broadly include those models and algorithms whose goal is to learn the patterns of interaction, flow of information, or propagation of effects in social, biological, and economic systems. We will also welcome empirical studies in applied domains such as the social sciences, biology, medicine, neuroscience, physics, finance, social media, and economics.

While this research field is already broad and diverse, there are emerging signs of convergence, …

Alyson Fletcher · Dmitri B Chklovskii · Fritz Sommer · Ian H Stevenson

[ Harvey's Emerald Bay 6 ]

High-dimensional Statistical Inference in the Brain

Overview:
Understanding high-dimensional phenomena is at the heart of many fundamental questions in neuroscience. How does the brain process sensory data? How can we model the encoding of the richness of the inputs, and how do these representations lead to perceptual capabilities and higher level cognitive function? Similarly, the brain itself is a vastly complex nonlinear, highly-interconnected network and neuroscience requires tractable, generalizable models for these inherently high-dimensional neural systems.

Recent years have seen tremendous progress in high-dimensional statistics and methods for “big data” that may shed light on these fundamental questions. This workshop seeks to leverage these advances and bring together researchers in mathematics, machine learning, computer science, statistics and neuroscience to explore the roles of dimensionality reduction and machine learning in neuroscience.

Call for Papers
We invite high quality submissions of extended abstracts on topics including, but not limited to, the following fundamental questions:

-- How is high-dimensional sensory data encoded in neural systems? What insights can be gained from statistical methods in dimensionality reduction including sparse and overcomplete representations? How do we understand the apparent dimension expansion from thalamic to cortical representations from a machine learning and statistical perspective?

-- What …

Reza Zadeh · Gunnar Carlsson · Michael Mahoney · Manfred K. Warmuth · Wouter M Koolen · Nati Srebro · Satyen Kale · Malik Magdon-Ismail · Ashish Goel · Matei A Zaharia · David Woodruff · Ioannis Koutis · Benjamin Recht

[ Harvey's Tallac ]

Much of machine learning is based on linear algebra. Often, the prediction is a function of a dot product between the parameter vector and the feature vector. This essentially assumes some kind of independence between the features. In contrast, matrix parameters can be used to learn interrelations between features: the (i,j)th entry of the parameter matrix represents how feature i is related to feature j.

This richer modeling has become very popular. In some applications, like PCA and collaborative filtering, the explicit goal is inference of a matrix parameter. In others, like direction learning and topic modeling, the matrix parameter instead pops up in the algorithms as the natural tool to represent uncertainty.
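A toy numerical contrast of the two parameterizations (our illustration): a vector parameter scores features independently, while a matrix parameter scores all pairwise interactions, and a low-rank factorization of that matrix is exactly the form used in collaborative filtering.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
x = rng.standard_normal(d)

w = rng.standard_normal(d)        # vector parameter: features independent
W = rng.standard_normal((d, d))   # matrix parameter: W[i, j] couples i and j

linear_score = w @ x              # sum_i  w[i] * x[i]
bilinear_score = x @ W @ x        # sum_ij W[i, j] * x[i] * x[j]

# low-rank factorization W ~ U V^T, as in collaborative filtering:
# d*d parameters become 2*d*k, and scoring stays cheap.
k = 2
U, V = rng.standard_normal((d, k)), rng.standard_normal((d, k))
low_rank_score = (x @ U) @ (V.T @ x)
print(linear_score, bilinear_score, low_rank_score)
```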

The emergence of large matrices in many applications has brought with it a slew of new algorithms and tools. Over the past few years, matrix analysis and numerical linear algebra on large matrices have become a thriving field. Manipulating such large matrices also makes it necessary to think about computer systems issues.

This workshop aims to bring researchers in large scale machine learning and large scale numerical linear algebra closer together, to foster cross-talk between the two fields. The goal is to encourage machine learning researchers …

Xinghao Pan · Haijie Gu · Joseph Gonzalez · Sameer Singh · Yucheng Low · Joseph Hellerstein · Derek G Murray · Raghu Ramakrishnan · Michael Jordan · Christopher Ré

[ Harvey's Emerald Bay B ]

Explosive growth in data and the availability of cheap computing resources have sparked increasing interest in Big Learning within the Machine Learning community. Researchers are now taking on the challenge of parallelizing richly structured models that have inherently serial dependencies and do not admit straightforward solutions.

Database researchers, however, have a history of developing high performance systems that allow concurrent access while providing theoretical guarantees on correctness. In recent years, database systems have been developed specifically to tackle Big Learning tasks.

This workshop aims to bring together the two communities and facilitate the cross-pollination of ideas. Rather than passively using DB systems, ML researchers can apply major DB concepts to their work; DB researchers stand to gain an understanding of the ML challenges and better guide the development of their Big Learning systems.

The goals of the workshop are
- Identify challenges faced by ML practitioners in the Big Learning setting
- Showcase recent and ongoing progress towards parallel ML algorithms
- Highlight recent and significant DB research in addressing Big Learning problems
- Introduce DB implementations of Big Learning systems, and the principal considerations and concepts underlying their designs

Focal points for discussions and solicited submissions include but are not limited to: …

Jennifer Wortman Vaughan · Greg Stoddard · Chien-Ju Ho · Adish Singla · Michael Bernstein · Devavrat Shah · Arpita Ghosh · Evgeniy Gabrilovich · Denny Zhou · Nikhil Devanur · Xi Chen · Alexander Ihler · Qiang Liu · Genevieve Patterson · Ashwinkumar Badanidiyuru Varadaraja · Hossein Azari Soufiani · Jacob Whitehill

[ Harrah's Tahoe A+B ]

All machine learning systems integrate data, which store human or physical knowledge, with algorithms that discover knowledge patterns and make predictions on new instances. Even though most research attention has been focused on developing more efficient learning algorithms, it is the quality and amount of training data that predominantly govern the performance of real-world systems. This is only amplified by the recent popularity of large scale and complicated learning systems such as deep networks, which require millions to billions of training examples to perform well. Unfortunately, the traditional methods of collecting data from specialized workers are usually expensive and slow. In recent years, however, the situation has dramatically changed with the emergence of crowdsourcing, where huge amounts of labeled data are collected from large groups of (usually online) workers for low or no cost. Many machine learning tasks, such as computer vision and natural language processing, are increasingly benefiting from crowdsourced data platforms such as Amazon Mechanical Turk and CrowdFlower. On the other hand, tools from machine learning, game theory and mechanism design can help to address many challenging problems in crowdsourcing systems, such as making them more reliable, efficient and less expensive.

In this workshop, we …

Arthur Gretton · Mladen Kolar · Samory Kpotufe · John Lafferty · Han Liu · Bernhard Schölkopf · Alexander Smola · Rob Nowak · Mikhail Belkin · Lorenzo Rosasco · peter bickel · Yue Zhao

[ Harvey's Zephyr ]

Modern data acquisition routinely produces massive and complex datasets. Examples are data from high throughput genomic experiments, climate data from worldwide data centers, robotic control data collected over time in adversarial settings, user-behavior data from social networks, user preferences on online markets, and so forth. Modern pattern recognition problems arising in such disciplines are characterized by large data sizes, a large number of observed variables, and increased pattern complexity. Nonparametric methods, which can handle generally complex patterns, are therefore ever more relevant for modern data analysis. However, the larger data sizes and numbers of variables constitute new challenges for nonparametric methods in general. The aim of this workshop is to bring together both theoretical and applied researchers to discuss these modern challenges in detail, share insight on existing solutions, and lay out some of the important future directions.

Through a number of invited and contributed talks and a focused panel discussion, we plan to emphasize the importance of nonparametric methods and present challenges for modern nonparametric methods. In particular, we focus on the following aspects of nonparametric methods:

A. General motivations for nonparametric methods:

* the abundance of modern applications where little is known about data generating mechanisms (e.g., robotics, biology, social …

Tamir Hazan · George Papandreou · Sasha Rakhlin · Danny Tarlow

[ Harvey's Emerald Bay 2 ]

In nearly all machine learning tasks, decisions must be made given current knowledge (e.g., choose which label to predict). Perhaps surprisingly, always making the best decision is not always the best strategy, particularly while learning. Recently, an emerging body of work has studied learning under different rules that apply perturbations to the decision procedure. These works provide simple and efficient learning rules with improved theoretical guarantees. This workshop will bring together the growing community of researchers interested in different aspects of this area, and it will broaden our understanding of why and how perturbation methods can be useful.

Last year, at the highly successful NIPS workshop on Perturbations, Optimization, and Statistics, we looked at how injecting perturbations (whether it be random or adversarial “noise”) into learning and inference procedures can be beneficial. The focus was on two angles: first, on how stochastic perturbations can be used to construct new types of probability models for structured data; and second, how deterministic perturbations affect the regularization and the generalization properties of learning algorithms.
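A minimal instance of the stochastic-perturbation angle is the Gumbel-max trick (our toy sketch): perturbing unnormalized log-scores with i.i.d. Gumbel noise and taking the argmax draws an exact sample from the corresponding softmax distribution, turning an optimization routine into a sampler.

```python
import numpy as np

def gumbel_max_sample(scores, rng):
    """argmax over Gumbel-perturbed log-scores is an exact sample
    from softmax(scores)."""
    return np.argmax(scores + rng.gumbel(size=len(scores)))

rng = np.random.default_rng(0)
scores = np.array([1.0, 2.0, 0.5])
draws = [gumbel_max_sample(scores, rng) for _ in range(100_000)]
freq = np.bincount(draws, minlength=3) / len(draws)
print(freq)                                    # empirical frequencies
print(np.exp(scores) / np.exp(scores).sum())   # softmax target, should match
```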

The goal of this workshop is to expand the scope of last year and also explore different ways to apply perturbations within optimization and statistics to enhance and …

Isabelle Guyon · Leon Bottou · Bernhard Schölkopf · Alexander Statnikov · Evelyne Viegas · james m robins

[ Harrah's Glenbrook+Emerald ]

The goal of this workshop is to discuss new methods of large scale experiment design and their application to the inference of causal mechanisms and promote their evaluation via a series of challenges. Emphasis will be put on capitalizing on massive amounts of available observational data to cut down the number of experiments needed, pseudo- or quasi-experiments, iterative designs, and the on-line acquisition of data with minimal perturbation of the system under study. The participants of the cause-effect pairs challenge http://www.causality.inf.ethz.ch/cause-effect.php will be encouraged to submit papers.

The problem of attributing causes to effects is pervasive in science, medicine, economics and almost every aspect of our everyday life involving human reasoning and decision making. What affects your health? The economy? Climate change? The gold standard for establishing causal relationships is to perform randomized controlled experiments. However, experiments are costly, while non-experimental "observational" data collected routinely around the world are readily available. Unraveling potential cause-effect relationships from such observational data could save a lot of time and effort by allowing us to prioritize confirmatory experiments. This could be complemented by new strategies of incremental experimental design combining observational and experimental data.

Much of machine learning has been so far concentrating on …

Yevgeny Seldin · Yasin Abbasi Yadkori · Yacov Crammer · Ralf Herbrich · Peter Bartlett

[ Harvey's Emerald Bay 3 ]

Resource efficiency is key for making ideas practical. It is crucial in many tasks, ranging from large-scale learning ("big data") to small-scale mobile devices. Understanding resource efficiency is also important for understanding biological systems, from individual cells to complex learning systems such as the human brain. The goal of this workshop is to improve our fundamental theoretical understanding of, and the links between, various applications of learning under constraints on resources such as computation, observations, communication, and memory. While the founding fathers of machine learning were mainly concerned with characterizing the sample complexity of learning (the observations resource) [VC74], it is now recognized that a fundamental understanding of other resource requirements, such as computation, communication, and memory, is equally important for further progress [BB11].

The problem of resource-efficient learning is multidimensional and we already see some parts of this puzzle being assembled. One question is the interplay between the requirements on different resources. Can we use more of one resource to save on a different resource? For example, the dependence between computation and observations requirements was studied in [SSS08,SSST12,SSB12]. Another question is online learning under various budget constraints [AKKS12,BKS13,CKS04,DSSS05,CBG06]. One example that Badanidiyuru et al. provide is dynamic pricing with limited …

Matthew Hoffman · Jasper Snoek · Nando de Freitas · Michael A Osborne · Ryan Adams · Sebastien Bubeck · Philipp Hennig · Remi Munos · Andreas Krause

[ Harvey's Emerald Bay A ]

There have been many recent advances in the development of machine learning approaches for active decision making and optimization. These advances have occurred in seemingly disparate communities, each referring to the problem using different terminology: Bayesian optimization, experimental design, bandits, active sensing, automatic algorithm configuration, personalized recommender systems, etc. Recently, significant progress has been made in improving the methodologies used to solve high-dimensional problems and applying these techniques to challenging optimization tasks with limited and noisy feedback. This progress is particularly apparent in areas that seek to automate machine learning algorithms and website analytics. Applying these approaches to increasingly harder problems has also revealed new challenges and opened up many interesting research directions both in developing theory and in practical application.

Following on last year's NIPS workshop, "Bayesian Optimization & Decision Making", the goal of this workshop is to bring together researchers and practitioners from these diverse subject areas to facilitate cross-fertilization by discussing challenges, findings, and sharing data. This year we plan to focus on the intersection of "Theory and Practice". Specifically, we would like to carefully examine the types of problems where Bayesian optimization performs well and ask what theoretical guarantees can be made to explain this performance. …
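For orientation, here is a self-contained toy Bayesian optimization loop (our sketch, not any speaker's system): a one-dimensional GP surrogate with an RBF kernel, queried where expected improvement is largest.

```python
import numpy as np
from scipy.stats import norm

def gp_posterior(Xq, X, y, ls=0.3, noise=1e-6):
    """Zero-mean GP posterior mean/std at query points Xq (RBF kernel)."""
    k = lambda A, B: np.exp(-0.5 * (A[:, None] - B[None, :]) ** 2 / ls**2)
    L = np.linalg.cholesky(k(X, X) + noise * np.eye(len(X)))
    Kq = k(Xq, X)
    mu = Kq @ np.linalg.solve(L.T, np.linalg.solve(L, y))
    v = np.linalg.solve(L, Kq.T)
    std = np.sqrt(np.maximum(1.0 - np.sum(v**2, axis=0), 1e-12))
    return mu, std

def expected_improvement(mu, std, best):
    # improvement over the incumbent, minimization convention
    z = (best - mu) / std
    return (best - mu) * norm.cdf(z) + std * norm.pdf(z)

f = lambda x: np.sin(3 * x) + x**2 - 0.7 * x   # toy black-box objective
X = np.array([-0.9, 0.4]); y = f(X)            # two initial observations
grid = np.linspace(-1, 1, 200)
for _ in range(8):                             # fit surrogate, score, query
    mu, std = gp_posterior(grid, X, y)
    x_next = grid[np.argmax(expected_improvement(mu, std, y.min()))]
    X, y = np.append(X, x_next), np.append(y, f(x_next))
print(X[np.argmin(y)], y.min())                # best input found, best value
```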

Marko Grobelnik · Blaz Fortuna · Estevam Hruschka · Michael J Witbrock

[ Harvey's Emerald Bay 2 ]

Text understanding is an old, yet unsolved, AI problem consisting of a number of nontrivial steps. The critical step in solving the problem is knowledge acquisition from text, i.e., the transition from non-formalized text into a formalized, actionable language (one capable of supporting reasoning). Other steps in the text understanding pipeline include linguistic processing, reasoning, text generation, search, question answering, etc., which are more or less solved to a degree that allows composition of a text understanding service. On the other hand, we know that knowledge acquisition, as the key bottleneck, can be done by humans, while automating the process is still out of reach in its full breadth.

After past failed attempts (due to a lack of theoretical and technological prerequisites), interest in text understanding and knowledge acquisition from text has grown in recent years. A number of AI research groups are dealing with its various aspects in the areas of computational linguistics, machine learning, probabilistic & logical reasoning, and the semantic web. The commonality among the newer approaches is the use of machine learning to deal with representational change. To list some of the groups working in the area:

• Carnegie Mellon …

Byron Boots · Daniel Hsu · Borja Balle

[ Harrah's Tahoe B ]

Many problems in machine learning involve collecting high-dimensional multivariate observations or sequences of observations, and then fitting a compact model which explains these observations. Recently, linear algebra techniques have given a fundamentally different perspective on how to fit and perform inference in these models. Exploiting the underlying spectral properties of the model parameters has led to fast, provably consistent methods for parameter learning that stand in contrast to previous approaches, such as Expectation Maximization, which suffer from bad local optima and slow convergence.

In the past several years, these Spectral Learning algorithms have become increasingly popular. They have been applied to learn the structure and parameters of many models including predictive state representations, finite state transducers, hidden Markov models, latent trees, latent junction trees, probabilistic context free grammars, and mixture/admixture models. Spectral learning algorithms have also been applied to a wide range of application domains including system identification, video modeling, speech modeling, robotics, and natural language processing.

The focus of this workshop will be on spectral learning algorithms, broadly construed as any method that fits a model by way of a spectral decomposition of moments of (features of) observations. We would like the workshop to be as inclusive as possible …
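To fix ideas, the canonical spectral recipe for HMMs (Hsu, Kakade, and Zhang, 2009; notation ours) builds observable operators from low-order moments: the unigram vector P1, the bigram matrix P21, trigram slices P3x1, and U, the matrix of top left singular vectors of P21:

```latex
\hat b_1 = U^{\top}\hat P_1, \qquad
\hat b_\infty = \big(\hat P_{2,1}^{\top} U\big)^{+}\hat P_1, \qquad
\hat B_x = \big(U^{\top}\hat P_{3,x,1}\big)\big(U^{\top}\hat P_{2,1}\big)^{+},
```

after which joint probabilities of observation sequences are recovered as products of these operators:

```latex
\widehat{\Pr}(x_1,\dots,x_t)
  \;=\; \hat b_\infty^{\top}\,\hat B_{x_t}\cdots\hat B_{x_1}\,\hat b_1 .
```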

Peter Grünwald · Wouter M Koolen · Sasha Rakhlin · Nati Srebro · Alekh Agarwal · Karthik Sridharan · Tim van Erven · Sebastien Bubeck

[ Harrah's Tahoe A ]

Most existing theory in both online and statistical learning is centered around a worst-case analysis. For instance, in online learning data are assumed to be generated by an adversary and the goal is to minimize regret. In statistical learning the majority of theoretical results consider risk bounds for the worst-case i.i.d. data-generating distribution. In both cases the worst-case convergence rates (for regret/n and risk) for 0/1-type and absolute loss functions are O(1/sqrt{n}). Yet in practice simple heuristics like Follow-the-Leader (FTL) often empirically exhibit faster rates.

It has long been known that under Vovk's (1990) mixability condition on the loss function, faster rates are possible. Even without mixability or the closely related exp-concavity (Cesa-Bianchi and Lugosi 2006), in the statistical setting there exist conditions on the distribution under which faster learning rates can be obtained; the main example being Tsybakov's (2004) margin condition, which was recently shown to be intimately connected to mixability (Van Erven et al., 2012).

In practice, even if the loss is not mixable and no distributional assumptions apply, the data are nevertheless often easy enough to allow accelerated learning. Initial promising steps in this direction have been made recently, including parameterless algorithms that combine worst-case …
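For reference, Vovk's mixability condition mentioned above can be stated as follows (standard form; notation ours): a loss ℓ is η-mixable if, for every distribution π over predictions, there is a single prediction a with

```latex
\ell(a, y) \;\le\; -\tfrac{1}{\eta}\,
  \ln \mathbb{E}_{A \sim \pi}\!\left[ e^{-\eta\,\ell(A, y)} \right]
  \quad \text{for all outcomes } y.
```

For an η-mixable loss, the aggregating algorithm over N experts guarantees constant regret at most (ln N)/η, in contrast to the O(1/sqrt{n}) worst-case rates above.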

Martin Jaggi · Zaid Harchaoui · Federico Pierucci

[ Harvey's Emerald Bay 6 ]

Greedy algorithms and projection-free first-order optimization algorithms are at the core of many state-of-the-art sparse methods in machine learning, signal processing, harmonic analysis, statistics and other areas that seem unrelated at first sight and pursue different goals. Examples include matching pursuit, boosting, and greedy methods for submodular optimization, with applications ranging from large-scale structured prediction to recommender systems. In the field of optimization, the recent renewed interest in Frank-Wolfe/conditional gradient algorithms opens up an interesting perspective towards a unified understanding of these methods, with great potential to translate the rich existing knowledge about the respective greedy methods between the different fields.
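As a minimal illustration of the projection-free template (our generic sketch, following the classical conditional-gradient scheme): Frank-Wolfe over the l1 ball, where the linear minimization oracle returns a single signed coordinate vertex, so each iteration adds at most one new atom and the iterates stay sparse.

```python
import numpy as np

def frank_wolfe_l1(grad_f, x0, tau, steps=200):
    """Frank-Wolfe / conditional gradient on {x : ||x||_1 <= tau}.
    The linear minimization oracle over the l1 ball is a single vertex:
    -tau * sign(g_i) * e_i for the coordinate i with the largest |g_i|."""
    x = x0.copy()
    for t in range(steps):
        g = grad_f(x)
        i = np.argmax(np.abs(g))
        s = np.zeros_like(x)
        s[i] = -tau * np.sign(g[i])
        gamma = 2.0 / (t + 2.0)       # classical step-size schedule
        x = (1.0 - gamma) * x + gamma * s
    return x

# toy sparse regression: min ||A x - b||^2 over the l1 ball
rng = np.random.default_rng(0)
A, b = rng.standard_normal((40, 100)), rng.standard_normal(40)
grad = lambda x: 2.0 * A.T @ (A @ x - b)
x = frank_wolfe_l1(grad, np.zeros(100), tau=5.0)
print(np.count_nonzero(x), np.linalg.norm(A @ x - b))
```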

The goal of this workshop is to take a step towards building a modern and consistent perspective on these related algorithms. The workshop will gather renowned experts working on those algorithms in machine learning, optimization, signal processing, statistics and harmonic analysis, in order to engender a fruitful exchange of ideas and discussions and to push further the boundaries of scalable and efficient optimization for learning problems.

David Mimno · Amr Ahmed · Jordan Boyd-Graber · Ankur Moitra · Hanna Wallach · Alexander Smola · David Blei · Anima Anandkumar

[ Harvey's Emerald Bay B ]

Since the most recent NIPS topic model workshop in 2010, interest in statistical topic modeling has continued to grow in a wide range of research areas, from theoretical computer science to English literature. The goal of this workshop, which marks the 10th anniversary of the original LDA NIPS paper, is to bring together researchers from the NIPS community and beyond to share results, ideas, and perspectives.

We will organize the workshop around the following three themes:

Computation: The computationally intensive process of training topic models has been a useful testbed for novel inference methods in machine learning, such as stochastic variational inference and spectral inference. Theoretical computer scientists have used LDA as a test case to begin to establish provable bounds in unsupervised machine learning. This workshop will provide a forum for researchers developing new inference methods and theoretical analyses to present work in progress, as well as for practitioners to learn about state-of-the-art research in efficient and provable computing.

Applications: Topic models are now commonly used in a broad array of applications to solve real-world problems, from questions in digital humanities and computational social science to e-commerce and government science policy. This workshop will share new …

Hervé GLOTIN · Yann LeCun · Thierry Artières · Stephane Mallat · Ofer Tchernichovski · Xanadu Halkias

[ Harrah's Tahoe C ]

Bioacoustic data science aims at modeling animal sounds for neuroethology and biodiversity assessment. It has received increasing attention due to its diverse potential benefits, and is increasingly required by regulatory agencies for timely monitoring of environmental impacts from human activities. Given the complexity of the collected data along with the numerous species and environmental contexts, bioacoustics requires robust information processing.

The features and biological significance of animal sounds are constrained by the physics of sound production and propagation, and evolved through the processes of natural selection. This gives rise to new paradigms such as curriculum song learning, the predator-prey acoustic loop, etc. NIPS4B consolidates an innovative computational framework by focusing on the principles of information processing, where possible in an inherently hierarchical manner or with physiological parallels: deep belief networks (DBNs), sparse auto-encoders (SAEs), convolutional networks (ConvNets), scattering transforms, etc. It encourages interdisciplinary scientific exchanges and fosters collaborations, bringing together experts from machine learning and computational auditory scene analysis working on animal sound and communication systems.

One challenge concerns bird classification (on Kaggle): identifying 87 bird species of Provence (recordings by Biotope SA). To our knowledge it is the biggest bird song challenge to date, more complex than ICML4B (sabiod.org/ICML4B2013proceedings.pdf). A second challenge concerns …

Jonathan Huang · Sumit Basu · Kalyan Veeramachaneni

[ Harrah's Tahoe D ]

Given the incredible technological leaps that have changed so many aspects of our lives in the last hundred years, it’s surprising that our approach to education today is much the same as it was a century ago. While successful educational technologies have been developed and deployed in some areas, we have yet to see a widespread disruption in teaching methods at the primary, secondary, or post-secondary levels. However, as more and more people gain access to broadband internet, and new technology-based learning opportunities are introduced, we may be witnessing the beginnings of a revolution in educational methods. With college tuitions rising, school funding dropping, test scores falling, and a steadily increasing world population desiring high-quality education at low cost, the impact of educational technology seems more important than ever.

With these technology-based learning opportunities, the rate at which educational data is being collected has also exploded in recent years, as an increasing number of students have turned to online resources, both at traditional universities and at massive open online courses (MOOCs), for formal or informal learning. This change raises exciting challenges and possibilities, particularly for the machine learning and data science communities.

These trends and changes are the inspiration …

Urun Dogan · Marius Kloft · Tatiana Tommasi · Francesco Orabona · Massimiliano Pontil · Sinno Jialin Pan · Shai Ben-David · Arthur Gretton · Fei Sha · Marco Signoretto · Rajhans Samdani · Yun-Qian Miao · Mohammad Gheshlaghi azar · Ruth Urner · Christoph Lampert · Jonathan How

[ Harrah's Fallen+Marla ]

The main objective of the workshop is to document and discuss the recent rise of new research questions on the general problem of learning across domains and tasks. This includes the main topics of transfer [1,2,3] and multi-task learning [4], together with several related variants such as domain adaptation [5,6] and dataset bias [7].

In recent years there has been a surge of activity in these areas, much of it driven by practical applications such as object categorization. Different solutions have been studied for the considered topics, mostly separately and without a joint theoretical framework. On the other hand, most of the existing theoretical formulations model regimes that are rarely used in practice (e.g., adaptive methods that store all the source samples).

The workshop will focus on closing this gap by providing an opportunity for theoreticians and practitioners to get together in one place, to share and debate current theories and empirical results. The goal is to promote a fruitful exchange of ideas and methods between the different communities, leading to a global advancement of the field.

Transfer Learning - Transfer Learning (TL) refers to the problem of retaining and applying the knowledge available for one or more source …

Jenna Wiens · Finale P Doshi-Velez · Can Ye · Madalina Fiterau · Shipeng Yu · Le Lu · Balaji R Krishnapuram

[ Harvey's Tallac ]

Advances in medical information technology have resulted in enormous warehouses of data that are at once overwhelming and sparse. A single patient visit may result in tens to thousands of measurements and structured information, including clinical factors, diagnostic imaging, lab tests, genomic and proteomic tests. Hospitals may see thousands of patients each year. However, each patient may have relatively few visits to any particular medical provider. The resulting data are a heterogeneous amalgam of patient demographics, vital signs, diagnoses, records of treatment and medication receipt and annotations made by nurses or doctors, each with its own idiosyncrasies.


The objective of this workshop is to discuss how advanced machine learning techniques can derive clinical and scientific impact from these messy, incomplete, and partial data. We will bring together machine learning researchers and experts in medical informatics who are involved in the development of algorithms or intelligent systems designed to improve quality of healthcare. Relevant areas include health monitoring systems, clinical data labeling and clustering, clinical outcome prediction, efficient and scalable processing of medical records, feature selection or dimensionality reduction in clinical data, tools for personalized medicine, and time-series analysis with medical applications.

Antti Honkela · Cheng Soon Ong

[ Harvey's Emerald Bay 5 ]

Machine learning open source software (MLOSS) is one of the cornerstones of open science and reproducible research. Along with open access and open data, it enables free reuse and extension of current developments in machine learning. The mloss.org site exists to support a community creating a comprehensive open source machine learning environment, mainly by promoting new software implementations. This workshop aims to enhance the environment by fostering collaboration with the goal of creating tools that work with one another. Far from requiring integration into a single package, we believe that this kind of interoperability can also be achieved in a collaborative manner, which is especially suited to open source software development practices.

The workshop is aimed at all machine learning researchers who wish to have their algorithms and implementations included as part of the greater open source machine learning environment. Continuing the tradition of well-received workshops on MLOSS at NIPS 2006, NIPS 2008 and ICML 2010, we will have a workshop that is a mix of invited speakers, contributed talks and demos, as well as a discussion session. For 2013, we focus on workflows and pipelines. Many algorithms and tools have reached a level of maturity which allows …

Thomas Gaertner · Roman Garnett · Andrea Passerini

[ Harvey's Emerald Bay 1 ]

In many real-world applications, machine learning algorithms are employed as a tool in a “constructive process”. These processes are similar to the general knowledge-discovery process but have a more specific goal: the construction of one-or-more domain elements with particular properties. The most common use of machine learning algorithms in this context is to predict the properties of candidate domain elements.

In this workshop we want to bring together domain experts employing machine learning tools in constructive processes and machine learners investigating novel approaches or theories concerning constructive processes as a whole. The machine learning approaches concerned are typically interactive (e.g., online- or active-learning algorithms) and have to deal with huge, relational input and/or output spaces.

Interesting applications include but are not limited to: de novo drug design, generation of art (e.g., music composition), construction of game levels, generation of novel food recipes, proposal of travel itineraries, etc. Interesting approaches include but are not limited to: active approaches to structured output learning, transfer or multi-task learning of generative models, active search or online optimisation over relational domains, and learning with constraints.

Many of the applications of constructive machine learning, including the ones mentioned above, are primarily considered in their respective application …

Jean-Philippe Vert · Anna Goldenberg · Sara Mostafavi · Oliver Stegle

[ Harvey's Zephyr ]

The field of computational biology has seen dramatic growth over the past few years, in terms of newly available data, new scientific questions, and new challenges for learning and inference. In particular, biological data are often relationally structured and highly diverse, and thus well-suited to approaches that combine weak evidence from multiple heterogeneous sources. These data may include sequenced genomes of a variety of organisms, gene expression data from multiple technologies, protein expression data, protein sequence and 3D structural data, protein interactions, gene ontology and pathway databases, genetic variation data (such as SNPs), and an enormous amount of textual data in the biological and medical literature. Furthermore, next generation sequencing technologies and high-throughput imaging techniques are yielding terabyte scale data sets that require novel algorithmic solutions. New types of scientific and clinical problems require the development of novel supervised and unsupervised learning methods that can use these growing resources.

The goal of this workshop is to present emerging problems and innovative machine learning techniques in computational biology. We will invite several speakers from the biology/bioinformatics community who will present current research problems in computational biology, and we will invite contributed talks on novel learning approaches in computational biology. We encourage contributions …

Srinivas C Turaga · Lars Buesing · Maneesh Sahani · Jakob H Macke

[ Harvey's Emerald Bay 4 ]

For many years, measurements of neural activity have been restricted to recordings from single neurons or very small numbers of neurons, and anatomical reconstructions to very sparse and incomplete neural circuits. Major advances in optical imaging (e.g. 2-photon and light-sheet microscopic imaging of calcium signals) and new electrode array technologies are now beginning to provide measurements of neural activity at an unprecedented scale. High-profile initiatives such as BRAIN (Brain Research through Advancing Innovative Neurotechnologies) will fuel the development of ever more powerful techniques for mapping the structure and activity of neural circuits.

Computational tools will be important both in the high-throughput acquisition of these large-scale datasets and in their analysis. Acquiring, analyzing and integrating these sources of data raises major challenges and opportunities for computational neuroscience and machine learning:

i) What kind of data will be generated by large-scale functional measurements in the next decade? How will it be quantitatively or qualitatively different to the kind of data we have had previously?

ii) Algorithmic methods have played an important role in data acquisition, e.g. spike-sorting algorithms or spike-inference algorithms from calcium traces. In the future, what role will computational tools play in the process of high-throughput data acquisition? …

Edwin Bonilla · Thomas Dietterich · Theodoros Damoulas · Andreas Krause · Daniel Sheldon · Iadine Chades · J. Zico Kolter · Bistra Dilkina · Carla Gomes · Hugo P Simao

[ Harrah's Glenbrook+Emerald ]

Sustainability encompasses the balance of environmental, economic and societal demands. There is strong evidence suggesting that more actions need to be taken in order to achieve this balance. For example, Edward O. Wilson said in his 2002 Book The Future of Life that "at the current rates of human destruction of natural ecosystems, 50% of all species of life on earth will be extinct in 100 years". More recently, a 2012 review in Nature has stated that, similarly to localized ecological systems, "the global ecosystem as a whole can react in the same way and is approaching a planetary-scale critical transition as a result of human influence".

While the significance of the problem is apparent, more involvement from the machine learning community in sustainability problems is required. Not surprisingly, sustainability problems bring along interesting challenges and opportunities for machine learning in terms of complexity, scalability and impact in areas such as prediction, modeling and control. This workshop aims at bringing together scientists in machine learning, operations research, applied mathematics and statistics with a strong interest in sustainability to discuss how to use existing techniques and how to develop novel methods in order to address such challenges.

There are many application …