


Workshops
Jia Deng · Samy Bengio · Yuanqing Lin · Li Fei-Fei

[ Sand Harbor 1, Harrah’s Special Events Center 2nd Floor ]

The emergence of “big data” has brought about a paradigm shift throughout computer science. Computer vision is no exception. The explosion of images and videos on the Internet and the availability of large amounts of annotated data have created unprecedented opportunities and fundamental challenges on scaling up computer vision.

Over the past few years, machine learning on big data has become a thriving field with a plethora of theories and tools developed. Meanwhile, large scale vision has also attracted increasing attention in the computer vision community. This workshop aims to bring researchers in large scale machine learning and large scale vision closer together to foster cross-talk between the two fields. The goal is to encourage machine learning researchers to work on large scale vision problems, to inform computer vision researchers about new developments in large scale learning, and to identify unique challenges and opportunities.

This workshop will focus on two distinct yet closely related vision problems: recognition and retrieval. Both are inherently large scale. In particular, both must handle high dimensional features (hundreds of thousands to millions), a large variety of visual classes (tens of thousands to millions), and a large number of examples (millions to billions).

This workshop will consist …

Vikash Mansinghka · Daniel Roy · Noah Goodman

[ Tahoe A, Harrah’s Special Events Center 2nd Floor ]

An intensive, two-day workshop on PROBABILISTIC PROGRAMMING, with contributed and invited talks, poster sessions, demos, and discussions.

Probabilistic models and inference algorithms have become standard tools for interpreting ambiguous, noisy data and building systems that learn from their experience. However, even simple probabilistic models can require significant effort and specialized expertise to develop and use, frequently involving custom mathematics, algorithm design and software development. State-of-the-art models from Bayesian statistics, artificial intelligence and cognitive science --- especially those involving distributions over infinite data structures, relational structures, worlds with unknown numbers of objects, rich causal simulations of physics and psychology, and the reasoning processes of other agents --- can be difficult to even specify formally, let alone in a machine-executable fashion.

PROBABILISTIC PROGRAMMING aims to close this gap, making variations on commonly-used probabilistic models far easier to develop and use, and pointing the way towards entirely new types of models and inference. The central idea is to represent probabilistic models using ideas from programming, including functional, imperative, and logic-based languages. Most probabilistic programming systems represent distributions algorithmically, in terms of a programming language plus primitives for stochastic choice; some even support inference over Turing-universal languages. Compared with representations of models in terms …

Edo M Airoldi · David S Choi · Khalid El-Arini · Jure Leskovec

[ Glenbrook + Emerald Bay, Harrah’s Special Events Center 2nd Floor ]

Modern technology, including the World Wide Web, sensor networks, and high-throughput genetic sequencing, has completely transformed the scale and concept of data in the sciences. Data collections for a number of systems of interest have grown large and heterogeneous, and a crucial subset of the data is often represented as a collection of graphs together with node and edge attributes. Thus, the analysis and modeling of large, complex, real-world networks has become necessary in the study of phenomena across the diverse set of social, technological, and natural worlds. The aim of this workshop is to bring together researchers with these diverse sets of backgrounds and applications, as the next wave of core methodology in statistics and machine learning will have to provide theoretical and computational tools to analyze graphs in order to support scientific progress in applied domains such as social sciences, biology, medicine, neuroscience, physics, finance, and economics.

While the field remains extremely heterogeneous and diverse, there are emerging signs of convergence, maturation, and increased awareness between the disparate disciplines. One noteworthy example, arising in studies on the spread of information, is that social media researchers are beginning to use problem-specific structure to distinguish between social influence, homophily, and …

Georg Langs · Irina Rish · Guillermo Cecchi · Brian Murphy · Bjoern Menze · Kai-min K Chang · Moritz Grosse-Wentrup

[ Emerald Bay 5, Harveys Convention Center Floor (CC) ]

A workshop on the topic of machine learning approaches in neuroscience and neuroimaging. We believe that machine learning and neuroimaging can learn from each other as the two communities overlap and enter an intense exchange of ideas and research questions. Methodological developments in machine learning spur novel paradigms in neuroimaging, while neuroscience motivates methodological advances in computational analysis. In this context many controversies and open questions exist. The goal of the workshop is to pinpoint these issues, sketch future directions, and tackle open questions in the light of novel methodology.

The first workshop of this series at NIPS 2011 built upon earlier events in 2006 and 2008. Last year's workshop included many invited speakers and was centered around two panel discussions addressing two questions: the interpretability of machine learning findings, and the shift of paradigms in the neuroscience community. The discussion was inspiring and made clear that there is a tremendous amount the two communities can learn from each other through communication across the disciplines.

The aim of the workshop is to offer a forum for the overlap of these communities. Besides interpretation, and the shift of paradigms, many open questions remain. Among them:


- How …

Jean-Philippe Vert · Anna Goldenberg · Christina Leslie

[ Sand Harbor 2, Harrah’s Special Events Center 2nd Floor ]

The field of computational biology has seen dramatic growth over the past few years, in terms of newly available data, new scientific questions, and new challenges for learning and inference. In particular, biological data are often relationally structured and highly diverse, well-suited to approaches that combine weak evidence from multiple heterogeneous sources. These data may include sequenced genomes of a variety of organisms, gene expression data from multiple technologies, protein expression data, protein sequence and 3D structural data, protein interactions, gene ontology and pathway databases, genetic variation data (such as SNPs), and an enormous amount of textual data in the biological and medical literature. New types of scientific and clinical problems require the development of novel supervised and unsupervised learning methods that can use these growing resources. Furthermore, next generation sequencing technologies are yielding terabyte scale data sets that require novel algorithmic solutions.

The goal of this workshop is to present emerging problems and machine learning techniques in computational biology. We will invite several speakers from the biology/bioinformatics community who will present current research problems in bioinformatics, and we will invite contributed talks on novel learning approaches in computational biology. We encourage contributions describing either progress on new bioinformatics …

Yevgeny Seldin · Guy Lever · John Shawe-Taylor · Nicolò Cesa-Bianchi · Yacov Crammer · Francois Laviolette · Gabor Lugosi · Peter Bartlett

[ Tahoe C, Harrah’s Special Events Center 2nd Floor ]

One of the main practical goals of machine learning is to identify the relevant trade-offs in a problem, formalize them, and solve them. Fairly good progress has already been made on individual trade-offs, such as model order selection or exploration-exploitation. In this workshop we would like to focus on problems that involve more than one trade-off simultaneously. We are interested both in practical problems where such "multi-trade-offs" arise and in theoretical approaches to their solution. Many problems cannot be reduced to a single trade-off, and it is important to improve our ability to address multiple trade-offs simultaneously. Below we provide several examples of situations where multiple trade-offs arise. These examples are meant as starting points for discussion; they do not limit the scope, and any other multi-trade-off problem is welcome at the workshop.

Multi-trade-offs arise naturally in interaction between multiple learning systems or when a learning system faces multiple tasks simultaneously; especially when the systems or tasks share common resources, such as CPU time, memory, sensors, robot body, and so on. For a concrete example, imagine a robot riding a bicycle and balancing a pole. Each task individually …

Javad Azimi · Roman Garnett · Frank R Hutter · Shakir Mohamed

[ Emerald Bay 1 +2, Harveys Convention Center Floor (CC) ]

Recent years have brought substantial advances in sequential decision making under uncertainty. These advances have occurred in many different communities, including several subfields of computer science, statistics, and electrical/mechanical/chemical engineering. While these communities are essentially trying to solve the same problem, they have developed largely independently, using different terminology: Bayesian optimization, experimental design, bandits, active sensing, personalized recommender systems, automatic algorithm configuration, reinforcement learning, and so on. Some communities focus more on theoretical aspects, while others focus on real-world applications. This workshop aims to bring researchers from these communities together to facilitate cross-fertilization by discussing challenges, sharing findings, and sharing data. This workshop follows last year's NIPS workshop "Bayesian optimization, experimental design and bandits: Theory and applications", one of the most-attended workshops of 2011. This year we plan to focus somewhat more on real-world applications, to bridge the gap between theory and practice. Specifically, we plan to have a panel discussion on real-world and industrial applications of Bayesian optimization and an increased focus on real-world applications in the invited talks (covering hyperparameter tuning, configuration of algorithms for solving hard combinatorial problems, energy optimization, and optimization of MCMC). Similar to last year, we expect to highlight the most beneficial research directions and …

Naftali Tishby · Daniel Polani · Tobias Jung

[ Emerald Bay 3, Harveys Convention Center Floor (CC) ]

Since its inception for describing the laws of communication in the 1940's, information theory has been considered in fields beyond its original application area and, in particular, it was long attempted to utilize it for the description of intelligent agents. Already Attneave (1954) and Barlow (1961) suspected that neural information processing might follow principles of information theory and Laughlin (1998) demonstrated that information processing comes at a high metabolic cost; this implies that there would be evolutionary pressure pushing organismic information processing towards the optimal levels of data throughput predicted by information theory. This becomes particularly interesting when one considers the whole perception-action cycle, including feedback. In the last decade, significant progress has been made in this direction, linking information theory and control. The ensuing insights allow to address a large range of fundamental questions pertaining not only to the perception-action cycle, but to general issues of intelligence, and allow to solve classical problems of AI and machine learning in a novel way.

The workshop will present recent work on progress in AI, machine learning, control, as well as biologically plausible cognitive modeling, that is based on information theory.

Stefanie Jegelka · Andreas Krause · Jeffrey A Bilmes · Pradeep Ravikumar

[ Emerald Bay B, Harveys Convention Center Floor (CC) ]

Optimization problems with discrete solutions (e.g., combinatorial optimization) are becoming increasingly important in machine learning. The core of statistical machine learning is to infer conclusions from data, and when the variables underlying the data are discrete, both the tasks of inferring the model from data, as well as performing predictions using the estimated model are discrete optimization problems. Two factors complicate matters: first, many discrete problems are in general computationally hard, and second, machine learning applications often demand solving such problems at very large scales.

The focus of this year's workshop lies on structures that enable scalability. Examples of important structures include sparse graphs, the marginal polytope, and submodularity. Which properties of a problem make it possible to efficiently obtain exact or decent approximate solutions? What challenges are posed by parallel and distributed processing? Which discrete problems in machine learning are in need of more scalable algorithms? How can we make discrete algorithms scalable while retaining solution quality? Some heuristics perform well but as yet lack a theoretical foundation; what explains this good behavior?
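As one concrete instance of such structure, monotone submodular maximization under a cardinality constraint admits a simple greedy algorithm with a (1 − 1/e) approximation guarantee (Nemhauser et al., 1978). The sketch below, with illustrative names not drawn from the workshop, applies it to a max-coverage objective:

```python
# Greedy maximization of a monotone submodular function, illustrated on
# max coverage: pick up to k sets maximizing the number of covered elements.
# All names here are illustrative, not from the workshop text.

def greedy_max_coverage(sets, k):
    """Greedily pick up to k sets by marginal coverage gain."""
    covered = set()
    chosen = []
    for _ in range(k):
        best_idx, best_gain = None, 0
        for i, s in enumerate(sets):
            if i in chosen:
                continue
            gain = len(s - covered)  # marginal gain of adding set i
            if gain > best_gain:
                best_idx, best_gain = i, gain
        if best_idx is None:  # no candidate adds anything new
            break
        chosen.append(best_idx)
        covered |= sets[best_idx]
    return chosen, covered

sets = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1, 6}]
chosen, covered = greedy_max_coverage(sets, k=2)
```

The greedy rule is a useful baseline precisely because submodularity (diminishing returns) is the structural property that makes its approximation guarantee hold.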

Theodoros Damoulas · Thomas Dietterich · Edith Law · Serge Belongie

[ Emerald Bay 4, Harveys Convention Center Floor (CC) ]

http://www.cs.cornell.edu/~damoulas/Site/HCSCS.html

Researchers in several scientific and sustainability fields have recently achieved exciting results by involving the general public in the acquisition of scientific data and the solution of challenging computational problems. One example is the eBird project (www.ebird.org) of the Cornell Lab of Ornithology, where field observations uploaded by bird enthusiasts are providing continent-scale data on bird distributions that support the development and testing of hypotheses about bird migration. Another example is the FoldIt project (www.fold.it), where volunteers interacting with the FoldIt software have been able to solve the 3D structures of several biologically important proteins.

Despite these early successes, the involvement of the general public in these efforts poses many challenges for machine learning. Human observers can vary hugely in their degree of expertise. They conduct observations when and where they see fit, rather than following carefully designed experimental protocols. Paid participants (e.g., from Amazon Mechanical Turk) may not follow the rules or may even deliberately mislead the investigators.

A related challenge is that problem instances presented to human participants can vary in difficulty. Some instances (e.g., of visual tasks) may be impossible for most people to solve. This leads to a bias toward easy instances, which can confuse …

Ankur P Parikh · Le Song · Eric Xing

[ Tahoe B, Harrah’s Special Events Center 2nd Floor ]

Website: http://www.cs.cmu.edu/~apparikh/nips2012spectral/main.html

Recently, linear algebra techniques have given a fundamentally different perspective on learning and inference in latent variable models. Exploiting the underlying spectral properties of the model parameters has led to fast, provably consistent methods for structure and parameter learning, in contrast to previous approaches, such as Expectation Maximization, which suffer from local optima and slow convergence. Furthermore, these techniques have given insight into the nature of latent variable models.
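As a hedged illustration of the idea (the notation here is ours, not the workshop's), for simple latent variable models such as mixtures with component means $\mu_k$ and weights $w_k$, low-order observable moments factor as:

```latex
% Observable moments of a simple latent variable model
% (e.g. a mixture with component means \mu_k and weights w_k):
M_2 = \sum_{k=1}^{K} w_k \, \mu_k \mu_k^\top, \qquad
M_3 = \sum_{k=1}^{K} w_k \, \mu_k \otimes \mu_k \otimes \mu_k .
```

A whitened eigendecomposition (or tensor decomposition) of these empirical moments recovers $\{w_k, \mu_k\}$ directly, which is why such methods avoid the local optima and slow convergence of EM.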

In this workshop, via a mix of invited talks, contributed posters, and discussion, we seek to explore the theoretical and applied aspects of spectral methods including the following major themes:

(1) How can spectral techniques help us develop fast, local-minima-free solutions to real-world problems involving latent variables in natural language processing, dynamical systems, computer vision, etc., where existing methods such as Expectation Maximization are unsatisfactory?

(2) How can these approaches lead to a deeper understanding and interpretation of the complexity of latent variable models?

Achim Rettinger · Marko Grobelnik · Blaz Fortuna · Xavier Carreras · Juanzi Li

[ Tahoe D, Harrah’s Special Events Center 2nd Floor ]

Workshop Motivation:
Automatic text understanding has been an unsolved research problem for many years. This partially results from the dynamic and diverging nature of human languages, which gives rise to many different varieties of natural language. These variations range from the individual level, through regional and social dialects, up to seemingly separate languages and language families.
In recent years, however, there have been considerable achievements in data-driven approaches to computational linguistics that exploit the redundancy in the encoded information and the structures used. These approaches are mostly language-independent or can even exploit redundancies across languages.
This progress in cross-lingual technologies is largely due to the increased availability of multilingual data in the form of static repositories or streams of documents. In addition, parallel and comparable corpora like Wikipedia are easily available and constantly updated. Finally, cross-lingual knowledge bases such as DBpedia can be used as an interlingua to connect structured information across languages. This helps scale traditionally monolingual tasks, such as information retrieval and intelligent information access, to multilingual and cross-lingual applications.

From the application side, there is a clear need for such cross-lingual technology and services, as a) there is a huge disparity between the …

Martin Kleinsteuber · Francis Bach · Remi Gribonval · John Wright · Simon Hawe

[ Emerald Bay A, Harveys Convention Center Floor (CC) ]

Exploiting structure in data is crucial for the success of many techniques in neuroscience, machine learning, signal processing, and statistics. In this context, the fact that data of interest can be modeled via sparsity has proven extremely valuable. As a consequence, numerous algorithms, either aiming at learning sparse representations of data or exploiting sparse representations in applications, have been proposed within the machine learning and signal processing communities over the last few years.
The most common way to model sparsity in data is the so-called synthesis model, also known as sparse coding. Here, the underlying assumption is that the data can be decomposed into a linear combination of very few atoms of some dictionary. Various previous workshops and special sessions at machine learning conferences have focused on this model and its applications, as well as on algorithms for learning suitable dictionaries.

In contrast, considerably less attention has so far been paid to an interesting alternative, the so-called analysis model. Here, the data is mapped to a higher-dimensional space by an analysis operator, and the image of this mapping is assumed to be sparse. One of the most prominent examples of analysis sparsity …
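In symbols (the notation $D$, $\alpha$, $\Omega$ is illustrative, not taken from the workshop text), the two models read:

```latex
% Synthesis (sparse coding): x is a combination of few dictionary atoms
x \approx D\alpha, \qquad \|\alpha\|_0 \ll \dim(\alpha);
% Analysis: an operator \Omega maps x to a sparse (often higher-dimensional) image
\Omega x \ \text{is sparse}, \qquad \Omega \in \mathbb{R}^{m \times n},\ m \ge n.
```

In the synthesis model sparsity lives in the coefficient vector $\alpha$; in the analysis model it lives in the transformed signal $\Omega x$, which is what distinguishes the two learning problems.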

Michael Goodrich · Pavel N Krivitsky · David M Mount · Christopher DuBois · Padhraic Smyth

[ Fallen Leaf + Marla Bay, Harrah’s Special Events Center 2nd Floor ]

Statistical models for social networks struggle with the tension between scalability - the ability to effectively and efficiently model networks with large numbers of nodes - and fidelity to social processes. Recent developments within the field have sought to address these issues in various ways, including algorithmic innovations, use of scalable latent variable models, and clever use of covariate information. This workshop will provide both a forum for presenting new innovations in this area, and a venue for debating the tradeoffs involved in differing approaches to social network modeling. The workshop will consist of a combination of invited speakers and contributed talks and posters, with ample time allowed for open discussion. Participating invited speakers include Ulrik Brandes, Carter Butts, David Eppstein, Mark Handcock, David Hunter, and David Kempe.

This workshop will be of interest to researchers working on analysis of large social network data sets, with a focus on the development of both theoretical and computational aspects of new statistical and machine learning methods for such data. Case studies and applications involving large social network data sets are also of relevance, in particular, as they impact computational and statistical issues.

Examples of specific questions of interest include:
- What are …

Sivaraman Balakrishnan · Arthur Gretton · Mladen Kolar · John Lafferty · Han Liu · Tong Zhang

[ Sand Harbor 3, Harrah’s Special Events Center 2nd Floor ]

The objective of this workshop is to bring together practitioners and theoreticians who are interested in developing scalable and principled nonparametric learning algorithms for analyzing complex and large-scale datasets. The workshop will communicate the newest research results and attack several important bottlenecks of nonparametric learning by exploring (i) new models and methods that enable high-dimensional nonparametric learning, (ii) new computational techniques that enable scalable nonparametric learning in online and parallel fashion, and (iii) new statistical theory that characterizes the performance and information-theoretic limits of nonparametric learning algorithms. The goals of this workshop include (i) reporting the state of the art in modern nonparametrics, (ii) identifying major challenges and setting out the frontiers for nonparametric methods, and (iii) connecting disjoint communities in machine learning and statistics. The targeted application areas include genomics, cognitive neuroscience, climate science, astrophysics, and natural language processing.

Modern data acquisition routinely produces massive and complex datasets, including chip data from high throughput genomic experiments, image data from functional Magnetic Resonance Imaging (fMRI), proteomic data from tandem mass spectrometry analysis, and climate data from geographically distributed data centers. Existing high dimensional theories and learning algorithms rely heavily on parametric models, which assume the data come from an underlying distribution (e.g. …

Viren Jain · Moritz Helmstaedter

[ Emerald Bay 6, Harveys Convention Center Floor (CC) ]

The "wiring diagram" of essentially all nervous systems remains unknown due to the extreme difficulty of measuring detailed patterns of synaptic connectivity of entire neural circuits. At this point, the major bottleneck is the analysis of tera- or peta-voxel 3D electron microscopy image data, in which neuronal processes need to be traced and synapses localized in order for connectivity information to be inferred. This presents an opportunity for machine learning and machine perception to have a fundamental impact on advances in neurobiology. However, it also presents a major challenge, as existing machine learning methods fall short of solving the problem.
The goal of this workshop is to bring together researchers in machine learning and neuroscience to discuss progress and remaining challenges in this exciting and rapidly evolving field. We aim to attract machine learning and computer vision specialists interested in learning about a new problem, as well as computational neuroscientists at NIPS who may be interested in modeling connectivity data. We will discuss the release of public datasets and competitions that may facilitate further activity in this area. We expect the workshop to result in a significant increase in the scope of ideas and people engaged in this field. …

Katherine Ellis · Gert Lanckriet · Tommi Jaakkola · Lenny Grokop

[ Emerald Bay 3, Harveys Convention Center Floor (CC) ]

The ubiquity of mobile phones, packed with sensors such as accelerometers, gyroscopes, light and proximity sensors, Bluetooth and WiFi radios, GPS radios, and microphones, has brought increased attention to the field of mobile context awareness. This field examines problems relating to inferring some aspect of a user's behavior, such as their activity, mood, interruptibility, or situation, using mobile sensors. There is a wide range of applications for context-aware devices. In the healthcare industry, for example, such devices could provide support for cognitively impaired people, provide healthcare professionals with simple ways of monitoring patient activity levels during rehabilitation, and perform long-term health and fitness monitoring. In the transportation industry they could be used to predict and redirect traffic flow or to provide telematics for auto-insurers. Context awareness in smartphones can aid in automating functionality such as redirecting calls to voicemail when the user is uninterruptible, automatically updating status on social networks, etc., and can be used to provide personalized recommendations.

Existing work in mobile context-awareness has predominantly come from researchers in the human-computer interaction community. There the focus has been on building custom sensor/hardware solutions to perform social science experiments or solve application-specific problems. The goal of this workshop is to …

Vikash Mansinghka · Daniel Roy · Noah Goodman

[ Tahoe A, Harrah’s Special Events Center 2nd Floor ]

Probabilistic models and algorithmic techniques for inference have become standard tools for interpreting data and building systems that learn from their experience. Growing out of an extensive body of work in machine learning, statistics, robotics, vision, artificial intelligence, neuroscience and cognitive science, rich probabilistic models and inference techniques have more recently spread to other branches of science and engineering, from astrophysics to climate science to marketing to web site personalization. This explosion is largely due to the development of probabilistic graphical models, which provide a formal lingua franca for modeling, and a common target for efficient inference algorithms.

However, even simple probabilistic models can require significant effort and specialized expertise to develop and use, frequently involving custom mathematics, algorithm design and software development. More innovative and useful models far outstrip the representational capacity of graphical models and their associated inference techniques. They are communicated using a mix of natural language, pseudo code, and formulas, often eliding crucial aspects such as fine-grained independence, abstraction and recursion, and are fit to data via special purpose, one-off inference algorithms.

PROBABILISTIC PROGRAMMING LANGUAGES aim to close this gap, going beyond graphical models in representational capacity while providing automatic probabilistic inference. Rather than marry statistics …

Philipp Hennig · John P Cunningham · Michael A Osborne

[ Tahoe D, Harrah’s Special Events Center 2nd Floor ]

Traditionally, machine learning uses numerical algorithms as tools. However, many tasks in numerics can be viewed as learning problems. As examples:

* How can optimizers learn about the objective function, and how should they update their search direction?

* How should a quadrature method estimate an integral given observations of the integrand, and where should these methods put their evaluation nodes?

* Can approximate inference techniques be applied to numerical problems?

Many such issues can be seen as special cases of decision theory, active learning, or reinforcement learning.

We invite contributions of recent results in the development and analysis of numerical methods based on probability theory. This includes, but is not limited to, the areas of optimization, sampling, linear algebra, quadrature, and the solution of differential equations.
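As a minimal sketch of one such probabilistic numerical method, the snippet below implements Bayesian quadrature for a Gaussian-weighted integral, placing a GP prior on the integrand; the function names, RBF kernel choice, and bandwidth are illustrative assumptions, not taken from the call:

```python
import numpy as np

# Bayesian quadrature sketch: estimate  I = ∫ f(x) N(x; 0, 1) dx
# by placing a GP prior (RBF kernel) on f and integrating the posterior mean.

def bayes_quadrature(f, nodes, ell=1.0, jitter=1e-8):
    X = np.asarray(nodes, dtype=float)
    y = f(X)
    # RBF Gram matrix between evaluation nodes (jitter for conditioning)
    K = np.exp(-(X[:, None] - X[None, :]) ** 2 / (2 * ell**2))
    K += jitter * np.eye(len(X))
    # Kernel mean embedding of N(0,1): z_i = ∫ k(x, x_i) N(x; 0, 1) dx
    # (closed form for the RBF kernel against a Gaussian measure)
    z = np.sqrt(ell**2 / (ell**2 + 1)) * np.exp(-X**2 / (2 * (ell**2 + 1)))
    # Posterior mean of the integral: z^T K^{-1} y
    return z @ np.linalg.solve(K, y)

# For f(x) = x², the true Gaussian-weighted integral is E[x²] = 1.
est = bayes_quadrature(lambda x: x**2, np.linspace(-4, 4, 25))
```

Unlike a plain quadrature rule, the same GP posterior also yields a variance over the integral's value, quantifying the numerical error probabilistically.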

Submission instructions are available at http://www.probabilistic-numerics.org/Call.html.

Dimitri Kanevsky · Tony Jebara · Li Deng · Stephen Wright · Georg Heigold · Avishy Carmi

[ Tahoe C, Harrah’s Special Events Center 2nd Floor ]

Exponential functions are core mathematical constructs that are the key to many important applications, including speech recognition, pattern-search and logistic regression problems in statistics, machine translation, and natural language processing. Exponential functions are found in exponential families, log-linear models, conditional random fields (CRF), entropy functions, neural networks involving sigmoid and soft max functions, and Kalman filter or MMIE training of hidden Markov models. Many techniques have been developed in pattern recognition to construct formulations from exponential expressions and to optimize such functions, including growth transforms, EM, EBW, Rprop, bounds for log-linear models, large-margin formulations, and regularization. Optimization of log-linear models also provides important algorithmic tools for machine learning applications (including deep learning), leading to new research in such topics as stochastic gradient methods, sparse / regularized optimization methods, enhanced first-order methods, coordinate descent, and approximate second-order methods. Specific recent advances relevant to log-linear modeling include the following.

• Effective optimization approaches, including stochastic gradient and Hessian-free methods.
• Efficient algorithms for regularized optimization problems.
• Bounds for log-linear models and recent convergence results.
• Recognition of modeling equivalences across different areas, such as the equivalence between Gaussian and log-linear models/HMM and HCRF, and the equivalence between transfer entropy and Granger …
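To make the log-linear form concrete, here is a minimal sketch (illustrative, not any specific system from the list above) of a multiclass log-linear (softmax) model and its log-likelihood gradient, verified against a finite difference:

```python
import numpy as np

# Multiclass log-linear (softmax) model: p(y|x) ∝ exp(x · w_y).
# The gradient is the classic "observed minus expected features" form.

def log_lik(W, X, y):
    scores = X @ W                                   # (n, K) class scores
    scores -= scores.max(axis=1, keepdims=True)      # numerical stability
    log_Z = np.log(np.exp(scores).sum(axis=1))       # log-partition per example
    return (scores[np.arange(len(y)), y] - log_Z).sum()

def grad(W, X, y):
    scores = X @ W
    scores -= scores.max(axis=1, keepdims=True)
    P = np.exp(scores)
    P /= P.sum(axis=1, keepdims=True)                # model probabilities
    Y = np.zeros_like(P)
    Y[np.arange(len(y)), y] = 1.0                    # one-hot targets
    return X.T @ (Y - P)                             # observed − expected features

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.integers(0, 4, size=20)
W = rng.normal(size=(3, 4))

# finite-difference check of one gradient entry
eps = 1e-6
Wp = W.copy(); Wp[1, 2] += eps
fd = (log_lik(Wp, X, y) - log_lik(W, X, y)) / eps
```

This gradient is exactly what stochastic gradient, Hessian-free, and coordinate descent methods mentioned above consume when training log-linear models at scale.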

Georg Langs · Irina Rish · Guillermo Cecchi · Brian Murphy · Bjoern Menze · Kai-min K Chang · Moritz Grosse-Wentrup

[ Emerald Bay 5, Harveys Convention Center Floor (CC) ]

1.) Aim

We propose a two-day workshop on the topic of machine learning approaches in neuroscience and neuroimaging. We believe that machine learning and neuroimaging can learn from each other as the two communities overlap and enter an intense exchange of ideas and research questions. Methodological developments in machine learning spur novel paradigms in neuroimaging, while neuroscience motivates methodological advances in computational analysis. In this context many controversies and open questions exist. The goal of the workshop is to pinpoint these issues, sketch future directions, and tackle open questions in the light of novel methodology.

The first workshop of this series at NIPS 2011 built upon earlier events in 2006 and 2008. Last year's workshop included many invited speakers and was centered around two panel discussions addressing two questions: the interpretability of machine learning findings, and the shift of paradigms in the neuroscience community. The discussion was inspiring and made clear that there is a tremendous amount the two communities can learn from each other through communication across the disciplines.

The aim of the workshop is to offer a forum for the overlap of these communities. Besides interpretation, and the shift of paradigms, many open …

Tamir Hazan · George Papandreou · Danny Tarlow

[ Glenbrook + Emerald Bay, Harrah’s Special Events Center 2nd Floor ]

In nearly all machine learning tasks, we expect there to be randomness, or noise, in the data we observe and in the relationships encoded by the model. Usually, this noise is considered undesirable, and we would eliminate it if possible. However, there is an emerging body of work on perturbation methods, showing the benefits of explicitly adding noise into the modeling, learning, and inference pipelines. This workshop will bring together the growing community of researchers interested in different aspects of this area, and will broaden our understanding of why and how perturbation methods can be useful.

More generally, perturbation methods usually provide efficient and principled ways to reason about the neighborhood of possible outcomes when trying to make the best decision. For example, some might want to arrive at the best outcome that is robust to small changes in model parameters. Others might want to find the best choice while compensating for their lack of knowledge by averaging over the different outcomes. Recently, several works influenced by fields as diverse as statistics, optimization, machine learning, and theoretical computer science have used perturbation methods in similar ways. The goal of this workshop is to explore different techniques in perturbation methods …

Le Song · Arthur Gretton · Alexander Smola

[ Emerald Bay 2, Harveys Convention Center Floor (CC) ]

Website: https://sites.google.com/site/kernelgraphical/

Kernel methods and graphical models are two important families of techniques for machine learning. Our community has witnessed many major but separate advances in the theory and applications of both subfields. For kernel methods, the advances include kernels on structured data, Hilbert-space embeddings of distributions, and applications of kernel methods to multiple kernel learning, transfer learning, and multi-task learning. For graphical models, the advances include variational inference, nonparametric Bayes techniques, and applications of graphical models to topic modeling, computational biology and social network problems.
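One of the kernel advances mentioned above, the Hilbert-space embedding of distributions, can be illustrated by comparing two samples through the (biased) maximum mean discrepancy under a Gaussian RBF kernel. This is only a sketch; the function names, bandwidth, and toy data are our own choices:

```python
# Compare two empirical distributions via their kernel mean embeddings:
# MMD^2 = ||mean_embedding(P) - mean_embedding(Q)||^2 in the RKHS.
import numpy as np

def rbf(x, y, sigma=1.0):
    """Gaussian RBF kernel matrix between 1-d samples x and y."""
    return np.exp(-(x[:, None] - y[None, :]) ** 2 / (2 * sigma ** 2))

def mmd2(x, y, sigma=1.0):
    """Biased estimate of squared MMD between samples x and y."""
    return (rbf(x, x, sigma).mean()
            + rbf(y, y, sigma).mean()
            - 2 * rbf(x, y, sigma).mean())

rng = np.random.default_rng(0)
same = mmd2(rng.normal(size=500), rng.normal(size=500))
diff = mmd2(rng.normal(size=500), rng.normal(loc=2.0, size=500))
# samples from the same distribution give a much smaller discrepancy
```

Because the embedding characterizes the distribution (for characteristic kernels), such quantities can serve as drop-in nonparametric replacements for density-based comparisons in graphical-model inference.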

This workshop addresses two main research questions: first, how may kernel methods be used to address difficult learning problems for graphical models, such as inference for multi-modal continuous distributions on many variables, and dealing with non-conjugate priors? And second, how might kernel methods be advanced by bringing in concepts from graphical models, for instance by incorporating sophisticated conditional independence structures, latent variables, and prior information?


Kernel algorithms have traditionally had the advantage of being solved via convex optimization or eigenproblems, and having strong statistical guarantees on convergence. The graphical model literature has focused on modelling complex dependence structures in a flexible way, although approximations may be required to make inference tractable. Can we develop …

Michael Mozer · javier r movellan · Robert Lindsey · Jacob Whitehill

[ Emerald Bay 1, Harveys Convention Center Floor (CC) ]

The field of education has the potential to be transformed by the internet and intelligent computer systems. Evidence for the first stage of this transformation is abundant, from the Stanford online AI and Machine Learning courses to web sites such as Khan Academy that offer online lessons and drills. However, the delivery of instruction via web-connected devices is merely a precondition for what may become an even more fundamental transformation: the personalization of education.

In traditional classroom settings, teachers must divide their attention and time among many students and hence have limited ability to observe and customize instruction to individuals. Even in one-on-one tutoring sessions, teachers rely on intuition and experience to choose the material and style of instruction that they believe would provide the greatest benefit given the student's current state of understanding.


In order both to assist human teachers in traditional classroom environments and to improve automated tutoring systems to match the capabilities of expert human tutors, one would like to develop formal approaches that can:

* exploit subtle aspects of a student's behavior---such as facial expressions, fixation sequences, response latencies, and errors---to make explicit inferences about the student's latent state of knowledge and understanding;

* leverage …
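The first capability above, inferring a latent knowledge state from observed responses, can be sketched with a simple two-state Bayesian update in the spirit of knowledge tracing. All parameter values here are invented purely for illustration:

```python
# Toy knowledge-tracing-style update: maintain a posterior probability that
# the student "knows" a skill, given correct/incorrect responses.
# Slip, guess, and learning rates below are illustrative assumptions.
P_SLIP, P_GUESS, P_LEARN = 0.1, 0.2, 0.15

def update(p_know, correct):
    # Bayes step: condition the knowledge state on the observed response...
    like_know = (1 - P_SLIP) if correct else P_SLIP
    like_not = P_GUESS if correct else (1 - P_GUESS)
    posterior = (like_know * p_know
                 / (like_know * p_know + like_not * (1 - p_know)))
    # ...then allow learning to occur between practice opportunities.
    return posterior + (1 - posterior) * P_LEARN

p = 0.3  # prior belief that the student knows the skill
for response in [True, False, True, True]:
    p = update(p, response)
```

Richer models in this vein fold in the behavioral signals listed above (latencies, fixations, error types) as additional observations on the same latent state.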

Suvrit Sra · Alekh Agarwal

[ Fallen Leaf + Marla Bay, Harrah’s Special Events Center 2nd Floor ]

Optimization lies at the heart of ML algorithms. Sometimes, classical textbook algorithms suffice, but the majority of problems require tailored methods that are based on a deeper understanding of the ML requirements. ML applications and researchers are driving some of the most cutting-edge developments in optimization today. The intimate relation of optimization with ML is the key motivation for our workshop, which aims to foster discussion, discovery, and dissemination of the state-of-the-art in optimization as relevant to machine learning.

Much interest has focused recently on stochastic methods, which can be used in an online setting and in settings where data sets are extremely large and high accuracy is not required. Many aspects of stochastic gradient methods remain to be explored: different algorithmic variants, customization to the structure of the data set, convergence analysis, sampling techniques, software, choice of regularization and tradeoff parameters, and distributed and parallel computation. The need for an up-to-date analysis of algorithms for nonconvex problems remains an important practical issue, whose importance becomes even more pronounced as ML tackles more and more complex mathematical models.
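As a concrete reference point for the discussion of stochastic methods, here is a minimal sketch of plain stochastic gradient descent on a least-squares problem with a decaying step size; the data, constants, and schedule are our own arbitrary choices:

```python
# Plain SGD for least-squares regression y = w*x + b:
# sample one example per step, follow the negative gradient of 0.5*err^2.
import random

random.seed(0)
true_w, true_b = 3.0, -1.0
data = [(x, true_w * x + true_b) for x in [i / 50 - 1 for i in range(100)]]

w, b = 0.0, 0.0
for t in range(1, 20001):
    x, y = random.choice(data)      # one stochastic sample per step
    err = (w * x + b) - y           # residual on that example
    lr = 0.5 / (1 + 0.01 * t)       # decaying step size
    w -= lr * err * x               # d(0.5*err^2)/dw = err * x
    b -= lr * err                   # d(0.5*err^2)/db = err
```

Each of the open questions listed above (step-size schedules, sampling strategies, parallelization) corresponds to a design choice visible even in this ten-line loop.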

Finally, we do not wish to ignore the not particularly large scale setting, where one does have time to wield substantial computational resources. In this …

Yoshua Bengio · James Bergstra · Quoc V. Le

[ Emerald Bay B, Harveys Convention Center Floor (CC) ]

Machine learning algorithms are very sensitive to the representations chosen for the data, so it is desirable to develop learning algorithms that can discover good representations, good features, or good explanatory latent variables. Both supervised and unsupervised learning algorithms have been proposed for this purpose, and they can be combined in semi-supervised setups in order to take advantage of vast quantities of unlabeled data. Deep learning algorithms have multiple levels of representation, and the number of levels can be selected based on the available data. Great progress has been made in recent years in algorithms, their analysis, and their application both in academic benchmarks and large-scale industrial settings (such as machine vision/object recognition and NLP, including speech recognition). Many interesting open problems also remain, which should stimulate lively discussions among the participants.

Sivaraman Balakrishnan · Alessandro Rinaldo · Donald Sheehy · Aarti Singh · Larry Wasserman

[ Emerald Bay 6, Harveys Convention Center Floor (CC) ]

Topological methods and machine learning have long enjoyed fruitful interactions, as evidenced by popular algorithms like ISOMAP, LLE and Laplacian Eigenmaps, which were born out of studying point cloud data through the lens of topology/geometry. More recently, several researchers have been attempting to understand the algebraic topological properties of data. Algebraic topology is a branch of mathematics which uses tools from abstract algebra to study and classify topological spaces. The machine learning community thus far has focused almost exclusively on clustering as the main tool for unsupervised data analysis. Clustering, however, only scratches the surface, and algebraic topological methods aim at extracting much richer topological information from data.
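The contrast drawn above between clustering at one scale and richer topological summaries can be illustrated with the simplest case, zeroth-order (connected-component) persistence on a toy 1-d point cloud; the code and data are our own illustration:

```python
# Track how connected components of a 1-d point cloud merge as the
# connectivity scale eps grows (single-linkage / 0-dim persistence).
def components_at_scale(points, eps):
    """Number of connected components when points within eps are linked."""
    pts = sorted(points)
    comps = 1
    for a, b in zip(pts, pts[1:]):
        if b - a > eps:   # gap larger than eps starts a new component
            comps += 1
    return comps

cloud = [0.0, 0.1, 0.2, 5.0, 5.1, 9.0]
# A single clustering fixes one eps; persistence records the whole profile.
profile = [components_at_scale(cloud, eps) for eps in (0.05, 0.15, 1.0, 4.0)]
# → [6, 3, 3, 2]
```

Features that survive across a long range of scales (here, the three-component structure) are deemed topologically significant, while short-lived ones are treated as noise; higher-dimensional persistent homology extends the same idea to loops and voids.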

The goals of our workshop are:
1. To draw the attention of machine learning researchers to a rich and emerging source of interesting and challenging problems.
2. To identify problems of interest to both topologists and machine learning researchers and areas of potential collaboration.
3. To discuss practical methods for implementing topological data analysis methods.
4. To discuss applications of topological data analysis to scientific problems.

We will also target submissions in a variety of areas, at the intersection of algebraic topology and learning, that have witnessed recent activity. Areas of …

Jonathan How · Lawrence Carin · John Fisher III · Michael Jordan · Alborz Geramifard

[ Tahoe B, Harrah’s Special Events Center 2nd Floor ]

The ability to autonomously plan a course of action to reach a desired goal in the presence of uncertainty is critical for the success of autonomous robotic systems. Autonomous planning typically involves selecting actions that maximize the mission objectives given the available information, such as models of the agent dynamics, environment, available resources, and mission constraints. However, such models are typically only approximate, and can rapidly become obsolete, thereby degrading planner performance. Classical approaches to this problem typically assume that the environment has a certain structure that can be captured by a parametric model that can be updated online. However, finding the right parameterization a priori for complex and uncertain domains is challenging because substantial domain knowledge is required. An alternative approach is to let the data provide the insight on the parameterization. This approach leads to Bayesian Nonparametric models (BNPMs), which provide a powerful framework for reasoning about objects and relations in settings in which these objects and relations are not predefined. This feature is particularly attractive for missions, such as long-term persistent sensing, for which it is virtually impossible to specify the size of the model and the number of parameters a priori. In such scenarios, …
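The BNPM idea of letting the data determine the model size is often introduced through the Chinese restaurant process, in which the number of clusters is not fixed in advance. The following is a minimal sketch; the concentration parameter and seed are arbitrary choices of ours:

```python
# Chinese restaurant process: customer i joins an existing table with
# probability count/(i + alpha), or opens a new one with alpha/(i + alpha),
# so the number of tables (clusters) grows with the data as needed.
import random

def crp(n, alpha=1.0, seed=0):
    rng = random.Random(seed)
    tables = []  # number of customers seated at each table
    for i in range(n):
        r = rng.random() * (i + alpha)
        for t, count in enumerate(tables):
            if r < count:
                tables[t] += 1   # join an occupied table
                break
            r -= count
        else:
            tables.append(1)     # open a new table
    return tables

assignment = crp(100)
```

Used as a prior over partitions, this construction lets a persistent-sensing system add new object or environment classes as observations warrant, rather than committing to a model size up front.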

Sameer Singh · John Duchi · Yucheng Low · Joseph E Gonzalez

[ Emerald Bay A, Harveys Convention Center Floor (CC) ]

This workshop will address algorithms, systems, and real-world problem domains related to large-scale machine learning (“Big Learning”). With active research spanning machine learning, databases, parallel and distributed systems, parallel architectures, programming languages and abstractions, and even the sciences, Big Learning has attracted intense interest. This workshop will bring together experts across these diverse communities to discuss recent progress, share tools and software, identify pressing new challenges, and exchange new ideas. Topics of interest include (but are not limited to):

- Big Data: Methods for managing large, unstructured, and/or streaming data; cleaning, visualization, interactive platforms for data understanding and interpretation; sketching and summarization techniques; sources of large datasets.

- Models & Algorithms: Machine learning algorithms for parallel, distributed, GPGPU, or other novel architectures; theoretical analysis; distributed online algorithms; implementation and experimental evaluation; methods for distributed fault tolerance.

- Applications of Big Learning: Practical application studies and challenges of real-world system building; insights on end-users, common data characteristics (stream or batch); trade-offs between labeling strategies (e.g., curated or crowd-sourced).

- Tools, Software & Systems: Languages and libraries for large-scale parallel or distributed learning which leverage cloud computing, scalable storage (e.g. RDBMs, NoSQL, graph databases), and/or specialized hardware.

Behrouz Touri · Olgica Milenkovic · Faramarz Fekri

[ Emerald Bay 4, Harveys Convention Center Floor (CC) ]

The aim of the workshop is to bring together a diverse research community from social choice theory, the social sciences, and psychology on one side, and mathematics, statistics, decision theory, and information theory on the other. These fields have traditionally had little interaction, despite the fact that a number of relevant practical problems in the social sciences may be successfully addressed using techniques and tools developed in signal processing and information theory.

The organizers plan to create a forum for studying traditional and emerging experimental and computational problems in social choice theory through a multifaceted analytical lens.

The topics of interest include social decision making mechanisms, voting protocols and vote aggregation, voting and voter influence over social networks, and opinion dynamics modeling and monitoring. These topics lie at the heart of modern psychology, political sciences and sociology, and in addition, they have emerged in various incarnations in Internet applications, recommender systems and computer science in general.
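One of the vote-aggregation mechanisms mentioned above can be made concrete with the Borda count, in which each ballot ranks the candidates and a candidate receives one point for every candidate ranked below it. The ballots here are invented for illustration:

```python
# Borda count: aggregate ranked ballots into a single scored ordering.
from collections import Counter

def borda(ballots):
    """ballots: list of candidate rankings, best candidate first."""
    scores = Counter()
    for ranking in ballots:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            scores[candidate] += n - 1 - position  # points below this rank
    return scores.most_common()

ballots = [["a", "b", "c"], ["b", "a", "c"], ["b", "c", "a"]]
# → [('b', 5), ('a', 3), ('c', 1)]
```

Even this simple rule exhibits the phenomena the workshop targets, such as outcomes that differ from plurality voting on the same ballots and sensitivity to strategic misreporting.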

The targeted group of participants will be selected experts in the field of social choice theory, decision theory, cognitive neuroscience, machine learning, information theory and statistics.