Workshop
Machine Learning Systems
Aparna Lakshmiratan · Li Erran Li · Siddhartha Sen · Sarah Bird · Hussein Mehanna

Sat Dec 10th 08:00 AM -- 06:30 PM @ Room 116
Event URL: https://sites.google.com/site/mlsysnips2016/

A new area is emerging at the intersection of machine learning (ML) and systems design. Its emergence is driven by the explosive growth of diverse applications of ML in production, the continued growth in data volume, and the complexity of large-scale learning systems. Addressing the challenges in this intersection demands a combination of the right abstractions -- for algorithms, data structures, and interfaces -- as well as scalable systems capable of addressing real-world learning problems.

Designing systems for machine learning presents new challenges and opportunities over the design of traditional data processing systems. For example, what is the right abstraction for data consistency in the context of parallel, stochastic learning algorithms? What guarantees of fault tolerance are needed during distributed learning? The statistical nature of machine learning offers an opportunity for more efficient systems but requires revisiting many of the challenges addressed by the systems and database communities over the past few decades. Machine-learning-focused developments in distributed learning platforms, programming languages, data structures, general-purpose GPU programming, and a wide variety of other domains have had and will continue to have a large impact in both academia and industry.

As the relationship between the machine learning and systems communities has grown stronger, new research in using machine learning tools to solve classic systems challenges has also grown. Specifically, as we develop larger and more complex systems and networks for storing, analyzing, serving, and interacting with data, machine learning offers promise for modeling system dynamics, detecting issues, and making intelligent, data-driven decisions within our systems. Machine learning techniques have begun to play critical roles in scheduling, system tuning, and network analysis. Through working with systems and databases researchers to solve systems challenges, machine learning researchers can both improve their own learning systems as well as impact the systems community and infrastructure at large.

The goal of this workshop is to bring together experts working at the crossroads of ML, system design and software engineering to explore the challenges faced when building practical large-scale machine learning systems. In particular, we aim to elicit new connections among these diverse fields and to identify tools, best practices and design principles. The workshop will cover ML and AI platforms and algorithm toolkits (Caffe, Torch, TensorFlow, MXNet and parameter server, Theano, etc.), as well as dive into the reality of applying ML and AI in industry with challenges of data and organization scale (with invited speakers from companies like Google, Microsoft, Facebook, Amazon, Netflix, Uber and Twitter).

The workshop will have a mix of invited speakers and reviewed papers with talks, posters and panel discussions to facilitate the flow of new ideas as well as best practices which can benefit those looking to implement large ML systems in academia or industry.

Focal points for discussions and solicited submissions include but are not limited to:
- Systems for online and batch learning algorithms
- Systems for out-of-core machine learning
- Implementation studies of large-scale distributed learning algorithms --- challenges faced and lessons learned
- Database systems for Big Learning --- models and algorithms implemented, properties (fault tolerance, consistency, scalability, etc.), strengths and limitations
- Programming languages for machine learning
- Data-driven systems --- learning for job scheduling, configuration tuning, straggler mitigation, network configuration, and security
- Systems for interactive machine learning
- Systems for serving machine learning models at scale

08:45 AM Opening Remarks (Talk)
09:00 AM Invited Talk: You've been using asynchrony wrong your whole life! (Chris Ré, Stanford) (Invited Talk) Chris Ré
09:20 AM Contributed Talk: Hemingway: Modeling Distributed Optimization Algorithms (Contributed Talk)
09:40 AM Invited Talk: Paleo: A Performance Model for Deep Neural Networks (Ameet Talwalkar, UCLA) (Invited Talk) Ameet S Talwalkar
10:00 AM Poster Previews (Lightning Talks)
11:30 AM Invited Talk: Scaling Machine Learning Using TensorFlow (Jeff Dean, Google Brain) (Invited Talk) Jeff Dean
11:50 AM Contributed Talk: Demitasse: SPMD Programing Implementation of Deep Neural Network Library for Mobile Devices (Contributed Talk)
12:10 PM Lunch (Break)
01:30 PM ML System Updates from Caffe (Andrew Tulloch), Clipper (Daniel Crankshaw), Decision Service (Siddhartha Sen), MXNet (Tianqi Chen), Torch (Soumith Chintala), and VW (John Langford) (Invited Talks)
02:50 PM Invited Talk: Optimizing Large-Scale Machine Learning Pipelines with KeystoneML (Tomer Kaftan, UW) (Invited Talk) Tomer Kaftan
03:10 PM Invited Talk: Optimizing Machine Learning and Deep Learning (John Canny, UC Berkeley & Google Research) (Invited Talk) John Canny
03:30 PM Posters & Coffee (Poster Session)
04:30 PM Contributed Talk: Yggdrasil: An Optimized System for Training Deep Decision Trees at Scale (Contributed Talk)
04:50 PM Contributed Talk: TensorForest: Scalable Random Forests on TensorFlow (Contributed Talk)
05:10 PM Closing Remarks (Talk)

Author Information

Aparna Lakshmiratan (Facebook)

I am the PM lead for the AI Platform in Facebook AI (PyTorch 1.0, Data Tools and Developer Ecosystem). Before Facebook, I worked at Microsoft, building and shipping several products, including a new Click Prediction system for Bing Ads, several enhancements to the Speller and Query Alterations engine in Bing, and most recently an interactive machine learning platform for non-experts at Microsoft Research. I have a PhD in Computer Science from MIT.

Li Erran Li (Pony.ai)

Li Erran Li is the head of machine learning at Scale and an adjunct professor at Columbia University. Previously, he was chief scientist at Pony.ai. Before that, he was with the perception team at Uber ATG and machine learning platform team at Uber where he worked on deep learning for autonomous driving, led the machine learning platform team technically, and drove strategy for company-wide artificial intelligence initiatives. He started his career at Bell Labs. Li’s current research interests are machine learning, computer vision, learning-based robotics, and their application to autonomous driving. He has a PhD from the computer science department at Cornell University. He’s an ACM Fellow and IEEE Fellow.

Siddhartha Sen (Microsoft Research)
Sarah Bird (Facebook)
Hussein Mehanna (Facebook)
