Deep Learning at Supercomputer Scale
Erich Elsen · Danijar Hafner · Zak Stone · Brennan Saeta

Sat Dec 09 08:00 AM -- 06:30 PM (PST) @ 101 B
Event URL: https://supercomputersfordl2017.github.io/

Five years ago, it took more than a month to train a state-of-the-art image recognition model on the ImageNet dataset. Earlier this year, Facebook demonstrated that such a model could be trained in an hour. However, if we could parallelize this training problem across the world’s fastest supercomputers (~100 PFlops), it would be possible to train the same model in under a minute. This workshop is about closing that gap: how can we turn months into minutes and increase the productivity of machine learning researchers everywhere?

This one-day workshop will facilitate active debate and interaction across many different disciplines. The conversation will range from algorithms to infrastructure to silicon, with invited speakers from Cerebras, DeepMind, Facebook, Google, OpenAI, and other organizations. When should synchronous training be preferred over asynchronous training? Are large batch sizes the key to reaching supercomputer scale, or is it possible to fully utilize a supercomputer at batch size one? How important is sparsity in enabling us to scale? Should sparsity patterns be structured or unstructured? To what extent do we expect to customize model architectures for particular problem domains, and to what extent can a “single model architecture” deliver state-of-the-art results across many different domains? How can new hardware architectures unlock even higher real-world training performance?

Our goal is to bring together people who are trying to answer any of these questions, in the hope that cross-pollination will accelerate progress toward deep learning at true supercomputer scale.

Sat 8:10 a.m. - 8:30 a.m.
Generalization Gap (Presentation)
Nitish Shirish Keskar
Sat 8:30 a.m. - 8:50 a.m.
Closing the Generalization Gap (Presentation)
Itay Hubara
Sat 8:50 a.m. - 9:10 a.m.
Don't Decay the Learning Rate, Increase the Batch Size (Presentation)
Sam Smith
Sat 9:10 a.m. - 9:30 a.m.
ImageNet In 1 Hour (Presentation)
Priya Goyal
Sat 9:30 a.m. - 9:50 a.m.
Training with TPUs (Presentation)
Chris Ying
Sat 9:50 a.m. - 10:10 a.m.
Coffee Break (Break)
Sat 10:10 a.m. - 10:30 a.m.
KFAC and Natural Gradients (Presentation)
Matthew Johnson · Daniel Duckworth
Sat 10:30 a.m. - 10:50 a.m.
Neumann Optimizer (Presentation)
Shankar Krishnan
Sat 10:50 a.m. - 11:10 a.m.
Evolutionary Strategies (Presentation)
Tim Salimans
Sat 11:15 a.m. - 12:00 p.m.
Future Hardware Directions (Discussion Panel)
Gregory Diamos · Jeff Dean · Simon Knowles · Michael James · Scott Gray
Sat 1:30 p.m. - 1:50 p.m.
Learning Device Placement (Presentation)
Azalia Mirhoseini
Sat 1:50 p.m. - 2:10 p.m.
Scaling and Sparsity (Presentation)
Gregory Diamos
Sat 2:10 p.m. - 2:30 p.m.
Small World Network Architectures (Presentation)
Scott Gray
Sat 2:30 p.m. - 2:50 p.m.
Scalable RL and AlphaGo (Presentation)
Timothy Lillicrap
Sat 3:20 p.m. - 3:40 p.m.
Scaling Deep Learning to 15 PetaFlops (Presentation)
Thorsten Kurth
Sat 3:40 p.m. - 4:00 p.m.
Scalable Silicon Compute (Presentation)
Simon Knowles
Sat 4:00 p.m. - 4:20 p.m.
Practical Scaling Techniques (Presentation)
Ujval Kapasi
Sat 4:20 p.m. - 4:40 p.m.
Designing for Supercompute-Scale Deep Learning (Presentation)
Michael James
Sat 5:00 p.m. - 6:00 p.m.
Adaptive Memory Networks (Poster Session)
Daniel Li
Sat 5:00 p.m. - 6:00 p.m.
Supercomputers for Deep Learning (Poster Session)
Rangan Sukumar

Author Information

Erich Elsen (Google)
Danijar Hafner (Google Brain & UCL)
Zak Stone (Google Brain)

Zak Stone is the Product Manager for TensorFlow on the Google Brain team. He contributes to product strategy, leads the TensorFlow Research Cloud program, and enjoys interacting with TensorFlow's vibrant open-source community. Prior to joining Google, Zak founded a mobile-focused deep learning startup that was acquired by Apple. While at Apple, Zak contributed to the on-device face identification technology in iOS 10 and macOS Sierra that was announced at WWDC 2016.

Brennan Saeta (Google)
