Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask
Hattie Zhou · Janice Lan · Rosanne Liu · Jason Yosinski

Tue Dec 10 10:45 AM -- 12:45 PM (PST) @ East Exhibition Hall B + C #111

The recent "Lottery Ticket Hypothesis" paper by Frankle & Carbin showed that a simple approach to creating sparse networks (keep the large weights) results in models that are trainable from scratch, but only when starting from the same initial weights. The performance of these networks often exceeds the performance of the non-sparse base model, but for reasons that were not well understood. In this paper we study the three critical components of the Lottery Ticket (LT) algorithm, showing that each may be varied significantly without impacting the overall results. Ablating these factors leads to new insights for why LT networks perform as well as they do. We show why setting weights to zero is important, how signs are all you need to make the re-initialized network train, and why masking behaves like training. Finally, we discover the existence of Supermasks, or masks that can be applied to an untrained, randomly initialized network to produce a model with performance far better than chance (86% on MNIST, 41% on CIFAR-10).
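As a rough illustration of the components named in the abstract, the sketch below builds a "keep the large weights" mask from trained weights, zeroes the pruned weights while rewinding the kept ones to their initial values, and shows a sign-only reset. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function names, the single global threshold, and the choice of constant magnitude (the standard deviation of the initial weights) are all illustrative.

```python
import numpy as np

def large_final_mask(w_final, keep_fraction):
    """Binary mask that keeps the largest-magnitude trained weights.

    Assumption: one global threshold over the whole array; ties at the
    threshold are all kept, so slightly more than keep_fraction may survive.
    """
    k = max(1, int(round(keep_fraction * w_final.size)))
    threshold = np.sort(np.abs(w_final), axis=None)[-k]  # k-th largest magnitude
    return (np.abs(w_final) >= threshold).astype(w_final.dtype)

def lottery_ticket_reset(w_init, w_final, keep_fraction):
    """Zero the pruned weights and rewind the kept ones to their initial values."""
    mask = large_final_mask(w_final, keep_fraction)
    return mask * w_init, mask

def sign_constant_reset(w_init, mask):
    """Keep only the sign of each kept initial weight.

    Assumption: the shared magnitude is the std of w_init (illustrative choice).
    """
    return mask * np.sign(w_init) * np.std(w_init)

# Toy example with hand-picked values (hypothetical, not data from the paper).
w_init = np.array([[0.5, -0.2], [0.1, -0.4]])
w_final = np.array([[0.1, -0.9], [0.05, 0.7]])

pruned, mask = lottery_ticket_reset(w_init, w_final, keep_fraction=0.5)
signed = sign_constant_reset(w_init, mask)
```

In this toy case the two largest trained magnitudes (0.9 and 0.7) sit in the second column, so the mask keeps that column and the kept entries take their initial values -0.2 and -0.4. A Supermask, in the abstract's sense, would correspond to applying such a mask to the untrained `w_init` and evaluating the result without any further training.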

Author Information

Hattie Zhou (Uber)
Janice Lan (Uber AI)
Rosanne Liu (Uber AI Labs)
Jason Yosinski (Uber AI; Recursion)

Dr. Jason Yosinski is a machine learning researcher, was a founding member of Uber AI Labs, and is a scientific adviser to Recursion Pharmaceuticals and several other companies. His work focuses on building more capable and more understandable AI. As scientists and engineers build increasingly powerful AI systems, the abilities of these systems increase faster than our understanding of them, motivating much of his work on AI Neuroscience: an emerging field of study that investigates fundamental properties and behaviors of AI systems. Dr. Yosinski completed his PhD as a NASA Space Technology Research Fellow working at the Cornell Creative Machines Lab, the University of Montreal, Caltech/NASA Jet Propulsion Laboratory, and Google DeepMind. His work on AI has been featured by NPR, Fast Company, The Economist, TEDx, XKCD, and the BBC. Prior to his academic career, Jason cofounded two web technology companies and started a program in the Los Angeles school district that teaches students algebra via hands-on robotics. In his free time, Jason enjoys cooking, sailing, motorcycling, reading, paragliding, and sometimes pretending he's an artist.