
Collegial Ensembles
Etai Littwin · Ben Myara · Sima Sabah · Joshua Susskind · Shuangfei Zhai · Oren Golan

Thu Dec 10 09:00 PM -- 11:00 PM (PST) @ Poster Session 6 #1873

Modern neural network performance typically improves as model size increases. A recent line of research on the Neural Tangent Kernel (NTK) of over-parameterized networks indicates that the improvement with size increase is a product of a better conditioned loss landscape. In this work, we investigate a form of over-parameterization achieved through ensembling, where we define collegial ensembles (CE) as the aggregation of multiple independent models with identical architectures, trained as a single model. We show that the optimization dynamics of CE simplify dramatically when the number of models in the ensemble is large, resembling the dynamics of wide models, yet scale much more favorably. We use recent theoretical results on the finite width corrections of the NTK to perform efficient architecture search in a space of finite width CE that aims to either minimize capacity, or maximize trainability under a set of constraints. The resulting ensembles can be efficiently implemented in practical architectures using group convolutions and block diagonal layers. Finally, we show how our framework can be used to analytically derive optimal group convolution modules originally found using expensive grid searches, without having to train a single model.
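As a rough illustration of the block diagonal implementation mentioned in the abstract, the sketch below merges several small, independent fully connected networks into a single network whose hidden layer is block diagonal, and checks that both give the same output. It is a toy under stated assumptions, not the authors' implementation: the aggregation is a plain sum (any scaling used in the paper is not reproduced), fully connected layers stand in for group convolutions, and all names and sizes (`members`, `ensemble_forward`, `merged_forward`, `m`, `d`, `n`, `k`) are hypothetical.

```python
import random

def relu(v):
    return [max(a, 0.0) for a in v]

def matvec(W, x):
    # Dense matrix-vector product on plain Python lists.
    return [sum(w * a for w, a in zip(row, x)) for row in W]

def rand_matrix(rng, rows, cols):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def block_diag(blocks):
    # Place each block on the diagonal of a larger matrix, zeros elsewhere.
    total_cols = sum(len(b[0]) for b in blocks)
    out, offset = [], 0
    for b in blocks:
        width = len(b[0])
        for row in b:
            pad_right = total_cols - offset - width
            out.append([0.0] * offset + list(row) + [0.0] * pad_right)
        offset += width
    return out

# Hypothetical sizes: m members, input dim d, hidden width n, output dim k.
rng = random.Random(0)
m, d, n, k = 4, 5, 8, 3
members = [
    (rand_matrix(rng, n, d), rand_matrix(rng, n, n), rand_matrix(rng, k, n))
    for _ in range(m)
]

def ensemble_forward(x):
    # Run every member separately and sum the outputs (assumed aggregation).
    total = [0.0] * k
    for w1, w2, w3 in members:
        y = matvec(w3, relu(matvec(w2, relu(matvec(w1, x)))))
        total = [t + a for t, a in zip(total, y)]
    return total

# Merge the members into one network:
W1 = [row for w1, _, _ in members for row in w1]   # stacked rows, (m*n, d)
W2 = block_diag([w2 for _, w2, _ in members])      # block diagonal, (m*n, m*n)
W3 = [sum((w3[i] for _, _, w3 in members), [])     # concatenated columns, (k, m*n)
      for i in range(k)]

def merged_forward(x):
    return matvec(W3, relu(matvec(W2, relu(matvec(W1, x)))))

x = [rng.gauss(0.0, 1.0) for _ in range(d)]
```

Calling `ensemble_forward(x)` and `merged_forward(x)` should agree up to floating point rounding. The merged hidden layer stores only `m * n * n` nonzero weights, whereas a dense layer of the same width would have `(m * n) ** 2`, which is the sense in which such an ensemble can be cheaper than a single wide model of equal width.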

Author Information

Etai Littwin (Apple)
Ben Myara (Apple)
Sima Sabah (Apple)
Joshua Susskind (Apple Inc.)

I was an undergraduate in Cognitive Science at UCSD from 1995-2003 (with some breaks). Then I earned a PhD from UofT in machine learning and cognitive neuroscience, with Dr. Geoff Hinton and Dr. Adam Anderson. Following grad school I moved to UCSD for a post-doctoral position. Before coming to Apple I co-founded Emotient in 2012 and led the deep learning effort for facial expression and demographics recognition. Since joining Apple, I led the Face ID neural network team responsible for face recognition, and then started a machine learning research group within the hardware organization focused on fundamental ML technology.

Shuangfei Zhai (Apple)
Oren Golan (Apple)

Related Events (a corresponding poster, oral, or spotlight)

  • 2020 Spotlight: Collegial Ensembles
    Fri. Dec 11th 03:30 -- 03:40 AM Room Orals & Spotlights: Deep Learning

More from the Same Authors