
Towards Understanding Hierarchical Learning: Benefits of Neural Representations
Minshuo Chen · Yu Bai · Jason Lee · Tuo Zhao · Huan Wang · Caiming Xiong · Richard Socher

Tue Dec 08 09:00 AM -- 11:00 AM (PST) @ Poster Session 1 #462
Deep neural networks can empirically perform efficient hierarchical learning, in which the layers learn useful representations of the data. However, how they make use of the intermediate representations is not explained by recent theories that relate them to "shallow learners" such as kernels. In this work, we demonstrate that intermediate neural representations add more flexibility to neural networks and can be advantageous over raw inputs. We consider a fixed, randomly initialized neural network as a representation function fed into another trainable network. When the trainable network is the quadratic Taylor model of a wide two-layer network, we show that the neural representation can achieve improved sample complexity compared with the raw input: for learning a low-rank degree-$p$ polynomial ($p \geq 4$) in $d$ dimensions, the neural representation requires only $\widetilde{O}(d^{\lceil p/2 \rceil})$ samples, while the best-known sample complexity upper bound for the raw input is $\widetilde{O}(d^{p-1})$. We contrast our result with a lower bound showing that neural representations do not improve over the raw input (in the infinite-width limit) when the trainable network is instead a neural tangent kernel. Our results characterize when neural representations are beneficial, and may provide a new perspective on why depth is important in deep learning.
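
The setup described in the abstract can be illustrated with a minimal NumPy sketch: a fixed random layer produces the representation, and a quadratic Taylor model of a two-layer network is evaluated on top of it. This is only an illustration under stated assumptions, not the authors' implementation: the ReLU feature map, the softplus activation, the scaling constants, and all names (`neural_representation`, `quadratic_taylor_model`, `V`, `W0`, `W`, `a`) are hypothetical choices made here.

```python
import numpy as np

def sigmoid(z):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def softplus_dd(z):
    # Second derivative of softplus(z) = log(1 + exp(z)).
    # Assumption: a smooth activation is used so the quadratic term is nonzero.
    s = sigmoid(z)
    return s * (1.0 - s)

def neural_representation(X, V, b):
    # Fixed random features h(x) = relu(V x + b) / sqrt(D).
    # V and b are drawn once at initialization and never trained.
    D = V.shape[0]
    return np.maximum(X @ V.T + b, 0.0) / np.sqrt(D)

def quadratic_taylor_model(H, W0, W, a):
    # Second-order Taylor term of a width-m two-layer network around W0:
    #   f(h) = 1 / (2 sqrt(m)) * sum_r a_r * sigma''(w0_r . h) * (w_r . h)^2
    # Only W is trainable; W0 and a stay at their initial values.
    m = W0.shape[0]
    curvature = softplus_dd(H @ W0.T)   # shape (n, m)
    moved = (H @ W.T) ** 2              # shape (n, m)
    return (curvature * moved) @ a / (2.0 * np.sqrt(m))

# Small illustrative sizes: n samples, input dim d, representation dim D, width m.
rng = np.random.default_rng(0)
n, d, D, m = 8, 5, 16, 32
X = rng.standard_normal((n, d))
V = rng.standard_normal((D, d))
b = rng.standard_normal(D)
W0 = rng.standard_normal((m, D))
a = rng.choice([-1.0, 1.0], size=m)
W = 0.1 * rng.standard_normal((m, D))

H = neural_representation(X, V, b)             # representation of the raw input
preds = quadratic_taylor_model(H, W0, W, a)    # one prediction per sample
```

Because the model is quadratic in `W`, doubling `W` multiplies every prediction by four, which is what distinguishes it from a linearized (neural tangent kernel) model. As a worked instance of the stated bounds, for $p = 4$ they read $\widetilde{O}(d^{2})$ with the neural representation versus $\widetilde{O}(d^{3})$ for the raw input.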

Author Information

Minshuo Chen (Georgia Tech)
Yu Bai (Salesforce Research)
Jason Lee (Princeton University)
Tuo Zhao (Georgia Tech)
Huan Wang (Salesforce Research)

Huan Wang is a senior research scientist at Salesforce Research. His research interests include machine learning, big data analytics, computer vision and NLP. He was previously a research scientist at Microsoft AI Research and Yahoo’s New York Labs, and an adjunct professor at the engineering school of New York University. He received a Ph.D. in Computer Science from Yale University in 2013. Before that, he received an M.Phil. from The Chinese University of Hong Kong and a B.Eng. from Zhejiang University, both in information engineering.

Caiming Xiong (Salesforce)
Richard Socher (Salesforce)

Richard Socher is Chief Scientist at Salesforce. He leads the company’s research efforts and brings state-of-the-art artificial intelligence solutions into the platform. Previously, Richard was an adjunct professor at the Stanford Computer Science Department and the CEO and founder of MetaMind, a startup acquired by Salesforce in April 2016. MetaMind’s deep learning AI platform analyzes, labels and makes predictions on image and text data so businesses can make smarter, faster and more accurate decisions.

More from the Same Authors