Timezone: »
The stunning empirical successes of neural networks currently lack rigorous theoretical eplanation. What form would such an explanation take, in the face of existing complexity-theoretic lower bounds? A first step might be to show that data generated by neural networks a single hidden layer, smooth activation functions and benign input distributions can be learned efficiently. We demonstrate here a comprehensive lower bound ruling out this possibility: for a wide class of activation functions (including all currently used), and inputs drawn from any logconcave distribution, there is a family of one-hidden-layer functions whose output is a sum gate that are hard to learn in a precise sense: any statistical query algorithm (which includes all known variants of stochastic gradient descent with any loss function) needs an exponential number of queries even using tolerance inversely proportional to the input dimensionality. Moreover, this hard family of functions is realizable with a small (sublinear in dimension) number of activation units in the single hidden layer. The lower bound is also robust to small perturbations of the true weights. Systematic experiments illustrate a phase transition in the training error as predicted by the analysis.
Author Information
Le Song (Ant Financial & Georgia Institute of Technology)
Santosh Vempala (Georgia Tech)
John Wilmes (Georgia Institute of Technology)
Bo Xie (Georgia Tech)
Related Events (a corresponding poster, oral, or spotlight)
-
2017 Poster: On the Complexity of Learning Neural Networks »
Wed. Dec 6th 02:30 -- 06:30 AM Room Pacific Ballroom #206
More from the Same Authors
-
2021 : Scallop: From Probabilistic Deductive Databases to Scalable Differentiable Reasoning »
Jiani Huang · Ziyang Li · Binghong Chen · Karan Samel · Mayur Naik · Le Song · Xujie Si -
2021 : Large Scale Coordination Transfer for Cooperative Multi-Agent Reinforcement Learning »
Ethan Wang · Binghong Chen · Le Song -
2023 Poster: xTrimoGene: An Efficient and Scalable Representation Learner for Single-Cell RNA-Seq Data »
Jing Gong · Minsheng Hao · Xin Zeng · Chiming Liu · Jianzhu Ma · Xingyi Cheng · Taifeng Wang · Xuegong Zhang · Le Song -
2023 Poster: Injecting Multimodal Information into Rigid Protein Docking via Bi-level Optimization »
Ruijia Wang · YiWu Sun · Yujie Luo · Cheng Yang · Shaochuan Li · Xingyi Cheng · Hui Li · Chuan Shi · Le Song -
2023 Poster: Contrastive Moments: Unsupervised Halfspace Learning in Polynomial Time »
Xinyuan Cao · Santosh Vempala -
2022 Poster: Sampling with Riemannian Hamiltonian Monte Carlo in a Constrained Space »
Yunbum Kook · Yin-Tat Lee · Ruoqi Shen · Santosh Vempala -
2022 Poster: Uncovering the Structural Fairness in Graph Contrastive Learning »
Ruijia Wang · Xiao Wang · Chuan Shi · Le Song -
2021 Poster: A Biased Graph Neural Network Sampler with Near-Optimal Regret »
Qingru Zhang · David Wipf · Quan Gan · Le Song -
2021 Poster: Locality Sensitive Teaching »
Zhaozhuo Xu · Beidi Chen · Chaojian Li · Weiyang Liu · Le Song · Yingyan Lin · Anshumali Shrivastava -
2021 Poster: Multi-task Learning of Order-Consistent Causal Graphs »
Xinshi Chen · Haoran Sun · Caleb Ellington · Eric Xing · Le Song -
2021 Poster: RoMA: Robust Model Adaptation for Offline Model-based Optimization »
Sihyun Yu · Sungsoo Ahn · Le Song · Jinwoo Shin -
2021 Poster: Scallop: From Probabilistic Deductive Databases to Scalable Differentiable Reasoning »
Jiani Huang · Ziyang Li · Binghong Chen · Karan Samel · Mayur Naik · Le Song · Xujie Si -
2020 Poster: Understanding Deep Architecture with Reasoning Layer »
Xinshi Chen · Yufei Zhang · Christoph Reisinger · Le Song -
2020 Poster: The Devil is in the Detail: A Framework for Macroscopic Prediction via Microscopic Models »
Yingxiang Yang · Negar Kiyavash · Le Song · Niao He -
2020 Spotlight: The Devil is in the Detail: A Framework for Macroscopic Prediction via Microscopic Models »
Yingxiang Yang · Negar Kiyavash · Le Song · Niao He -
2019 : Poster Session »
Pravish Sainath · Mohamed Akrout · Charles Delahunt · Nathan Kutz · Guangyu Robert Yang · Joseph Marino · L F Abbott · Nicolas Vecoven · Damien Ernst · andrew warrington · Michael Kagan · Kyunghyun Cho · Kameron Harris · Leopold Grinberg · John J. Hopfield · Dmitry Krotov · Taliah Muhammad · Erick Cobos · Edgar Walker · Jacob Reimer · Andreas Tolias · Alexander Ecker · Janaki Sheth · Yu Zhang · Maciej Wołczyk · Jacek Tabor · Szymon Maszke · Roman Pogodin · Dane Corneil · Wulfram Gerstner · Baihan Lin · Guillermo Cecchi · Jenna M Reinen · Irina Rish · Guillaume Bellec · Darjan Salaj · Anand Subramoney · Wolfgang Maass · Yueqi Wang · Ari Pakman · Jin Hyung Lee · Liam Paninski · Bryan Tripp · Colin Graber · Alex Schwing · Luke Prince · Gabriel Ocker · Michael Buice · Benjamin Lansdell · Konrad Kording · Jack Lindsey · Terrence Sejnowski · Matthew Farrell · Eric Shea-Brown · Nicolas Farrugia · Victor Nepveu · Jiwoong Im · Kristin Branson · Brian Hu · Ramakrishnan Iyer · Stefan Mihalas · Sneha Aenugu · Hananel Hazan · Sihui Dai · Tan Nguyen · Doris Tsao · Richard Baraniuk · Anima Anandkumar · Hidenori Tanaka · Aran Nayebi · Stephen Baccus · Surya Ganguli · Dean Pospisil · Eilif Muller · Jeffrey S Cheng · Gaël Varoquaux · Kamalaker Dadi · Dimitrios C Gklezakos · Rajesh PN Rao · Anand Louis · Christos Papadimitriou · Santosh Vempala · Naganand Yadati · Daniel Zdeblick · Daniela M Witten · Nicholas Roberts · Vinay Prabhu · Pierre Bellec · Poornima Ramesh · Jakob H Macke · Santiago Cadena · Guillaume Bellec · Franz Scherr · Owen Marschall · Robert Kim · Hannes Rapp · Marcio Fonseca · Oliver Armitage · Jiwoong Im · Thomas Hardcastle · Abhishek Sharma · Wyeth Bair · Adrian Valente · Shane Shang · Merav Stern · Rutuja Patil · Peter Wang · Sruthi Gorantla · Peter Stratton · Tristan Edwards · Jialin Lu · Martin Ester · Yurii Vlasov · Siavash Golkar -
2019 Workshop: Learning with Temporal Point Processes »
Manuel Rodriguez · Le Song · Isabel Valera · Yan Liu · Abir De · Hongyuan Zha -
2019 Poster: Neural Similarity Learning »
Weiyang Liu · Zhen Liu · James Rehg · Le Song -
2019 Poster: Meta Architecture Search »
Albert Shaw · Wei Wei · Weiyang Liu · Le Song · Bo Dai -
2019 Poster: Exponential Family Estimation via Adversarial Dynamics Embedding »
Bo Dai · Zhen Liu · Hanjun Dai · Niao He · Arthur Gretton · Le Song · Dale Schuurmans -
2019 Poster: Retrosynthesis Prediction with Conditional Graph Logic Network »
Hanjun Dai · Chengtao Li · Connor Coley · Bo Dai · Le Song -
2018 Poster: Learning Loop Invariants for Program Verification »
Xujie Si · Hanjun Dai · Mukund Raghothaman · Mayur Naik · Le Song -
2018 Spotlight: Learning Loop Invariants for Program Verification »
Xujie Si · Hanjun Dai · Mukund Raghothaman · Mayur Naik · Le Song -
2018 Poster: Coupled Variational Bayes via Optimization Embedding »
Bo Dai · Hanjun Dai · Niao He · Weiyang Liu · Zhen Liu · Jianshu Chen · Lin Xiao · Le Song -
2018 Poster: Learning Temporal Point Processes via Reinforcement Learning »
Shuang Li · Shuai Xiao · Shixiang Zhu · Nan Du · Yao Xie · Le Song -
2018 Spotlight: Learning Temporal Point Processes via Reinforcement Learning »
Shuang Li · Shuai Xiao · Shixiang Zhu · Nan Du · Yao Xie · Le Song -
2018 Poster: Learning towards Minimum Hyperspherical Energy »
Weiyang Liu · Rongmei Lin · Zhen Liu · Lixin Liu · Zhiding Yu · Bo Dai · Le Song -
2017 : Learning from Conditional Distributions via Dual Embeddings (poster). »
Le Song -
2017 Poster: Predicting User Activity Level In Point Processes With Mass Transport Equation »
Yichen Wang · Xiaojing Ye · Hongyuan Zha · Le Song -
2017 Poster: Learning Combinatorial Optimization Algorithms over Graphs »
Elias Khalil · Hanjun Dai · Yuyu Zhang · Bistra Dilkina · Le Song -
2017 Spotlight: Learning Combinatorial Optimization Algorithms over Graphs »
Elias Khalil · Hanjun Dai · Yuyu Zhang · Bistra Dilkina · Le Song -
2017 Poster: Deep Hyperspherical Learning »
Weiyang Liu · Yan-Ming Zhang · Xingguo Li · Zhiding Yu · Bo Dai · Tuo Zhao · Le Song -
2017 Spotlight: Deep Hyperspherical Learning »
Weiyang Liu · Yan-Ming Zhang · Xingguo Li · Zhiding Yu · Bo Dai · Tuo Zhao · Le Song -
2017 Poster: Wasserstein Learning of Deep Generative Point Process Models »
Shuai Xiao · Mehrdad Farajtabar · Xiaojing Ye · Junchi Yan · Xiaokang Yang · Le Song · Hongyuan Zha -
2016 Poster: Multistage Campaigning in Social Networks »
Mehrdad Farajtabar · Xiaojing Ye · Sahar Harati · Le Song · Hongyuan Zha -
2016 Poster: Coevolutionary Latent Feature Processes for Continuous-Time User-Item Interactions »
Yichen Wang · Nan Du · Rakshit Trivedi · Le Song -
2015 Poster: Time-Sensitive Recommendation From Recurrent User Activities »
Nan Du · Yichen Wang · Niao He · Jimeng Sun · Le Song -
2015 Poster: Scale Up Nonlinear Component Analysis with Doubly Stochastic Gradients »
Bo Xie · Yingyu Liang · Le Song -
2015 Poster: Efficient Learning of Continuous-Time Hidden Markov Models for Disease Progression »
Yu-Ying Liu · Shuang Li · Fuxin Li · Le Song · James Rehg -
2015 Poster: COEVOLVE: A Joint Point Process Model for Information Diffusion and Network Co-evolution »
Mehrdad Farajtabar · Yichen Wang · Manuel Rodriguez · Shuang Li · Hongyuan Zha · Le Song -
2015 Oral: COEVOLVE: A Joint Point Process Model for Information Diffusion and Network Co-evolution »
Mehrdad Farajtabar · Yichen Wang · Manuel Rodriguez · Shuang Li · Hongyuan Zha · Le Song -
2015 Poster: M-Statistic for Kernel Change-Point Detection »
Shuang Li · Yao Xie · Hanjun Dai · Le Song -
2014 Poster: Active Learning and Best-Response Dynamics »
Maria-Florina F Balcan · Christopher Berlind · Avrim Blum · Emma Cohen · Kaushik Patnaik · Le Song -
2014 Poster: Learning Time-Varying Coverage Functions »
Nan Du · Yingyu Liang · Maria-Florina F Balcan · Le Song -
2014 Poster: Shaping Social Activity by Incentivizing Users »
Mehrdad Farajtabar · Nan Du · Manuel Gomez Rodriguez · Isabel Valera · Hongyuan Zha · Le Song -
2014 Poster: Scalable Kernel Methods via Doubly Stochastic Gradients »
Bo Dai · Bo Xie · Niao He · Yingyu Liang · Anant Raj · Maria-Florina F Balcan · Le Song -
2013 Poster: Robust Low Rank Kernel Embeddings of Multivariate Distributions »
Le Song · Bo Dai -
2013 Poster: Scalable Influence Estimation in Continuous-Time Diffusion Networks »
Nan Du · Le Song · Manuel Gomez Rodriguez · Hongyuan Zha -
2013 Oral: Scalable Influence Estimation in Continuous-Time Diffusion Networks »
Nan Du · Le Song · Manuel Gomez Rodriguez · Hongyuan Zha -
2012 Workshop: Confluence between Kernel Methods and Graphical Models »
Le Song · Arthur Gretton · Alexander Smola -
2012 Workshop: Spectral Algorithms for Latent Variable Models »
Ankur P Parikh · Le Song · Eric Xing -
2012 Poster: Learning Networks of Heterogeneous Influence »
Nan Du · Le Song · Alexander Smola · Ming Yuan -
2012 Spotlight: Learning Networks of Heterogeneous Influence »
Nan Du · Le Song · Alexander Smola · Ming Yuan