Multilayer Neural Networks (MNNs) are commonly trained using gradient descent-based methods, such as BackPropagation (BP). Inference in probabilistic graphical models is often done using variational Bayes methods, such as Expectation Propagation (EP). We show how an EP-based approach can also be used to train deterministic MNNs. Specifically, we approximate the posterior of the weights given the data using a “mean-field” factorized distribution, in an online setting. Using online EP and the central limit theorem we find an analytical approximation to the Bayes update of this posterior, as well as the resulting Bayes estimates of the weights and outputs. Despite a different origin, the resulting algorithm, Expectation BackPropagation (EBP), is very similar to BP in form and efficiency. However, it has several additional advantages: (1) Training is parameter-free, given initial conditions (prior) and the MNN architecture. This is useful for large-scale problems, where parameter tuning is a major challenge. (2) The weights can be restricted to have discrete values. This is especially useful for implementing trained MNNs in precision-limited hardware chips, thus improving their speed and energy efficiency by several orders of magnitude. We test the EBP algorithm numerically in eight binary text classification tasks. In all tasks, EBP outperforms: (1) standard BP with the optimal constant learning rate, and (2) the previously reported state of the art. Interestingly, EBP-trained MNNs with binary weights usually perform better than MNNs with continuous (real) weights, if we average the MNN output using the inferred posterior.
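The role of the central limit theorem in the abstract can be illustrated with a minimal sketch for a single sign-activation neuron with binary weights. This is not the EBP algorithm itself, only the Gaussian approximation of a neuron's input under a factorized posterior. The function names, the layer size, and the randomly drawn posterior probabilities and inputs are all assumptions made for illustration.

```python
import math
import numpy as np

def clt_sign_probability(p_plus, x):
    """Approximate P(sign(w . x) = +1) for independent binary weights.

    p_plus[i] is the (assumed) posterior probability that w_i = +1;
    otherwise w_i = -1. The sum w . x is treated as Gaussian (CLT).
    """
    mean_w = 2.0 * p_plus - 1.0          # E[w_i]
    var_w = 1.0 - mean_w ** 2            # Var[w_i] for w_i in {-1, +1}
    mu = float(np.dot(mean_w, x))        # mean of the neuron input
    sigma = math.sqrt(float(np.dot(var_w, x ** 2)))  # std of the neuron input
    # Gaussian CDF evaluated at mu / sigma
    return 0.5 * (1.0 + math.erf(mu / (sigma * math.sqrt(2.0))))

def monte_carlo_sign_probability(p_plus, x, n_samples, rng):
    """Estimate the same probability by sampling weights from the posterior."""
    w = np.where(rng.random((n_samples, p_plus.size)) < p_plus, 1.0, -1.0)
    return float(np.mean(w @ x > 0))

# Hypothetical posterior and input, chosen only for this demonstration.
rng = np.random.default_rng(0)
n = 200
p_plus = rng.uniform(0.2, 0.8, size=n)
x = rng.normal(size=n)

clt = clt_sign_probability(p_plus, x)
mc = monte_carlo_sign_probability(p_plus, x, 20000, rng)
print(f"CLT approximation: {clt:.3f}, Monte Carlo estimate: {mc:.3f}")
```

With a few hundred inputs per neuron the two estimates should agree closely, which is why a closed-form Gaussian approximation can stand in for sampling when averaging the output over the posterior.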
Author Information
Daniel Soudry (Technion)
I am an assistant professor in the Department of Electrical Engineering at the Technion, working in the areas of machine learning and theoretical neuroscience. I am especially interested in all aspects of neural networks and deep learning. I did my postdoc (as a Gruss Lipper fellow) working with Prof. Liam Paninski in the Department of Statistics, the Center for Theoretical Neuroscience, the Grossman Center for Statistics of the Mind, the Kavli Institute for Brain Science, and the NeuroTechnology Center at Columbia University. I did my Ph.D. (2008-2013, direct track) in the Network Biology Research Laboratory in the Department of Electrical Engineering at the Technion, Israel Institute of Technology, under the guidance of Prof. Ron Meir. In 2008 I graduated summa cum laude with a B.Sc. in Electrical Engineering and a B.Sc. in Physics, after studying at the Technion since 2004.
Itay Hubara (Habana Labs)
Ron Meir (Technion)
More from the Same Authors

2019 Poster: A Mean Field Theory of Quantized Deep Networks: The Quantization-Depth Trade-Off »
Yaniv Blumenfeld · Dar Gilboa · Daniel Soudry 
2019 Poster: Post training 4-bit quantization of convolutional networks for rapid-deployment »
Ron Banner · Yury Nahshan · Daniel Soudry 
2018 Poster: Norm matters: efficient and accurate normalization schemes in deep networks »
Elad Hoffer · Ron Banner · Itay Golan · Daniel Soudry 
2018 Spotlight: Norm matters: efficient and accurate normalization schemes in deep networks »
Elad Hoffer · Ron Banner · Itay Golan · Daniel Soudry 
2018 Poster: Implicit Bias of Gradient Descent on Linear Convolutional Networks »
Suriya Gunasekar · Jason Lee · Daniel Soudry · Nati Srebro 
2018 Poster: Scalable methods for 8-bit training of neural networks »
Ron Banner · Itay Hubara · Elad Hoffer · Daniel Soudry 
2017 Poster: Train longer, generalize better: closing the generalization gap in large batch training of neural networks »
Elad Hoffer · Itay Hubara · Daniel Soudry 
2017 Oral: Train longer, generalize better: closing the generalization gap in large batch training of neural networks »
Elad Hoffer · Itay Hubara · Daniel Soudry 
2016 Poster: Binarized Neural Networks »
Itay Hubara · Matthieu Courbariaux · Daniel Soudry · Ran El-Yaniv · Yoshua Bengio
2015 Poster: A Tractable Approximation to Optimal Point Process Filtering: Application to Neural Encoding »
Yuval Harel · Ron Meir · Manfred Opper 
2015 Spotlight: A Tractable Approximation to Optimal Point Process Filtering: Application to Neural Encoding »
Yuval Harel · Ron Meir · Manfred Opper 
2014 Poster: Optimal Neural Codes for Control and Estimation »
Alex K Susemihl · Ron Meir · Manfred Opper 
2011 Poster: Analytical Results for the Error in Filtering of Gaussian Processes »
Alex K Susemihl · Ron Meir · Manfred Opper 
2008 Poster: Temporal Difference Based Actor Critic Learning - Convergence and Neural Implementation »
Dotan Di Castro · Dima Volkinshtein · Ron Meir 
2007 Oral: A neural network implementing optimal state estimation based on dynamic spike train decoding »
Omer Bobrowski · Ron Meir · Shy Shoham · Yonina Eldar 
2007 Poster: A neural network implementing optimal state estimation based on dynamic spike train decoding »
Omer Bobrowski · Ron Meir · Shy Shoham · Yonina Eldar