Timezone: »
Convolutional neural networks require significant memory bandwidth and storage for intermediate computations, apart from substantial computing resources. Neural network quantization has significant benefits in reducing the amount of intermediate results, but it often requires the full datasets and time-consuming fine tuning to recover the accuracy lost after quantization. This paper introduces the first practical 4-bit post training quantization approach: it does not involve training the quantized model (fine-tuning), nor it requires the availability of the full dataset. We target the quantization of both activations and weights and suggest three complementary methods for minimizing quantization error at the tensor level, two of whom obtain a closed-form analytical solution. Combining these methods, our approach achieves accuracy that is just a few percents less the state-of-the-art baseline across a wide range of convolutional models. The source code to replicate all experiments is available on GitHub: \url{https://github.com/submission2019/cnn-quantization}.
Author Information
Ron Banner (Intel - Artificial Intelligence Products Group (AIPG))
Yury Nahshan (Intel - Artificial Intelligence Products Group (AIPG))
Daniel Soudry (Technion)
I am an assistant professor in the Department of Electrical Engineering at the Technion, working in the areas of Machine learning and theoretical neuroscience. I am especially interested in all aspects of neural networks and deep learning. I did my post-doc (as a Gruss Lipper fellow) working with Prof. Liam Paninski in the Department of Statistics, the Center for Theoretical Neuroscience the Grossman Center for Statistics of the Mind, the Kavli Institute for Brain Science, and the NeuroTechnology Center at Columbia University. I did my Ph.D. (2008-2013, direct track) in the Network Biology Research Laboratory in the Department of Electrical Engineering at the Technion, Israel Institute of technology, under the guidance of Prof. Ron Meir. In 2008 I graduated summa cum laude with a B.Sc. in Electrical Engineering and a B.Sc. in Physics, after studying in the Technion since 2004.
More from the Same Authors
-
2021 Poster: Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks »
Itay Hubara · Brian Chmiel · Moshe Island · Ron Banner · Joseph Naor · Daniel Soudry -
2021 Poster: The Implicit Bias of Minima Stability: A View from Function Space »
Rotem Mulayoff · Tomer Michaeli · Daniel Soudry -
2021 Poster: Physics-Aware Downsampling with Deep Learning for Scalable Flood Modeling »
Niv Giladi · Zvika Ben-Haim · Sella Nevo · Yossi Matias · Daniel Soudry -
2020 Poster: Robust Quantization: One Model to Rule Them All »
moran shkolnik · Brian Chmiel · Ron Banner · Gil Shomron · Yury Nahshan · Alex Bronstein · Uri Weiser -
2020 Poster: Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy »
Edward Moroshko · Blake Woodworth · Suriya Gunasekar · Jason Lee · Nati Srebro · Daniel Soudry -
2020 Spotlight: Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy »
Edward Moroshko · Blake Woodworth · Suriya Gunasekar · Jason Lee · Nati Srebro · Daniel Soudry -
2019 : Lunch Break and Posters »
Xingyou Song · Elad Hoffer · Wei-Cheng Chang · Jeremy Cohen · Jyoti Islam · Yaniv Blumenfeld · Andreas Madsen · Jonathan Frankle · Sebastian Goldt · Satrajit Chatterjee · Abhishek Panigrahi · Alex Renda · Brian Bartoldson · Israel Birhane · Aristide Baratin · Niladri Chatterji · Roman Novak · Jessica Forde · YiDing Jiang · Yilun Du · Linara Adilova · Michael Kamp · Berry Weinstein · Itay Hubara · Tal Ben-Nun · Torsten Hoefler · Daniel Soudry · Hsiang-Fu Yu · Kai Zhong · Yiming Yang · Inderjit Dhillon · Jaime Carbonell · Yanqing Zhang · Dar Gilboa · Johannes Brandstetter · Alexander R Johansen · Gintare Karolina Dziugaite · Raghav Somani · Ari Morcos · Freddie Kalaitzis · Hanie Sedghi · Lechao Xiao · John Zech · Muqiao Yang · Simran Kaur · Qianli Ma · Yao-Hung Hubert Tsai · Ruslan Salakhutdinov · Sho Yaida · Zachary Lipton · Daniel Roy · Michael Carbin · Florent Krzakala · Lenka Zdeborová · Guy Gur-Ari · Ethan Dyer · Dilip Krishnan · Hossein Mobahi · Samy Bengio · Behnam Neyshabur · Praneeth Netrapalli · Kris Sankaran · Julien Cornebise · Yoshua Bengio · Vincent Michalski · Samira Ebrahimi Kahou · Md Rifat Arefin · Jiri Hron · Jaehoon Lee · Jascha Sohl-Dickstein · Samuel Schoenholz · David Schwab · Dongyu Li · Sang Keun Choe · Henning Petzka · Ashish Verma · Zhichao Lin · Cristian Sminchisescu -
2019 Poster: A Mean Field Theory of Quantized Deep Networks: The Quantization-Depth Trade-Off »
Yaniv Blumenfeld · Dar Gilboa · Daniel Soudry -
2018 Poster: Norm matters: efficient and accurate normalization schemes in deep networks »
Elad Hoffer · Ron Banner · Itay Golan · Daniel Soudry -
2018 Spotlight: Norm matters: efficient and accurate normalization schemes in deep networks »
Elad Hoffer · Ron Banner · Itay Golan · Daniel Soudry -
2018 Poster: Implicit Bias of Gradient Descent on Linear Convolutional Networks »
Suriya Gunasekar · Jason Lee · Daniel Soudry · Nati Srebro -
2018 Poster: Scalable methods for 8-bit training of neural networks »
Ron Banner · Itay Hubara · Elad Hoffer · Daniel Soudry -
2017 Poster: Train longer, generalize better: closing the generalization gap in large batch training of neural networks »
Elad Hoffer · Itay Hubara · Daniel Soudry -
2017 Oral: Train longer, generalize better: closing the generalization gap in large batch training of neural networks »
Elad Hoffer · Itay Hubara · Daniel Soudry -
2016 Poster: Binarized Neural Networks »
Itay Hubara · Matthieu Courbariaux · Daniel Soudry · Ran El-Yaniv · Yoshua Bengio -
2015 : Spotlight Part II »
Alex Gibberd · Kenji Doya · Bhaswar B Bhattacharya · Sakyasingha Dasgupta · Daniel Soudry -
2014 Poster: Expectation Backpropagation: Parameter-Free Training of Multilayer Neural Networks with Continuous or Discrete Weights »
Daniel Soudry · Itay Hubara · Ron Meir