Invited talk in Workshop: NIPS 2018 Workshop on Compact Deep Neural Networks with Industrial Applications
Efficient Computation of Deep Convolutional Neural Networks: A Quantization Perspective
Max Welling
Abstract: Neural network compression has become an important research area due to its great impact on the deployment of large models on resource-constrained devices. In this talk, we will introduce two novel techniques that allow for differentiable sparsification and quantization of deep neural networks; both are achieved via appropriate smoothing of the overall objective. As a result, we can directly train architectures to be highly compressed and hardware-friendly using off-the-shelf stochastic gradient descent optimizers.
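To give a flavor of what "differentiable quantization" means, the sketch below shows one common way to push gradients through a rounding operation, a straight-through estimator in PyTorch. This is only an illustration of the general idea and an assumption on my part; it is not the specific smoothing technique presented in the talk, and the function name `ste_quantize` and the `num_bits` parameter are hypothetical.

```python
# Illustrative sketch only: straight-through quantization, a standard trick
# for training quantized networks with plain SGD. Not the speaker's method.
import torch

def ste_quantize(w: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Uniformly quantize w to 2**num_bits levels, with identity gradients."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax            # step size of the grid
    w_q = torch.round(w / scale).clamp(-qmax - 1, qmax) * scale
    # Forward pass uses the quantized value; backward pass sees the identity,
    # i.e. the non-differentiable round() is "smoothed away" for the gradient.
    return w + (w_q - w).detach()

# Usage: quantize weights inside a model's forward pass and train as usual.
w = torch.randn(64, 64, requires_grad=True)
loss = ste_quantize(w, num_bits=4).pow(2).sum()
loss.backward()                                              # gradients flow to w
```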