

Invited talk in Workshop: NIPS 2018 workshop on Compact Deep Neural Networks with industrial applications

Neural network compression in the wild: why aiming for high compression factors is not enough

Tim Genewein


Abstract:

The widespread use of state-of-the-art deep neural network models in the mobile, automotive and embedded domains is often hindered by the substantial computational resources required to run such models. The recent scientific literature proposes a plethora of ways to alleviate the problem, whether through efficient network architectures, efficiency-optimized hardware, or network compression methods. Unfortunately, the usefulness of a network compression method depends strongly on the other aspects (network architecture and target hardware) as well as on the task itself (classification, regression, detection, etc.), yet very few publications consider this interplay. This talk highlights some of the issues that arise from the strong interplay between network architecture, target hardware, compression algorithm and target task. It also points out shortcomings in the current literature on network compression methods, such as incomparability of results (different baseline networks, different training and data-augmentation schemes, etc.), lack of results on tasks other than classification, and the use of very different (and perhaps not very informative) quantitative performance indicators such as naive compression rate, operations per second, or the size of stored weight matrices. The talk concludes by proposing guidelines and best practices for increasing the practical applicability of network compression methods, and with a call to standardize network compression benchmarks.
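To make the point about performance indicators concrete, the sketch below (an illustration added here, not material from the talk) contrasts a naive compression rate, computed purely from parameter counts after magnitude pruning, with the effective rate once the pruned matrix is actually stored in a CSR-like sparse format. The matrix shape, pruning threshold and byte costs are all assumed for illustration.

import numpy as np

# Hypothetical example: a single pruned weight matrix.
rng = np.random.default_rng(0)
W = rng.standard_normal((1024, 1024)).astype(np.float32)
W[np.abs(W) < 1.5] = 0.0  # magnitude pruning; ~13% of weights survive

n_total = W.size
n_nonzero = np.count_nonzero(W)

# "Naive" compression rate: ratio of parameter counts, ignoring how
# the sparse matrix must actually be stored.
naive_rate = n_total / n_nonzero

# Effective rate under a CSR-like layout: float32 values plus
# int32 column indices plus int32 row pointers.
dense_bytes = n_total * 4
sparse_bytes = n_nonzero * 4 + n_nonzero * 4 + (W.shape[0] + 1) * 4
effective_rate = dense_bytes / sparse_bytes

print(f"naive compression rate:     {naive_rate:.1f}x")
print(f"effective compression rate: {effective_rate:.1f}x")

With these assumed numbers the naive rate comes out roughly twice the effective rate, since the index overhead of sparse storage is ignored by a pure parameter count. This gap between reported and realized savings is exactly the kind of issue the abstract raises about such indicators.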
