
NIPS 2018 workshop on Compact Deep Neural Networks with industrial applications
Lixin Fan · Zhouchen Lin · Max Welling · Yurong Chen · Werner Bailer

Fri Dec 07 05:00 AM -- 03:30 PM (PST) @ Room 517 B
Event URL: https://openreview.net/group?id=NIPS.cc/2018/Workshop/CDNNRIA

This workshop aims to bring together researchers, educators, and practitioners who are interested in techniques as well as applications of making compact and efficient neural network representations. One main theme of the workshop discussion is to build consensus in this rapidly developing field and, in particular, to establish close connections between researchers in the machine learning community and engineers in industry. We believe the workshop will be beneficial to both academic researchers and industrial practitioners.

News and announcements:

. For authors of spotlight posters, please send your one-minute slides (preferably with recorded narration) to lixin.fan01@gmail.com, or bring them on a USB stick. See you at the workshop.

. Please note the change to the workshop schedule. Due to visa issues, some speakers are unfortunately unable to attend the workshop.

. There are some reserve NIPS/NeurIPS tickets available now, on a first-come, first-served basis, for co-authors of accepted workshop papers! Please create NIPS accounts and inform us of the email addresses if reserve tickets are needed.

. For authors included in the spotlight session, please prepare short slides with a presentation time strictly within 1 minute. It is preferable to record your presentation with audio & video (as instructed e.g. at https://support.office.com/en-us/article/record-a-slide-show-with-narration-and-slide-timings-0b9502c6-5f6c-40ae-b1e7-e47d8741161c?ui=en-US&rs=en-US&ad=US#OfficeVersion=Windows).

. For authors included in the spotlight session, please also prepare a poster for your paper, and make sure that either you or your co-authors will present the poster after the coffee break.

. Please make your poster 36W x 48H inches or 90 x 122 cm. Make sure your poster is in portrait orientation and does not exceed the maximum size, since we have limited space for the poster session.

For authors of the following accepted papers, please revise your submission according to the reviewers' comments to address the raised issues. If there is too much content to fit within the 3-page limit, you may use an appendix for supporting material such as proofs or detailed experimental results. The camera-ready abstract should be prepared with author information (name, email address, affiliation) using the NIPS camera-ready template.

Please submit the camera-ready abstract through OpenReview (https://openreview.net/group?id=NIPS.cc/2018/Workshop/CDNNRIA) by Nov. 12th. Use your previous submission page to update the abstract. In case you have to postpone the submission, please inform us immediately. Otherwise, the abstract will be removed from the workshop schedule.

We invite you to submit original work in, but not limited to, the following areas:

Neural network compression techniques:
. Binarization, quantization, pruning, thresholding and coding of neural networks
. Efficient computation and acceleration of deep convolutional neural networks
. Deep neural network computation in low power consumption applications (e.g., mobile or IoT devices)
. Differentiable sparsification and quantization of deep neural networks
. Benchmarking of deep neural network compression techniques
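As a concrete illustration of two of the topics above (pruning and quantization), the following minimal NumPy sketch, which is purely illustrative and not tied to any particular submission, applies unstructured magnitude pruning followed by uniform quantization to a random weight matrix:

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    k = int(round(sparsity * weights.size))
    if k == 0:
        return weights.copy()
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    return weights * (np.abs(weights) > threshold)

def uniform_quantize(weights, num_bits):
    """Uniformly quantize weights to 2**num_bits levels over their value range.

    Note: this simple sketch also re-maps pruned zeros to the nearest grid
    point; a real pipeline would typically keep pruned entries exactly zero.
    """
    levels = 2 ** num_bits - 1
    w_min, w_max = weights.min(), weights.max()
    scale = (w_max - w_min) / levels
    codes = np.round((weights - w_min) / scale)   # integer codes in [0, levels]
    return codes * scale + w_min                  # dequantized approximation

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)
w_pruned = magnitude_prune(w, sparsity=0.9)       # ~90% of entries zeroed
w_quant = uniform_quantize(w_pruned, num_bits=4)  # at most 16 distinct values
```

The compressed representation then only needs to store the integer codes and the (scale, offset) pair, rather than full-precision floats.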

Neural network representation and exchange:
. Exchange formats for (trained) neural networks
. Efficient deployment strategies for neural networks
. Industrial standardization of deep neural network representations
. Performance evaluation methods of compressed networks in application context (e.g., multimedia encoding and processing)
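To make the exchange-format theme above concrete, here is a toy sketch of serializing trained parameters to a framework-neutral structure and reloading them elsewhere. This is purely illustrative and deliberately much simpler than real interchange formats such as ONNX or NNEF:

```python
import json
import numpy as np

def export_model(layers):
    """Serialize a list of (name, weight-array) pairs to a framework-neutral JSON string."""
    payload = {
        "format_version": 1,
        "tensors": [
            {"name": name, "shape": list(w.shape), "dtype": str(w.dtype),
             "data": w.ravel().tolist()}
            for name, w in layers
        ],
    }
    return json.dumps(payload)

def import_model(blob):
    """Reload (name, array) pairs from the JSON string produced by export_model."""
    payload = json.loads(blob)
    return [(t["name"], np.array(t["data"], dtype=t["dtype"]).reshape(t["shape"]))
            for t in payload["tensors"]]

layers = [("fc1.weight", np.arange(6, dtype=np.float32).reshape(2, 3)),
          ("fc1.bias", np.zeros(2, dtype=np.float32))]
restored = import_model(export_model(layers))
```

Real exchange formats additionally describe the computation graph, operator semantics, and versioning, which is exactly where the standardization questions listed above arise.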

Video & media compression methods using DNNs, such as those developed in the MPEG group:
. Improving video coding standard development using deep neural networks
. Increasing the practical applicability of network compression methods

An extended abstract (3 pages long using the NIPS style, see https://nips.cc/Conferences/2018/PaperInformation/StyleFiles ) in PDF format should be submitted for evaluation of the originality and quality of the work. The evaluation is double-blind and the abstract must be anonymous. References may extend beyond the 3-page limit, and parallel submissions to a journal or conferences (e.g. AAAI or ICLR) are permitted.

Submissions will be accepted as contributed talks (oral) or poster presentations. Extended abstracts should be submitted through OpenReview (https://openreview.net/group?id=NIPS.cc/2018/Workshop/CDNNRIA) by 20 Oct 2018. All accepted abstracts will be posted on the workshop website and archived.

Selection policy: all submitted abstracts will be evaluated based on their novelty, soundness and impact. At the workshop we encourage DISCUSSION about NEW IDEAS; each submitter is thus expected to actively respond on the OpenReview webpage and answer any questions about their ideas. The willingness to respond in OpenReview Q/A discussions will be an important factor in the selection of accepted oral or poster presentations.

Important dates:
. Extended abstract submission deadline: 20 October 2018
. Acceptance notification: 29 October 2018
. Camera-ready submission: 12 November 2018
. Workshop: 7 December 2018

Please submit your extended abstract through the OpenReview system (https://openreview.net/group?id=NIPS.cc/2018/Workshop/CDNNRIA).
For prospective authors: please send author information to the workshop chairs (lixin.fan@nokia.com), so that your submission can be assigned to reviewers without conflicts of interest.
. Reviewers' comments will be released by Oct. 24th, and authors must reply by Oct. 27th, which leaves us two days for decision-making.
. Authors are highly encouraged to submit abstracts early, in case more time is needed to address reviewers' comments.

NIPS Complimentary workshop registration
We will help authors of accepted submissions get access to a reserve pool of NIPS tickets, so please register for the workshop early.

Accepted papers & authors:

1. Minimal Random Code Learning: Getting Bits Back from Compressed Model Parameters,
Marton Havasi, Robert Peharz, José Miguel Hernández-Lobato

2. Neural Network Compression using Transform Coding and Clustering,
Thorsten Laude, Jörn Ostermann

3. Pruning neural networks: is it time to nip it in the bud?,
Elliot J. Crowley, Jack Turner, Amos Storkey, Michael O'Boyle

4. Compressing Recurrent Neural Networks with Tensor Ring for Action Recognition,
Yu Pan, Jing Xu, Maolin Wang, Fei Wang, Kun Bai, Zenglin Xu

5. Efficient Inference on Deep Neural Networks by Dynamic Representations and Decision Gates,
Mohammad Saeed Shafiee, Mohammad Javad Shafiee, Alexander Wong

6. Iteratively Training Look-Up Tables for Network Quantization,
Fabien Cardinaux, Stefan Uhlich, Kazuki Yoshiyama, Javier Alonso García, Stephen Tiedemann, Thomas Kemp, Akira Nakamura

7. Hybrid Pruning: Thinner Sparse Networks for Fast Inference on Edge Devices,
Xiaofan Xu, Mi Sun Park, Cormac Brick

8. Compression of Acoustic Event Detection Models with Low-rank Matrix Factorization and Quantization Training,
Bowen Shi, Ming Sun, Chieh-Chi Kao, Viktor Rozgic, Spyros Matsoukas, Chao Wang

9. On Learning Wire-Length Efficient Neural Networks,
Christopher Blake, Luyu Wang, Giuseppe Castiglione, Christopher Srinavasa, Marcus Brubaker

10. FLOPs as a Direct Optimization Objective for Learning Sparse Neural Networks,
Raphael Tang, Ashutosh Adhikari, Jimmy Lin

11. Three Dimensional Convolutional Neural Network Pruning with Regularization-Based Method,
Yuxin Zhang, Huan Wang, Yang Luo, Roland Hu

12. Differentiable Training for Hardware Efficient LightNNs,
Ruizhou Ding, Zeye Liu, Ting-Wu Chin, Diana Marculescu, R.D. (Shawn) Blanton

13. Structured Pruning for Efficient ConvNets via Incremental Regularization,
Huan Wang, Qiming Zhang, Yuehai Wang, Haoji Hu

14. Block-wise Intermediate Representation Training for Model Compression,
Animesh Koratana, Daniel Kang, Peter Bailis, Matei Zaharia

15. Targeted Dropout,
Aidan N. Gomez, Ivan Zhang, Kevin Swersky, Yarin Gal, Geoffrey E. Hinton

16. Adaptive Mixture of Low-Rank Factorizations for Compact Neural Modeling,
Ting Chen, Ji Lin, Tian Lin, Song Han, Chong Wang, Denny Zhou

17. Differentiable Fine-grained Quantization for Deep Neural Network Compression,
Hsin-Pai Cheng, Yuanjun Huang, Xuyang Guo, Yifei Huang, Feng Yan, Hai Li, Yiran Chen

18. Transformer to CNN: Label-scarce distillation for efficient text classification,
Yew Ken Chia, Sam Witteveen, Martin Andrews

19. EnergyNet: Energy-Efficient Dynamic Inference,
Yue Wang, Tan Nguyen, Yang Zhao, Zhangyang Wang, Yingyan Lin, Richard Baraniuk

20. Recurrent Convolutions: A Model Compression Point of View,
Zhendong Zhang, Cheolkon Jung

21. Rethinking the Value of Network Pruning,
Zhuang Liu, Mingjie Sun, Tinghui Zhou, Gao Huang, Trevor Darrell

22. Linear Backprop in non-linear networks,
Mehrdad Yazdani

23. Bayesian Sparsification of Gated Recurrent Neural Networks,
Ekaterina Lobacheva, Nadezhda Chirkova, Dmitry Vetrov

24. Demystifying Neural Network Filter Pruning,
Zhuwei Qin, Fuxun Yu, Chenchen Liu, Xiang Chen

25. Learning Compact Networks via Adaptive Network Regularization,
Sivaramakrishnan Sankarapandian, Anil Kag, Rachel Manzelli, Brian Kulis

26. Pruning at a Glance: A Structured Class-Blind Pruning Technique for Model Compression
Abdullah Salama, Oleksiy Ostapenko, Moin Nabi, Tassilo Klein

27. Succinct Source Coding of Deep Neural Networks
Sourya Basu, Lav R. Varshney

28. Fast On-the-fly Retraining-free Sparsification of Convolutional Neural Networks
Amir H. Ashouri, Tarek Abdelrahman, Alwyn Dos Remedios

29. PocketFlow: An Automated Framework for Compressing and Accelerating Deep Neural Networks
Jiaxiang Wu, Yao Zhang, Haoli Bai, Huasong Zhong, Jinlong Hou, Wei Liu, Junzhou Huang

30. Universal Deep Neural Network Compression
Yoojin Choi, Mostafa El-Khamy, Jungwon Lee

31. Compact and Computationally Efficient Representations of Deep Neural Networks
Simon Wiedemann, Klaus-Robert Mueller, Wojciech Samek

32. Dynamic parameter reallocation improves trainability of deep convolutional networks
Hesham Mostafa, Xin Wang

33. Compact Neural Network Solutions to Laplace's Equation in a Nanofluidic Device
Martin Magill, Faisal Z. Qureshi, Hendrick W. de Haan

34. Distilling Critical Paths in Convolutional Neural Networks
Fuxun Yu, Zhuwei Qin, Xiang Chen

35. SeCSeq: Semantic Coding for Sequence-to-Sequence based Extreme Multi-label Classification
Wei-Cheng Chang, Hsiang-Fu Yu, Inderjit S. Dhillon, Yiming Yang

A best paper award will be presented to the contribution selected by reviewers, who will also take into account active discussions on OpenReview. One FREE NIPS ticket will be awarded to the best paper presenter.

The best paper award is given to the authors of "Rethinking the Value of Network Pruning",
Zhuang Liu, Mingjie Sun, Tinghui Zhou, Gao Huang, Trevor Darrell

Acknowledgement to reviewers

The workshop organizers gratefully acknowledge the assistance of the following people, who reviewed submissions and actively discussed with the authors:

Zhuang Liu, Ting-Wu Chin, Fuxun Yu, Huan Wang, Mehrdad Yazdani, Qigong Sun, Tim Genewein, Abdullah Salama, Anbang Yao, Chen Xu, Hao Li, Jiaxiang Wu, Zhisheng Zhong, Haoji Hu, Hesham Mostafa, Seunghyeon Kim, Xin Wang, Yiwen Guo, Yu Pan, Fereshteh Lagzi, Martin Magill, Wei-Cheng Chang, Yue Wang, Caglar Aytekin, Hannes Fassold, Martin Winter, Yunhe Wang, Faisal Qureshi, Filip Korzeniowski, Jianguo Li, Jiashi Feng, Mingjie Sun, Shiqi Wang, Tinghuai Wang, Xiangyu Zhang, Yibo Yang, Ziqian Chen, Francesco Cricri, Jan Schlüter, Jing Xu, Lingyu Duan, Maolin Wang, Naiyan Wang, Stephen Tyree, Tianshui Chen, Vasileios Mezaris, Christopher Blake, Chris Srinivasa, Giuseppe Castiglione, Amir Khoshamam, Kevin Luk, Luyu Wang, Jian Cheng, Pavlo Molchanov, Yihui He, Sam Witteveen, Peng Wang.

with special thanks to Ting-Wu Chin, who contributed 7 reviewer comments.


Workshop meeting room: 517B
Workshop schedule on December 7th, 2018:

Fri 6:00 a.m. - 6:05 a.m.
Opening and Introduction (Talk)
Fri 6:05 a.m. - 6:30 a.m.
Rethinking the Value of Network Pruning (Oral presentation)
Zhuang Liu
Fri 6:30 a.m. - 6:50 a.m.

In the post-ImageNet era, computer vision and machine learning researchers are solving more complicated AI problems using larger datasets, driving the demand for more computation. However, we are in a post-Moore’s-Law world where the amount of computation per unit cost and power is no longer increasing at its historic rate. This mismatch between supply and demand for computation highlights the need for co-designing efficient algorithms and hardware. In this talk, I will discuss bandwidth-efficient deep learning through model compression, together with efficient hardware architecture support, saving memory bandwidth, networking bandwidth, and engineer bandwidth.

Song Han
Fri 6:55 a.m. - 7:20 a.m.

Abstract: The widespread use of state-of-the-art deep neural network models in the mobile, automotive and embedded domains is often hindered by the steep computational resources required for running such models. However, the recent scientific literature proposes a plethora of ways to alleviate the problem, either at the level of efficient network architectures or efficiency-optimized hardware, or via network compression methods. Unfortunately, the usefulness of a network compression method strongly depends on the other aspects (network architecture and target hardware) as well as the task itself (classification, regression, detection, etc.), but very few publications consider this interplay. This talk highlights some of the issues that arise from the strong interplay between network architecture, target hardware, compression algorithm and target task. Additionally, some shortcomings in the current literature on network compression methods are pointed out, such as incomparability of results (different baseline networks, different training/data-augmentation schemes, etc.), lack of results on tasks other than classification, or use of very different (and perhaps not very informative) quantitative performance indicators such as naive compression rate, operations per second, or size of stored weight matrices. The talk concludes by proposing some guidelines and best practices for increasing the practical applicability of network compression methods, and a call for standardizing network compression benchmarks.

Tim Genewein
Fri 7:20 a.m. - 7:45 a.m.
Linear Backprop in non-linear networks (Oral presentation)
Mehrdad Yazdani
Fri 7:45 a.m. - 8:00 a.m.
Coffee break (morning) (break)
Fri 8:00 a.m. - 8:25 a.m.

Deploying state-of-the-art deep neural networks into high-performance production systems comes with many challenges. There is a plethora of deep learning frameworks with different operator designs and model formats. For a deployment platform developer, having a single portable model format to parse, instead of developing parsers for every framework, is very attractive.

As a pioneer in deep learning inference platforms, NVIDIA TensorRT introduced UFF as a proposed solution last year, and now more exchange formats are available, such as ONNX and NNEF.

In this talk, we will share lessons learned from TensorRT use cases in various production environments working with a portable format, with consideration of optimizations such as pruning, quantization, and auto-tuning on different target accelerators. We will also discuss some of the open challenges.

Joohoon Lee
Fri 8:30 a.m. - 9:15 a.m.
Bayesian Sparsification of Gated Recurrent Neural Networks (Oral presentation)
Nadia Chirkova
Fri 9:00 a.m. - 11:00 a.m.
Lunch break (on your own) (break)
Fri 11:00 a.m. - 11:25 a.m.

Abstract: Neural network compression has become an important research area due to its great impact on the deployment of large models on resource-constrained devices. In this talk, we will introduce two novel techniques that allow for differentiable sparsification and quantization of deep neural networks; both are achieved via appropriate smoothing of the overall objective. As a result, we can directly train architectures to be highly compressed and hardware-friendly via off-the-shelf stochastic gradient descent optimizers.

Max Welling
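The differentiable-sparsification idea in the abstract above can be illustrated on a toy problem: gate each weight with a sigmoid, add a smoothed L0-style penalty on the gates, and train everything with plain gradient descent. The following NumPy sketch uses assumed toy data and a generic smoothing scheme, not the speaker's actual method:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)

# Toy regression task: only the first 3 of 10 input features carry signal.
X = rng.normal(size=(200, 10))
true_w = np.zeros(10)
true_w[:3] = [2.0, -1.5, 1.0]
y = X @ true_w + 0.01 * rng.normal(size=200)

w = 0.1 * rng.normal(size=10)   # weights
s = np.zeros(10)                # gate logits; gate g = sigmoid(s) in (0, 1)
lam, lr = 0.05, 0.1             # sparsity penalty strength, learning rate

for _ in range(1500):
    g = sigmoid(s)
    err = X @ (w * g) - y
    grad_eff = X.T @ err / len(y)        # gradient w.r.t. the gated weights w * g
    w -= lr * grad_eff * g
    # Data term plus the gradient of the smoothed L0 penalty lam * sum(sigmoid(s)):
    s -= lr * (grad_eff * w * g * (1 - g) + lam * g * (1 - g))

gates = sigmoid(s)   # gates of uninformative features are driven toward zero
```

Because the gates stay in (0, 1) during training, the whole objective remains differentiable; after training, near-closed gates can be hardened to exact zeros to obtain a sparse network.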
Fri 11:25 a.m. - 11:50 a.m.

In the past several years, Deep Neural Networks (DNNs) have demonstrated record-breaking accuracy on a variety of artificial intelligence tasks. However, the intensive storage and computational costs of DNN models make it difficult to deploy them on mobile and embedded systems for real-time applications. In this technical talk, Dr. Yao will introduce their recent work on deep neural network compression and acceleration, showing how they achieve impressive compression performance without noticeable loss of model prediction accuracy, from the perspective of pruning and quantization.

Anbang Yao
Fri 11:50 a.m. - 12:20 p.m.
Poster spotlight session. (Spotlight presentation)
Abdullah Salama, Wei-Cheng Chang, Aidan Gomez, Raphael Tang, Fuxun Yu, Zhendong Zhang, Yuxin Zhang, Ji Lin, Stephen Tiedemann, Kun Bai, Siva Sankarapandian, Marton Havasi, Jack Turner, Dave Cheng, Yue Wang, Xiaofan Xu, Ruizhou Ding, Haoji Hu, Mohammad Shafiee, Christopher Blake, Chieh-Chi Kao, Daniel Kang, Ken Chia, Amir Ashouri, Sourya Basu, Simon Wiedemann, Thorsten Laude
Fri 12:20 p.m. - 12:30 p.m.
Coffee break (afternoon) (break)
Fri 12:30 p.m. - 1:30 p.m.
Poster presentations (Poster session)
Simon Wiedemann, Tonny Wang, Ivan Zhang, Chong Wang, Mohammad Javad Shafiee, Rachel Manzelli, Wenbing Huang, Tassilo Klein, Lifu Zhang, Ashutosh Adhikari, Faisal Qureshi, Giuseppe Castiglione
Fri 1:45 p.m. - 2:45 p.m.
Panel discussion
Max Welling, Tim Genewein, Edwin Park, Song Han
Fri 2:45 p.m. - 3:00 p.m.
Closing (Talk)

Author Information

Lixin Fan (Nokia Technologies)

Dr. Lixin Fan is a Principal Scientist affiliated with WeBank, China. His research areas of interest include machine learning & deep learning, computer vision & pattern recognition, image and video processing, 3D big data processing, data visualization & rendering, augmented and virtual reality, mobile ubiquitous and pervasive computing, and intelligent human-computer interfaces. Dr. Fan is the (co-)author of more than 60 international journal & conference publications. He has also (co-)invented more than a hundred granted and pending patents filed in the US, Europe, and China. Before joining WeBank, Dr. Fan was affiliated with Nokia Technologies and Xerox Research Center Europe (XRCE). His research work includes the well-recognized bag-of-keypoints method for image categorization.

Zhouchen Lin (Peking University)
Max Welling (University of Amsterdam / Qualcomm AI Research)
Yurong Chen (Intel Labs China)
