Invited talk
in
Workshop: NIPS 2018 workshop on Compact Deep Neural Networks with industrial applications

Challenges and lessons learned in DNN portability in production

Joohoon Lee


Abstract:

Deploying state-of-the-art deep neural networks into high-performance production systems comes with many challenges. There is a plethora of deep learning frameworks with different operator designs and model formats. For a deployment platform developer, having a single portable model format to parse, instead of developing a parser for every framework, seems very attractive.

As a pioneer in deep learning inference platforms, NVIDIA TensorRT introduced UFF as a proposed solution last year, and more exchange formats, such as ONNX and NNEF, are now available.

In this talk, we will share lessons learned from TensorRT use cases in various production environments working with a portable format, with consideration of optimizations such as pruning, quantization, and auto-tuning on different target accelerators. We will also discuss some of the open challenges.
