NeuSym-HLS: Learning-Driven Symbolic Distillation in High-Level Synthesis of Hardware Accelerators
Abstract
Domain-specific hardware accelerators for deep neural network (DNN) inference have been widely adopted. Traditional DNN compression techniques such as pruning and quantization help, but can fall short when aggressive hardware efficiency is required. We present \textit{NeuSym-HLS}, a partial symbolic distillation and high-level synthesis flow that compresses and accelerates DNN inference for edge computing. NeuSym-HLS replaces a portion of the layers of a trained DNN with compact analytic expressions obtained via symbolic regression and generates efficient hardware accelerators from the resulting hybrid model. The accelerator for the hybrid DNN-symbolic model offers a well-balanced trade-off among algorithmic accuracy, hardware resource usage, and inference latency. Our evaluation on vision tasks shows that NeuSym-HLS reduces hardware resource usage and inference latency while maintaining model accuracy.
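To make the layer-replacement idea concrete, below is a minimal, hypothetical Python sketch: it fits a compact analytic surrogate to one trained layer from calibration activations and splices it back into the network. A least-squares polynomial fit stands in for a full symbolic-regression engine, and all names, the chosen basis, and the toy model are illustrative assumptions rather than the actual NeuSym-HLS implementation.

\begin{verbatim}
# Minimal sketch (assumptions: a PyTorch model, and a polynomial
# least-squares fit standing in for a symbolic-regression engine).
import numpy as np
import torch
import torch.nn as nn

class SymbolicSurrogate(nn.Module):
    """Replaces one trained layer with a compact analytic expression
    y ~= phi(x) @ C, where phi(x) stacks [1, x, x^2] feature terms."""
    def __init__(self, coeffs):
        super().__init__()
        self.register_buffer("coeffs",
                             torch.tensor(coeffs, dtype=torch.float32))

    def forward(self, x):
        phi = torch.cat([torch.ones_like(x[:, :1]), x, x * x], dim=1)
        return phi @ self.coeffs

def distill_layer(layer, calib_inputs):
    """Fit an analytic surrogate to a single layer from calibration data."""
    with torch.no_grad():
        y = layer(calib_inputs).numpy()
    x = calib_inputs.numpy()
    phi = np.concatenate([np.ones((x.shape[0], 1)), x, x * x], axis=1)
    coeffs, *_ = np.linalg.lstsq(phi, y, rcond=None)  # least-squares fit
    return SymbolicSurrogate(coeffs)

# Toy hybrid model: replace the middle layer of a small MLP.
mlp = nn.Sequential(nn.Linear(8, 16), nn.ReLU(),
                    nn.Linear(16, 16), nn.ReLU(),
                    nn.Linear(16, 4))
calib = torch.randn(512, 16)           # calibration activations for layer 2
mlp[2] = distill_layer(mlp[2], calib)  # splice in the analytic surrogate
print(mlp(torch.randn(3, 8)).shape)    # hybrid DNN-symbolic inference runs
\end{verbatim}

The surrogate's closed-form expression is the part that maps naturally to HLS: it has fixed, data-independent arithmetic and no stored weight matrices beyond its coefficient table, which is what makes the hybrid model attractive for resource-constrained accelerators.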