NeurIPS 2019 Expo Talk
Nov. 28, 2022
Intel® Nervana™ NNP: domain specific architectures for inference & training
Sponsor: Intel AI
Domain-specific accelerators provide more efficient computation for increasingly complex models. With the Intel® Nervana™ NNP for Inference (NNP-I), we've designed inference compute engines and coupled them with general-purpose Intel CPU cores on a single die to enable fast in-lining of deep learning and non-deep learning compute. This can open opportunities for researchers to develop heterogeneous algorithms. As the field moves towards training larger models, the NNP for Training (NNP-T) is designed with 4x 2D-mesh networks to connect our tensor cores and scale models across systems. This talk will cover how we designed for (1) flexibility without sacrificing performance with NNP-I, (2) scalability with NNP-T for the most complex models, and (3) programmability through software stacks that support standard frameworks.