
Compression and Acceleration of Pre-trained Language Models
Lu Hou

Recently, Transformer-based pre-trained models like BERT and GPT have achieved remarkable results on various natural language understanding tasks, and even on some computer vision and multi-modal tasks. However, their large number of parameters hinders deployment both on edge devices and in the cloud. In this talk, I will introduce some recent progress on alleviating these concerns in various deployment scenarios, during both inference and training. Specifically, I will discuss compression and acceleration methods based on knowledge distillation, dynamic networks, and network quantization.

Author Information

Lu Hou (Huawei Technologies Co., Ltd)
