NeurIPS Strategies for Applying Low Rank Decomposition to Transformer-Based Models

Poster
in
Workshop: Second Workshop on Efficient Natural Language and Speech Processing (ENLSP-II)

Strategies for Applying Low Rank Decomposition to Transformer-Based Models

Habib Hajimolahoseini · Walid Ahmed · Mehdi Rezaghoizadeh · Vahid Partovi Nia · Yang Liu

[ Abstract ]

[ Poster]

Abstract:

Low rank decomposition decomposes each fully-connected layer of the transformer modules into two smaller layers using Singular Value Decomposition. The state-of-the-art techniques usually apply LRD in a single-shot, where all of thelayers are decomposed simultaneously. In this paper, we propose and compare different strategies for applying low rank decomposition to compress pre-trained transformer based models. These strategies include: layer-by-layer and progressive decomposition. We observe that progressive low rank decomposition, in which the rank is decreased incrementally results in a higher accuracy after decomposition comparing to single-shot and layer-by-layer low rank decomposition. Furthermore, in contrast with many of state-of-the-art compression methods where intensive pre-training of the compressed model is necessary, we show that progressive LRD can provide promising performance by compressing the model in the fine-tuning stage.

Chat is not available.

Poster in Workshop: Second Workshop on Efficient Natural Language and Speech Processing (ENLSP-II)

Strategies for Applying Low Rank Decomposition to Transformer-Based Models

Habib Hajimolahoseini · Walid Ahmed · Mehdi Rezaghoizadeh · Vahid Partovi Nia · Yang Liu

Poster
in
Workshop: Second Workshop on Efficient Natural Language and Speech Processing (ENLSP-II)