OptRot: Mitigating Weight Outliers via Data-Free Rotations for Post-Training Quantization
Advait Gadhikar · Riccardo Grazzi · James Hensman
Abstract
We introduce OptRot, a data-free preprocessing method that learns fusible rotations for post-training quantization of language models. OptRot reduces weight outliers by finding rotations that minimize the sum of element-wise fourth powers of the rotated weights. We show how reducing weight outliers provably improves weight quantization performance, and how OptRot rotations can outperform both Hadamard rotations and rotations learned by the data-dependent method SpinQuant.
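As an illustration of the objective described above (a minimal sketch, not the authors' implementation), the following PyTorch snippet learns an orthogonal rotation R that minimizes the sum of element-wise fourth powers of the rotated weight matrix. Because rotations preserve the Frobenius norm, driving down the fourth powers spreads a planted outlier across coordinates. The dimensions, optimizer, learning rate, and step count are illustrative assumptions.

```python
import torch

# Hypothetical sketch of a fourth-power outlier-reduction objective.
torch.manual_seed(0)

d = 64
W = torch.randn(128, d)
W[0, 0] += 20.0  # plant a synthetic weight outlier

# Parametrize the rotation on the orthogonal manifold.
rot = torch.nn.Linear(d, d, bias=False)
torch.nn.utils.parametrizations.orthogonal(rot)

opt = torch.optim.Adam(rot.parameters(), lr=1e-2)
for step in range(300):
    opt.zero_grad()
    # rot(W) computes W @ R^T; minimize the sum of element-wise 4th powers.
    loss = (rot(W) ** 4).sum()
    loss.backward()
    opt.step()

R = rot.weight.detach()
print(f"max |W|       = {W.abs().max():.3f}")
print(f"max |W @ R^T| = {(W @ R.T).abs().max():.3f}")  # outlier is spread out
```

Since the objective depends only on the weights, no calibration data is needed, and the learned rotation can be fused into the weight matrix before quantization.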