

Poster in Workshop: ML for Systems

QAQ: Query-adaptive Mixed-precision Quantization for Large Language Models

Shuxing Li · Huanrong Liu · Zelin Wang · Ruoyang Du · S Lee · Chunlin Tian · Qingbiao Li

Abstract
