

CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing

Wenhao Zheng · Yixiao Chen · Weitong Zhang · Souvik Kundu · Yun Li · Zhengzhong Liu · Eric Xing · Hongyi Wang · Huaxiu Yao

Abstract
