S2D: Sorted Speculative Decoding For More Efficient Deployment of Large Language Models

Parsa Kavehzadeh ⋅ Mohammadreza Pourreza ⋅ Mojtaba Valipour ⋅ Tianshu Zhu ⋅ Haoli Bai ⋅ Ali Ghodsi ⋅ Boxing Chen ⋅ Mehdi Rezagholizadeh
