

Poster in Workshop: ML for Systems

Learning to Shard: RL for Co-optimizing the Parallelism Degrees and Per-operator Sharding Dimensions in Distributed LLM Inference

Ruokai Yin · Sattwik Mishra · Xuan Zuo · Hokchhay Tann · Apala Guha

Abstract
