Spotlight in Workshop: ML for Systems

Learning to Shard: RL for Co-optimizing the Parallelism Degrees and Per-operator Sharding Dimensions in Distributed LLM Inference (Spotlight Paper)

2025 Spotlight in Workshop: ML for Systems

Abstract

Spotlight: Ruokai Yin (Yale), Learning to Shard: RL for Co-optimizing the Parallelism Degrees and Per-operator Sharding Dimensions in Distributed LLM Inference

