LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers

Dacheng Li ⋅ Rulin Shao ⋅ Anze Xie ⋅ Eric Xing ⋅ Joseph Gonzalez ⋅ Ion Stoica ⋅ Xuezhe Ma ⋅ Hao Zhang

Abstract
