Scaling Laws for Upcycling Mixture-of-Experts Language Models

Seng Pei Liew ⋅ Takuya Kato ⋅ Sho Takase

Abstract
