AdaGrad Meets Muon: Adaptive Stepsizes for Orthogonal Updates

Minxin Zhang ⋅ Yuxuan Liu ⋅ Hayden Schaeffer

Abstract
