Why RL Updates Look Sparse: An Implicit Compass Drives Optimization Bias

Hanqing Zhu · Zhenyu Zhang · Hanxian Huang · DiJia Su · Zechun Liu · Jiawei Zhao · Igor Fedorov · Hamed Pirsiavash · Jinwon Lee · David Z. Pan · Zhangyang "Atlas" Wang · Yuandong Tian · Kai Sheng Tai

Abstract
