Poster

Online Iterative Reinforcement Learning from Human Feedback with General Preference Model

Chenlu Ye ⋅ Wei Xiong ⋅ Yuheng Zhang ⋅ Hanze Dong ⋅ Nan Jiang ⋅ Tong Zhang
2024 Poster

Abstract
