

Poster

Multi-turn Reinforcement Learning with Preference Human Feedback

Lior Shani ⋅ Aviv Rosenberg ⋅ Asaf Cassel ⋅ Oran Lang ⋅ Daniele Calandriello ⋅ Avital Zipori ⋅ Hila Noga ⋅ Orgad Keller ⋅ Bilal Piot ⋅ Idan Szpektor ⋅ Avinatan Hassidim ⋅ Yossi Matias ⋅ Remi Munos
2024 Poster

Abstract
