Oral
Thu Dec 14 01:20 PM -- 01:35 PM (PST) @ La Nouvelle Orleans Ballroom A-C (level 2)
When Do Transformers Shine in RL? Decoupling Memory from Credit Assignment
Tianwei Ni · Michel Ma · Benjamin Eysenbach · Pierre-Luc Bacon
Oral
Thu Dec 14 01:35 PM -- 01:50 PM (PST) @ La Nouvelle Orleans Ballroom A-C (level 2)
Bridging RL Theory and Practice with the Effective Horizon
Cassidy Laidlaw · Stuart J Russell · Anca Dragan
Oral
Thu Dec 14 01:50 PM -- 02:05 PM (PST) @ La Nouvelle Orleans Ballroom A-C (level 2)
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov · Archit Sharma · Eric Mitchell · Christopher D Manning · Stefano Ermon · Chelsea Finn
Oral
Thu Dec 14 02:05 PM -- 02:20 PM (PST) @ La Nouvelle Orleans Ballroom A-C (level 2)
MetaBox: A Benchmark Platform for Meta-Black-Box Optimization with Reinforcement Learning
Zeyuan Ma · Hongshu Guo · Jiacheng Chen · Zhenrui Li · Guojun Peng · Yue-Jiao Gong · Yining Ma · Zhiguang Cao