Oral
Thu Dec 14 01:20 PM -- 01:35 PM (PST) @ La Nouvelle Orleans Ballroom A-C (level 2)
When Do Transformers Shine in RL? Decoupling Memory from Credit Assignment
In Oral 6B RL
[OpenReview]
Oral
Thu Dec 14 01:35 PM -- 01:50 PM (PST) @ La Nouvelle Orleans Ballroom A-C (level 2)
Bridging RL Theory and Practice with the Effective Horizon
In Oral 6B RL
[OpenReview]
Oral
Thu Dec 14 01:50 PM -- 02:05 PM (PST) @ La Nouvelle Orleans Ballroom A-C (level 2)
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
In Oral 6B RL
[OpenReview]
Oral
Thu Dec 14 02:05 PM -- 02:20 PM (PST) @ La Nouvelle Orleans Ballroom A-C (level 2)
MetaBox: A Benchmark Platform for Meta-Black-Box Optimization with Reinforcement Learning
In Oral 6B RL
[Slides] [OpenReview]