Poster
|
Wed 8:45
|
Direct Preference-based Policy Optimization without Reward Modeling
Gaon An · Junhyeok Lee · Xingdong Zuo · Norio Kosaka · Kyung-Min Kim · Hyun Oh Song
|
|
Workshop
|
|
Group Preference Optimization: Few-Shot Alignment of Large Language Models
Siyan Zhao · John Dang · Aditya Grover
|
|
Workshop
|
|
Zero-shot Cross-task Preference Alignment for Offline RL via Optimal Transport
Runze Liu · Yali Du · Fengshuo Bai · Jiafei Lyu · Xiu Li
|
|
Workshop
|
|
Beyond Reverse KL: Generalizing Direct Preference Optimization with Diverse Divergence Constraints
Chaoqi Wang · Yibo Jiang · Chenghao Yang · Han Liu · Yuxin Chen
|
|
Workshop
|
|
Optimal Transport for Measures with Noisy Tree Metric
Tam Le · Truyen Nguyen · Kenji Fukumizu
|
|
Oral
|
Thu 13:50
|
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov · Archit Sharma · Eric Mitchell · Christopher D Manning · Stefano Ermon · Chelsea Finn
|
|
Poster
|
Thu 8:45
|
A fast heuristic to optimize time-space tradeoff for large models
Akifumi Imanishi · Zijian Xu · Masayuki Takagi · Sixue Wang · Emilio Castillo
|
|
Poster
|
Thu 15:00
|
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov · Archit Sharma · Eric Mitchell · Christopher D Manning · Stefano Ermon · Chelsea Finn
|
|
Workshop
|
|
Preference-Guided Bayesian Optimization for Control Policy Learning: Application to Personalized Plasma Medicine
Ketong Shao · Diego Romeres · Ankush Chakrabarty · Ali Mesbah
|
|