13 Results
Workshop | Mallows-DPO: Fine-Tune Your LLM with Preference Dispersions | Haoxian Chen · Hanyang Zhao · Henry Lam · David Yao · Wenpin Tang
Workshop | Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning | Yihe Deng · Paul Mineiro
Poster | Thu 11:00 | Cal-DPO: Calibrated Direct Preference Optimization for Language Model Alignment | Teng Xiao · Yige Yuan · Huaisheng Zhu · Mingxiao Li · Vasant Honavar
Workshop | The Crucial Role of Samplers in Online Direct Preference Optimization | Ruizhe Shi · Runlong Zhou · Simon Du
Workshop | Ablation is Not Enough to Emulate DPO: A Mechanistic Analysis of Toxicity Reduction | Yushi Yang · Filip Sondej · Harry Mayne · Adam Mahdi
Workshop | Ablation is Not Enough to Emulate DPO: Attributing Toxicity Reduction to Neurons | Yushi Yang · Filip Sondej · Harry Mayne · Adam Mahdi
Workshop | Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback | Hamish Ivison · Yizhong Wang · Jiacheng Liu · Zeqiu Wu · Valentina Pyatkin · Nathan Lambert · Noah Smith · Yejin Choi · Hannaneh Hajishirzi
Workshop | Sat 12:00 | Uncertainty-Penalized Direct Preference Optimization | Sam Houliston · Alexander Immer · Alizée Pace · Gunnar Rätsch
Poster | Thu 16:30 | Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback | Hamish Ivison · Yizhong Wang · Jiacheng Liu · Zeqiu Wu · Valentina Pyatkin · Nathan Lambert · Noah Smith · Yejin Choi · Hannaneh Hajishirzi
Workshop | Model Editing as a Robust and Denoised variant of DPO: A Case Study on Toxicity | Rheeya Uppaal · Apratim Dey · Yiting He · Yiqiao Zhong · Junjie Hu
Poster | Thu 16:30 | β-DPO: Direct Preference Optimization with Dynamic β | Junkang Wu · Yuexiang Xie · Zhengyi Yang · Jiancan Wu · Jinyang Gao · Bolin Ding · Xiang Wang · Xiangnan He