12 Results
Workshop
|
A Meta-Algorithm for Aligning LLMs with General Preferences Yixin Liu · Argyris Oikonomou · Weiqiang Zheng · Yang Cai · Arman Cohan |
||
Workshop
|
Sat 11:18 |
COMAL: A Convergent Meta-Algorithm for Aligning LLMs with General Preferences Yixin Liu · Argyris Oikonomou · Weiqiang Zheng · Yang Cai · Arman Cohan |
|
Workshop
|
Declarative characterizations of direct preference alignment algorithms Kyle Richardson · Vivek Srikumar · Ashish Sabharwal |
||
Workshop
|
Algorithmic Oversight for Deceptive Reasoning Ege Onur Taga · Mingchen Li · Yongqi Chen · Samet Oymak |
||
Workshop
|
Ablation is Not Enough to Emulate DPO: A Mechanistic Analysis of Toxicity Reduction Yushi Yang · Filip Sondej · Harry Mayne · Adam Mahdi |
||
Workshop
|
Ablation is Not Enough to Emulate DPO: A Mechanistic Analysis of Toxicity Reduction Yushi Yang · Filip Sondej · Harry Mayne · Adam Mahdi |
||
Workshop
|
Workshop Submission: Towards Making Untrainable Networks Trainable Vighnesh Subramaniam · Tomaso Poggio · Boris Katz · Brian Cheung · Andrei Barbu |
||
Workshop
|
Algorithmic Oversight for Deceptive Reasoning Ege Onur Taga · Mingchen Li · Yongqi Chen · Samet Oymak |
||
Workshop
|
MID-Space: Aligning Diverse Communities' Needs to Inclusive Public Spaces Shravan Nayak · Rashid Mushkani · Hugo Berard · Allison Cohen · Shin Koseki · Hadrien Bertrand |
||
Poster
|
Fri 11:00 |
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms Rafael Rafailov · Yaswanth Chittepu · Ryan Park · Harshit Sushil Sikchi · Joey Hejna · Brad Knox · Chelsea Finn · Scott Niekum |
|
Poster
|
Fri 11:00 |
Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms Miaosen Zhang · Yixuan Wei · Zhen Xing · Yifei Ma · Zuxuan Wu · Ji Li · Zheng Zhang · Qi Dai · Chong Luo · Xin Geng · Baining Guo |
|
Workshop
|
Rethinking Message Passing for Algorithmic Alignment Joël Mathys · Florian Grötschla · Kalyan Nadimpalli · Roger Wattenhofer |