4 Results
Poster | Wed 16:30 | Getting More Juice Out of the SFT Data: Reward Learning from Human Demonstration Improves SFT for LLM Alignment | Jiaxiang Li · Siliang Zeng · Hoi-To Wai · Chenliang Li · Alfredo Garcia · Mingyi Hong
Poster | Fri 16:30 | Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer | Zhihan Liu · Miao Lu · Shenao Zhang · Boyi Liu · Hongyi Guo · Yingxiang Yang · Jose Blanchet · Zhaoran Wang
Workshop | | Automatically Generating Custom Context-Driven SFT Data for LLMs with Multi-Granularity | Shanghaoran Quan
Workshop | | vTune: Verifiable Fine-Tuning Through Backdooring | Eva Zhang · Akilesh Potti · Micah Goldblum