3 Results
Workshop | Targeted Manipulation and Deception Emerge in LLMs Trained on User* Feedback | Marcus Williams · Micah Carroll · Constantin Weisser · Adhyyan Narang · Brendan Murphy · Anca Dragan
Workshop | Targeted Manipulation and Deception Emerge in LLMs Trained on User* Feedback | Marcus Williams · Micah Carroll · Constantin Weisser · Brendan Murphy · Adhyyan Narang · Anca Dragan
Workshop | WildFeedback: Aligning LLMs With In-situ User Interactions And Feedback | Taiwei Shi · Zhuoer Wang · Longqi Yang · Ying-Chun Lin · Zexue He · Mengting Wan · Pei Zhou · Sujay Kumar Jauhar · Xiaofeng Xu · Xia Song · Jennifer Neville