8 Results
| Type | Time | Title | Authors |
|---|---|---|---|
| Expo Demonstration | Tue 15:00 | EvalAssist - An LLM-as-a-Judge Framework | Werner Geyer |
| Workshop | | Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge | Jiayi Ye · Yanbo Wang · Yue Huang · Dongping Chen · Qihui Zhang · Nuno Moniz · Tian Gao · Werner Geyer · Chao Huang · Pin-Yu Chen · Nitesh Chawla · Xiangliang Zhang |
| Workshop | Sat 12:00 | Black-box Uncertainty Quantification Method for LLM-as-a-Judge | Nico Wagner · Michael Desmond · Rahul Nair · Zahra Ashktorab · Elizabeth Daly · Qian Pan · Martín Santillán Cooper · J Johnson · Werner Geyer |
| Workshop | | Self-Preference Bias in LLM-as-a-Judge | Koki Wataoka · Tsubasa Takahashi · Ryokan Ri |
| Workshop | | Decoding Biases: An Analysis of Automated Methods and Metrics for Gender Bias Detection in Language Models | Shachi H. Kumar · Saurav Sahay · Sahisnu Mazumder · Eda Okur · Ramesh Manuvinakurike · Nicole Beckage · Hsuan Su · Hung-yi Lee · Lama Nachman |
| Poster | Wed 16:30 | On scalable oversight with weak LLMs judging strong LLMs | Zachary Kenton · Noah Siegel · Janos Kramar · Jonah Brown-Cohen · Samuel Albanie · Jannis Bulian · Rishabh Agarwal · David Lindner · Yunhao Tang · Noah Goodman · Rohin Shah |
| Workshop | Sat 15:45 | Conversational Question-Answering for process task guidance in manufacturing | Ramesh Manuvinakurike · Elizabeth Watkins · Celal Savur · Anthony Rhodes · Sovan Biswas · Richard Beckwith · Gesem Mejia · Saurav Sahay · Giuseppe Raffa · Lama Nachman |
| Workshop | | Applying Multi-Fidelity Bayesian Optimization in Chemistry: Open Challenges and Major Considerations | Edmund Judge · Mohammed Azzouzi · Austin Mroz · Antonio del Rio Chanona · Kim Jelfs |