10 Results
Workshop
|
Richelieu: Self-Evolving LLM-Based Agents for AI Diplomacy Zhenyu Guan · Xiangyu Kong · Fangwei Zhong · Yizhou Wang |
||
Workshop
|
Sat 15:45 |
Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation Siyuan Wang · Zhuohan Long · Zhihao Fan · Xuanjing Huang · Zhongyu Wei |
|
Poster
|
Fri 11:00 |
Richelieu: Self-Evolving LLM-Based Agents for AI Diplomacy Zhenyu Guan · Xiangyu Kong · Fangwei Zhong · Yizhou Wang |
|
Workshop
|
Auto-Enhance: Towards a Meta-Benchmark to Evaluate AI Agents' Ability to Improve Other Agents Samuel Brown · Basil Labib · Codruta Lugoj · Sai Sasank Y |
||
Workshop
|
Sat 10:10 |
Evolving Alignment via Asymmetric Self-Play Ziyu Ye · Rishabh Agarwal · Tianqi Liu · Rishabh Joshi · Sarmishta Velury · Quoc V Le · Qijun Tan · Yuan Liu |
|
Poster
|
Fri 16:30 |
EvoCodeBench: An Evolving Code Generation Benchmark with Domain-Specific Evaluations Jia Li · Ge Li · Xuanming Zhang · YunFei Zhao · Yihong Dong · Zhi Jin · Binhua Li · Fei Huang · Yongbin Li |
|
Workshop
|
Memorization Detection Benchmark for Generative Image models Marc Molina · Felice Burn |
||
Workshop
|
Beyond Benchmarking: Automated Capability Discovery via Model Self-Exploration Cong Lu · Shengran Hu · Jeff Clune |
||
Workshop
|
Benchmarking Self-Supervised Learning for Single-Cell Data Philip Toma · Olga Ovcharenko · Imant Daunhawer · Julia Vogt · Florian Barkmann · Valentina Boeva |
||
Poster
|
Wed 11:00 |
SciInstruct: a Self-Reflective Instruction Annotated Dataset for Training Scientific Language Models Dan Zhang · Ziniu Hu · Sining Zhoubian · Zhengxiao Du · Kaiyu Yang · Zihan Wang · Yisong Yue · Yuxiao Dong · Jie Tang |