530 Results
Workshop
|
Auto-Enhance: Towards a Meta-Benchmark to Evaluate AI Agents' Ability to Improve Other Agents Samuel Brown · Basil Labib · Codruta Lugoj · Sai Sasank Y |
||
Affinity Event
|
Benchmark on Peer Review Toxic Detection: A Challenging Task with a New Dataset Man Luo · Bradley Peterson · Rafael Gan · Hari Ramalingame · Navya Gangrade · Ariadne Dimarogona · Imon Banerjee · Phillip Howard |
||
Poster
|
SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types Yutao Mou · Shikun Zhang · Wei Ye |
||
Poster
|
Thu 11:00 |
MM-WLAuslan: Multi-View Multi-Modal Word-Level Australian Sign Language Recognition Dataset Xin Shen · Heming Du · Hongwei Sheng · Shuyun Wang · Hui Chen · Huiqiang Chen · Zhuojie Wu · Xiaobiao Du · Jiaying Ying · Ruihan Lu · Qingzheng Xu · Xin Yu |
|
Oral
|
Thu 16:10 |
LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low Resource and Extinct Languages Andrew M. Bean · Simi Hellsten · Harry Mayne · Jabez Magomere · Ethan Chi · Ryan Chi · Scott Hale · Hannah Rose Kirk |
|
Poster
|
Wed 16:30 |
FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models Ruinan Jin · Zikang Xu · Yuan Zhong · Qingsong Yao · DOU QI · S. Kevin Zhou · Xiaoxiao Li |
|
Poster
|
Fri 11:00 |
Beyond Aesthetics: Cultural Competence in Text-to-Image Models Nithish Kannen Senthilkumar · Arif Ahmad · Marco Andreetto · Vinodkumar Prabhakaran · Utsav Prabhu · Adji Bousso Dieng · Pushpak Bhattacharyya · Shachi Dave |
|
Poster
|
RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization Bing Yang · Changsheng Quan · Yabo Wang · Pengyu Wang · Yujie Yang · Ying Fang · Nian Shao · Hui Bu · Xin Xu · Xiaofei Li |
||
Poster
|
Wed 16:30 |
OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI Zhen Huang · Zengzhi Wang · Shijie Xia · Xuefeng Li · Haoyang Zou · Ruijie Xu · Run-Ze Fan · Lyumanshan Ye · Ethan Chern · Yixin Ye · Yikai Zhang · Yuqing Yang · Ting Wu · Binjie Wang · Shichao Sun · Yang Xiao · Yiyuan Li · Fan Zhou · Steffi Chern · Yiwei Qin · Yan Ma · Jiadi Su · Yixiu Liu · Yuxiang Zheng · Shaoting Zhang · Dahua Lin · Yu Qiao · Pengfei Liu |
|
Poster
|
Fri 16:30 |
Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition Edoardo Debenedetti · Javier Rando · Daniel Paleka · Silaghi Florin · Dragos Albastroiu · Niv Cohen · Yuval Lemberg · Reshmi Ghosh · Rui Wen · Ahmed Salem · Giovanni Cherubin · Santiago Zanella-Beguelin · Robin Schmid · Victor Klemm · Takahiro Miki · Chenhao Li · Stefan Kraft · Mario Fritz · Florian Tramer · Sahar Abdelnabi · Lea Schönherr |