Search All 2024 Events
 

34 Results

Poster
Wed 16:30 StreamBench: Towards Benchmarking Continuous Improvement of Language Agents
Cheng-Kuang Wu · Zhi Rui Tam · Chieh-Yen Lin · Yun-Nung (Vivian) Chen · Hung-yi Lee
Poster
Thu 11:00 Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?
Ruisheng Cao · Fangyu Lei · Haoyuan Wu · Jixuan Chen · Yeqiao Fu · Hongcheng Gao · Xinzhuang Xiong · Hanchong Zhang · Wenjing Hu · Yuchen Mao · Tianbao Xie · Hongshen Xu · Danyang Zhang · Sida Wang · Ruoxi Sun · Pengcheng Yin · Caiming Xiong · Ansong Ni · Qian Liu · Victor Zhong · Lu Chen · Kai Yu · Tao Yu
Workshop
Can VLMs Play Action Role-Playing Games? Take Black Myth Wukong as a Study Case
Peng Chen · Pi Bu · Jun Song · Yuan Gao · Bo Zheng
Workshop
SPA-Bench: A Comprehensive Benchmark for Smartphone Agent Evaluation
Jingxuan Chen · Derek Yuen · Bin Xie · Yuhao Yang · Gongwei Chen · Zhihao Wu · Li Yixing · Xurui Zhou · Weiwen Liu · Shuai Wang · Rui Shao · Liqiang Nie · Yasheng Wang · Jianye Hao · Jun Wang · Kun Shao
Workshop
Auto-Enhance: Towards a Meta-Benchmark to Evaluate AI Agents' Ability to Improve Other Agents
Samuel Brown · Basil Labib · Codruta Lugoj · Sai Sasank Y
Poster
Wed 11:00 SustainDC: Benchmarking for Sustainable Data Center Control
Avisek Naug · Antonio Guillen-Perez · Ricardo Luna Gutierrez · Vineet Gundecha · Cullen Bash · Sahand Ghorbanpour · Sajad Mousavi · Ashwin Ramesh Babu · Dejan Markovikj · Lekhapriya Dheeraj Kashyap · Desik Rengarajan · Soumyendu Sarkar
Poster
Wed 16:30 RoleAgent: Building, Interacting, and Benchmarking High-quality Role-Playing Agents from Scripts
Jiaheng Liu · Zehao Ni · Haoran Que · Sun · Noah Wang · Jian Yang · Jiakai Wang · Hongcheng Guo · Zhongyuan Peng · Ge Zhang · Jiayi Tian · Xingyuan Bu · Ke Xu · Wenge Rong · Junran Peng · Zhao-Xiang Zhang
Poster
Fri 11:00 DiscoveryWorld: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents
Peter Jansen · Marc-Alexandre Côté · Tushar Khot · Erin Bransom · Bhavana Dalvi Mishra · Bodhisattwa Prasad Majumder · Oyvind Tafjord · Peter Clark