Search All 2024 Events
 


Workshop
Sat 15:45 Auto-Evaluation with Few Labels through Post-hoc Regression
Benjamin Eyre · David Madras
Workshop
Multimodal Auto Validation For Self-Refinement in Web Agents
Ruhana Azam · Tamer Abuelsaad · Aditya Vempaty · Ashish Jagmohan
Workshop
Report Cards: Qualitative Evaluation of LLMs Using Natural Language Summaries
Blair Yang · Fuyang Cui · Keiran Paster · Jimmy Ba · Pashootan Vaezipoor · Silviu Pitis · Michael Zhang
Workshop
Auto-Enhance: Towards a Meta-Benchmark to Evaluate AI Agents' Ability to Improve Other Agents
Samuel Brown · Basil Labib · Codruta Lugoj · Sai Sasank Y