NeurIPS 2024

Workshop

SEAL: Suite for Evaluating API-use of LLMs
Woojeong Kim · Ashish Jagmohan · Aditya Vempaty

Workshop

StateFlow: Enhancing LLM Task-Solving through State-Driven Workflows
Yiran Wu · Tianwei Yue · Shaokun Zhang · Chi Wang · Qingyun Wu

Workshop

Advancing Agentic Systems: Dynamic Task Decomposition, Tool Integration and Evaluation using Novel Metrics and Dataset
Shankar Kumar Jeyakumar · Alaa Ahmad · Adrian Gabriel

Affinity Event

Towards unearthing neglected climate innovative solutions using an LLM-based search tool
César Quilodrán-Casas · Christopher Waite · Nicole Alhadeff · Diyona Dsouza · Cathal Hughes · Larissa Kunstel-Tabet · Alyssa Gilbert

Workshop

Your Theory Is Wrong: Using Linguistic Frameworks for LLM Probing
Victoria Firsanova

Workshop

Library Learning Doesn’t: The Curious Case of the Single-Use “Library”
Ian Berlot-Attwell · Frank Rudzicz · Xujie Si

Workshop

Activation Monitoring: Advantages of Using Internal Representations for LLM Oversight
Oam Patel · Rowan Wang

Workshop

GTA: A Benchmark for General Tool Agents
Jize Wang · Ma Zerun · Yining Li · Songyang Zhang · Cailian Chen · Kai Chen · Xinyi Le

Affinity Event

LLM Unlearning EKG: Evaluations using Knowledge Graphs
Rushali Mohbe · Samuel Scarpino

Workshop

Sun 12:25

VISTA: Visual Integrated System for Tailored Automation in Math Problem Generation Using LLM
Jeongwoo Lee · Kwangsuk Park · Jihyeon Park

Workshop

Agentic Anomaly Detection for Shipping
Alexander Timms · Abigail Langbridge · Fearghal O'Donncha

Workshop

Agent S: An Open Agentic Framework that Uses Computers Like a Human
Saaket Agashe · Jiuzhou Han · Shuyu Gan · Jiachen Yang · Ang Li · Xin Eric Wang
