

Oral in Workshop: Foundation Models for Decision Making

Synapse: Trajectory-as-Exemplar Prompting with Memory for Computer Control

Longtao Zheng · Rundong Wang · Xinrun Wang · Bo An

presentation: Foundation Models for Decision Making
Fri 15 Dec 6:15 a.m. PST — 3:30 p.m. PST

Abstract:

Building agents using large language models (LLMs) to control computers is an emerging research field. In this setting, the agent processes computer states and performs actions to accomplish tasks specified in natural language. Previous computer agents have demonstrated the benefits of in-context learning (ICL), i.e., prompting LLMs with a few exemplars; however, their performance is hindered by several issues. First, the limited context length of LLMs and complex computer states restrict the number of exemplars, as a single webpage can consume the entire context. Second, the exemplars in current methods, such as high-level plans and multi-choice questions, cannot represent complete trajectories, leading to suboptimal performance in tasks that require many steps or repeated actions. Third, existing computer agents rely on task-specific exemplars and overlook the similarity among tasks, resulting in poor generalization to novel tasks. To address these challenges, we introduce Synapse, a computer agent that incorporates trajectory-as-exemplar prompting and exemplar memory. Specifically, Synapse has three key components: i) state abstraction, which filters out task-irrelevant information from raw states, allowing more exemplars within the limited context; ii) trajectory-as-exemplar prompting, which prompts the LLM with complete trajectories of the abstracted states and actions for improved multi-step decision-making; and iii) exemplar memory, which stores the embeddings of exemplars and retrieves them via similarity search for generalization to novel tasks. We evaluate Synapse on MiniWoB++, a standard task suite, and Mind2Web, a real-world website benchmark. In MiniWoB++, Synapse achieves a 99.2% average success rate (a 10% relative improvement) across 64 tasks using demonstrations from only 48 tasks. Notably, Synapse is the first ICL method to solve the book-flight task in MiniWoB++. Synapse also exhibits a 53% relative improvement in average step success rate over the previous state-of-the-art prompting scheme in Mind2Web.
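
As a rough illustration of how exemplar memory and trajectory-as-exemplar prompting could fit together, the sketch below stores exemplars keyed by an embedding of their task description, retrieves the most similar one for a new task, and lays out its complete observation/action trajectory ahead of the current state. This is not the authors' implementation: the names `embed`, `Exemplar`, `ExemplarMemory`, and `build_prompt` are hypothetical, the bag-of-words embedding is a toy stand-in for whatever embedding model a real system would use, the prompt layout is assumed, and the observations are taken to be already abstracted.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


def embed(text: str) -> Counter:
    """Toy bag-of-words embedding (assumption: a stand-in for a learned embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Exemplar:
    task: str
    trajectory: list  # (abstracted_observation, action) pairs, in order


@dataclass
class ExemplarMemory:
    entries: list = field(default_factory=list)

    def add(self, exemplar: Exemplar) -> None:
        # Store the embedding next to the exemplar so retrieval only embeds the query.
        self.entries.append((embed(exemplar.task), exemplar))

    def retrieve(self, query: str, k: int = 1) -> list:
        # Brute-force similarity search: rank every stored exemplar against the query.
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [ex for _, ex in ranked[:k]]


def build_prompt(exemplars: list, task: str, observation: str) -> str:
    """Place each exemplar's full trajectory before the current task and observation."""
    lines = []
    for ex in exemplars:
        lines.append(f"Task: {ex.task}")
        for obs, act in ex.trajectory:
            lines.append(f"Observation: {obs}")
            lines.append(f"Action: {act}")
        lines.append("")
    lines += [f"Task: {task}", f"Observation: {observation}", "Action:"]
    return "\n".join(lines)


# Illustrative data only; the tasks and actions below are made up.
memory = ExemplarMemory()
memory.add(Exemplar(
    "book a flight from A to B",
    [("origin field", "type origin A"),
     ("destination field", "type destination B"),
     ("search button", "click search")],
))
memory.add(Exemplar("click the button labeled OK", [("button OK", "click OK")]))

new_task = "book a flight from C to D"
top = memory.retrieve(new_task, k=1)
prompt = build_prompt(top, new_task, "origin field")
print(prompt)
```

Because retrieval depends only on similarity between task descriptions, a new task with no demonstration of its own can still reuse the trajectory of a related one, which is the generalization idea the abstract describes. A real system would likely replace the linear scan with a vector index once the memory grows.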
