Bridging Symbols from Language and Hierarchical Reinforcement Learning with Active Imitation
Abstract
Large Language Models (LLMs) have shown potential for interacting with reinforcement learning (RL) agents, but the main challenge is to align the world model learned by the agent with a representation compatible with LLMs. We address this problem with SGIM-STAR, a hierarchical RL algorithm that builds a discrete world representation online through RL exploration. SGIM-STAR augments STAR with a partition-wise, learning-progress-driven switch between a learned Q-learning Navigator and an LLM Navigator: the agent constructs a discrete reachability-based partition online and uses intrinsic motivation to query the LLM only when beneficial, defaulting to the learned navigator otherwise. This makes LLM usage cost-aware: the learned navigator dominates early, and the LLM is leveraged as the representation matures. On AntMaze, SGIM-STAR achieves the best and most stable success rate among STAR, an LLM-only variant, and a non-partitioned adaptive variant, avoiding mid-training collapses while reducing LLM calls. These results demonstrate a practical fusion of LLMs with emerging symbolic world models for long-horizon tasks.
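To make the partition-wise switching rule concrete, the following Python sketch shows one plausible learning-progress gate between the two navigators. All names (history, learning_progress, choose_navigator, WINDOW, PROGRESS_EPS) and the specific progress estimate are illustrative assumptions for this sketch, not the paper's actual implementation.

```python
# A minimal sketch of a partition-wise, learning-progress-driven switch
# between a learned Q-learning navigator and an LLM navigator.
# All identifiers and thresholds here are hypothetical.
from collections import defaultdict, deque

WINDOW = 20          # assumed number of recent episodes per partition
PROGRESS_EPS = 0.01  # assumed threshold: below this progress, query the LLM

# Per-partition history of recent success (1.0) / failure (0.0) outcomes.
history = defaultdict(lambda: deque(maxlen=WINDOW))

def learning_progress(partition_id):
    """Estimate progress as the absolute change in mean success between
    the older and newer halves of this partition's recent outcomes."""
    h = list(history[partition_id])
    if len(h) < WINDOW:
        # Too little data: report high progress so the cheap learned
        # navigator keeps control early on.
        return float("inf")
    half = len(h) // 2
    older, newer = h[:half], h[half:]
    return abs(sum(newer) / len(newer) - sum(older) / len(older))

def choose_navigator(partition_id):
    """Query the LLM only when learning progress in this partition has
    plateaued; otherwise default to the learned Q-learning navigator."""
    if learning_progress(partition_id) < PROGRESS_EPS:
        return "llm"
    return "q_learning"

def record_outcome(partition_id, success):
    """Log an episode outcome for the given partition."""
    history[partition_id].append(1.0 if success else 0.0)
```

Under this reading, partitions with few or still-changing statistics stay with the cheap learned navigator, and LLM queries concentrate where learning has plateaued, consistent with the abstract's claim that the learned navigator dominates early while the LLM is leveraged as the representation matures.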