

Poster

Hierarchical Programmatic Option Framework for Solving Long and Repetitive Tasks

Yu-An Lin · Chen-Tao Lee · Chih-Han Yang · Guan-Ting Liu · Shao-Hua Sun

West Ballroom A-D #6309
Fri 13 Dec 11 a.m. PST — 2 p.m. PST

Abstract:

Deep reinforcement learning (deep RL) aims to learn policies that solve decision-making problems. However, most prior approaches represent policies as neural networks, making the decision-making process difficult to interpret. To address this, prior works (Trivedi et al., 2021; Liu et al., 2023; Carvalho et al., 2024) propose using human-readable programs as policies, increasing the interpretability of the decision-making pipeline. However, the programmatic policies produced by these methods cannot effectively solve long and repetitive RL tasks and fail to generalize to longer horizons during testing. To address these problems, we propose the Hierarchical Programmatic Option framework (HIPO), which solves long and repetitive RL problems with human-readable programs as options (low-level policies). Specifically, we propose a method that retrieves a set of effective, diverse, and compatible programs as options (programmatic options). We then learn a high-level policy that reuses these programmatic options to solve recurring subtasks. Our proposed framework outperforms programmatic RL and deep RL baselines on various tasks. Ablation studies justify the effectiveness of our proposed search algorithm for retrieving a set of programmatic options.
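To make the hierarchical control loop described above concrete, the sketch below shows one plausible way a high-level policy could repeatedly select among a fixed set of programmatic options, each executed as a low-level policy until it terminates. This is a minimal illustration under assumed interfaces (e.g., `ProgrammaticOption`, `run_episode`, a Gym-style `env`), not the authors' implementation.

```python
# Minimal sketch of a hierarchical loop over programmatic options.
# All names and interfaces here are illustrative assumptions, not the paper's code.

from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class ProgrammaticOption:
    """A human-readable program wrapped as an executable low-level policy."""
    source: str                           # DSL text of the program (kept for interpretability)
    act: Callable[[object], object]       # maps an observation to a primitive action
    terminated: Callable[[object], bool]  # True when the program's subtask is finished


def run_episode(env,
                options: Sequence[ProgrammaticOption],
                select_option: Callable[[object, Sequence[ProgrammaticOption]], int],
                max_steps: int = 1000) -> float:
    """Roll out one episode in which a high-level policy reuses programmatic options."""
    obs = env.reset()
    total_reward, steps = 0.0, 0
    while steps < max_steps:
        # High-level decision: pick which programmatic option to execute next.
        option = options[select_option(obs, options)]
        # Low-level execution: run the chosen program until it signals termination.
        while steps < max_steps and not option.terminated(obs):
            obs, reward, done, _ = env.step(option.act(obs))
            total_reward += reward
            steps += 1
            if done:
                return total_reward
    return total_reward
```

In this sketch, `select_option` stands in for the learned high-level policy; reusing the same option whenever a subtask recurs is what allows the hierarchy to handle long, repetitive tasks that a single flat program struggles with.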
