Building agents capable of understanding language instructions is critical to effective and robust human-AI collaboration. Recent work focuses on training these agents via reinforcement learning in environments with synthetic language; however, instructions often define long-horizon, sparse-reward tasks, and learning policies requires many episodes of experience. We introduce ELLA: Exploration through Learned Language Abstraction, a reward shaping approach geared towards boosting sample efficiency in sparse-reward environments by correlating high-level instructions with simpler low-level constituents. ELLA has two key elements: 1) a termination classifier that identifies when agents complete low-level instructions, and 2) a relevance classifier that correlates low-level instructions with success on high-level tasks. We learn the termination classifier offline from pairs of instructions and terminal states. Notably, in a departure from prior work in language and abstraction, we learn the relevance classifier online, without relying on an explicit decomposition of high-level instructions into low-level instructions. On a suite of complex BabyAI environments with varying instruction complexity and reward sparsity, ELLA shows gains in sample efficiency relative to language-based shaping and traditional RL methods.
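To make the high-level mechanism concrete, below is a minimal Python sketch of the idea described in the abstract: add a small shaping bonus when a low-level instruction judged relevant to the current high-level task is completed. This is an illustration only, not the authors' implementation; the names `termination_clf`, `relevance_clf`, `low_level_instrs`, and `bonus` are assumptions introduced here, and the sketch omits how the classifiers are trained and how shaping interacts with the underlying RL algorithm.

```python
from typing import Callable, List

def shaped_reward(
    env_reward: float,
    state: object,
    high_level_instr: str,
    low_level_instrs: List[str],
    termination_clf: Callable[[object, str], bool],  # offline-trained: did this low-level instruction terminate in `state`?
    relevance_clf: Callable[[str, str], bool],        # online-learned: is this low-level instruction relevant to the task?
    bonus: float = 0.1,                               # illustrative shaping magnitude
) -> float:
    """Return the environment reward plus a bonus for each completed,
    relevant low-level instruction (hypothetical interface)."""
    shaping = 0.0
    for low in low_level_instrs:
        if termination_clf(state, low) and relevance_clf(high_level_instr, low):
            shaping += bonus
    return env_reward + shaping
```

The sketch captures only the reward-shaping step; the paper's contribution lies in learning the two classifiers (the termination classifier offline, the relevance classifier online) rather than in this bookkeeping.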
Author Information
Suvir Mirchandani (Stanford University)
Siddharth Karamcheti (Brown University)
Dorsa Sadigh (Stanford University)
More from the Same Authors
- 2021: When Humans Aren't Optimal: Robots that Collaborate with Risk-Aware Humans
  Minae Kwon · Erdem Biyik · Aditi Talati · Karan Bhasin · Dylan Losey · Dorsa Sadigh
- 2022: Panel Discussion
  Kamalika Chaudhuri · Been Kim · Dorsa Sadigh · Huan Zhang · Linyi Li
- 2022: Invited Talk: Dorsa Sadigh
  Dorsa Sadigh
- 2022: Dorsa Sadigh: Aligning Robot Representations with Humans
  Dorsa Sadigh
- 2022: Aligning Humans and Robots: Active Elicitation of Informative and Compatible Queries
  Dorsa Sadigh
- 2022: Invited Talk: Dorsa Sadigh
  Dorsa Sadigh · Siddharth Karamcheti
- 2022 Poster: Assistive Teaching of Motor Control Tasks to Humans
  Megha Srivastava · Erdem Biyik · Suvir Mirchandani · Noah Goodman · Dorsa Sadigh
- 2022 Poster: Training and Inference on Any-Order Autoregressive Models the Right Way
  Andy Shih · Dorsa Sadigh · Stefano Ermon
- 2021: Invited Talk: Dorsa Sadigh (Stanford University) on The Role of Conventions in Adaptive Human-AI Interaction
  Dorsa Sadigh
- 2021 Poster: HyperSPNs: Compact and Expressive Probabilistic Circuits
  Andy Shih · Dorsa Sadigh · Stefano Ermon
- 2021 Poster: Confidence-Aware Imitation Learning from Demonstrations with Varying Optimality
  Songyuan Zhang · Zhangjie Cao · Dorsa Sadigh · Yanan Sui
- 2020: Discussion Panel
  Pete Florence · Dorsa Sadigh · Carolina Parada · Jeannette Bohg · Roberto Calandra · Peter Stone · Fabio Ramos
- 2020: Invited Talk - "Walking the Boundary of Learning and Interaction"
  Dorsa Sadigh · Erdem Biyik
- 2018: Panel
  Yimeng Zhang · Alfredo Canziani · Marco Pavone · Dorsa Sadigh · Kurt Keutzer
- 2018: Invited Talk: Dorsa Sadigh, Stanford
  Dorsa Sadigh
- 2018: Dorsa Sadigh
  Dorsa Sadigh
- 2018 Poster: Multi-Agent Generative Adversarial Imitation Learning
  Jiaming Song · Hongyu Ren · Dorsa Sadigh · Stefano Ermon