Text-based games present a unique class of sequential decision-making problems in which agents interact with a partially observable, simulated environment via actions and observations conveyed through natural language. Such observations typically include instructions that, in a reinforcement learning (RL) setting, can directly or indirectly guide a player towards completing reward-worthy tasks. In this work, we study the ability of RL agents to follow such instructions. We conduct experiments showing that the performance of state-of-the-art text-based game agents is largely unaffected by the presence or absence of such instructions, and that these agents are typically unable to execute tasks to completion. To further study and address the task of instruction following, we equip RL agents with an internal structured representation of natural language instructions in the form of Linear Temporal Logic (LTL), a formal language that is increasingly used for temporally extended reward specification in RL. Our framework both supports and highlights the benefit of understanding the temporal semantics of instructions and of measuring progress towards achieving such temporally extended behaviour. Experiments with 500+ games in TextWorld demonstrate the superior performance of our approach.
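To make "measuring progress" towards an LTL-specified instruction concrete, the sketch below shows LTL progression, a standard technique for rewriting a formula after each observation. It is an illustration only and is not taken from the authors' implementation: the tuple encoding, the function name `progress`, and the propositions `has_carrot` and `carrot_cooked` are all assumptions made for this example.

```python
# Minimal LTL progression sketch (illustrative; not the authors' code).
# Formulas are True/False, a proposition name (str), or a tuple:
#   ("not", f), ("and", f, g), ("or", f, g),
#   ("next", f), ("eventually", f), ("until", f, g)

def _and(a, b):
    # Conjunction with constant folding.
    if a is False or b is False:
        return False
    if a is True:
        return b
    if b is True:
        return a
    return ("and", a, b)

def _or(a, b):
    # Disjunction with constant folding.
    if a is True or b is True:
        return True
    if a is False:
        return b
    if b is False:
        return a
    return ("or", a, b)

def _not(a):
    # Negation with constant folding.
    return (not a) if isinstance(a, bool) else ("not", a)

def progress(formula, true_props):
    """Return what remains to be satisfied after observing true_props."""
    if isinstance(formula, bool):
        return formula
    if isinstance(formula, str):
        return formula in true_props
    op = formula[0]
    if op == "not":
        return _not(progress(formula[1], true_props))
    if op == "and":
        return _and(progress(formula[1], true_props),
                    progress(formula[2], true_props))
    if op == "or":
        return _or(progress(formula[1], true_props),
                   progress(formula[2], true_props))
    if op == "next":
        return formula[1]
    if op == "eventually":
        # F f  ==  f or X(F f)
        return _or(progress(formula[1], true_props), formula)
    if op == "until":
        # f U g  ==  g or (f and X(f U g))
        return _or(progress(formula[2], true_props),
                   _and(progress(formula[1], true_props), formula))
    raise ValueError(f"unknown operator: {op!r}")

# Hypothetical instruction: "take the carrot, then cook it".
task = ("eventually", ("and", "has_carrot", ("eventually", "carrot_cooked")))

remaining = task
for observed in [set(), {"has_carrot"}, {"carrot_cooked"}]:
    remaining = progress(remaining, observed)
    done = remaining is True  # an agent could be rewarded when this holds
```

In this sketch the formula is unchanged while nothing relevant is observed, shrinks once the carrot is taken, and reduces to `True` once the carrot is cooked, which is the point at which an instruction-following reward could be issued.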
Author Information
Mathieu Tuli (University of Toronto and Vector Institute)
Andrew Li (University of Toronto)

I am a second-year PhD student in Computer Science at the University of Toronto and the Vector Institute for Artificial Intelligence, supervised by Sheila McIlraith. My research interests lie at the intersection of Machine Learning (particularly Reinforcement Learning), AI Planning, and Knowledge Representation & Reasoning. I aim to develop AI that learns over a long lifetime by acquiring knowledge from its interactions with the world, abstracting that knowledge into generalizable concepts, and reasoning at a high level to robustly handle new situations.
Pashootan Vaezipoor (University of Toronto)
Working at the intersection of Machine Learning and Symbolic AI. Currently working on improving SAT solvers via Reinforcement Learning.
Toryn Klassen (University of Toronto)
Scott Sanner (University of Toronto)
Sheila McIlraith (University of Toronto and Vector Institute)
More from the Same Authors
-
2020 : Poster #6 »
Sheila McIlraith -
2020 : Poster #18 »
Pashootan Vaezipoor -
2021 : Avoiding Negative Side Effects by Considering Others »
Parand Alizadeh Alamdari · Toryn Klassen · Rodrigo Toro Icarte · Sheila McIlraith -
2021 : Towards Robust and Automatic Hyper-Parameter Tunning »
Mathieu Tuli · Mahdi Hosseini · Konstantinos N Plataniotis -
2021 : BLAST: Latent Dynamics Models from Bootstrapping »
Keiran Paster · Lev McKinney · Sheila McIlraith · Jimmy Ba -
2022 : Graphs, Constraints, and Search for the Abstraction and Reasoning Corpus »
Yudong Xu · Elias Khalil · Scott Sanner -
2022 : Return Augmentation gives Supervised RL Temporal Compositionality »
Keiran Paster · Silviu Pitis · Sheila McIlraith · Jimmy Ba -
2022 : Noisy Symbolic Abstractions for Deep RL: A case study with Reward Machines »
Andrew Li · Zizhao Chen · Pashootan Vaezipoor · Toryn Klassen · Rodrigo Toro Icarte · Sheila McIlraith -
2022 : Epistemic Side Effects & Avoiding Them (Sometimes) »
Toryn Klassen · Parand Alizadeh Alamdari · Sheila McIlraith -
2022 Poster: You Can’t Count on Luck: Why Decision Transformers and RvS Fail in Stochastic Environments »
Keiran Paster · Sheila McIlraith · Jimmy Ba -
2021 : Poster Session 2 (gather.town) »
Wenjie Li · Akhilesh Soni · Jinwuk Seok · Jianhao Ma · Jeffery Kline · Mathieu Tuli · Miaolan Xie · Robert Gower · Quanqi Hu · Matteo Cacciola · Yuanlu Bai · Boyue Li · Wenhao Zhan · Shentong Mo · Junhyung Lyle Kim · Sajad Fathi Hafshejani · Chris Junchi Li · Zhishuai Guo · Harshvardhan Harshvardhan · Neha Wadia · Tatjana Chavdarova · Difan Zou · Zixiang Chen · Aman Gupta · Jacques Chen · Betty Shea · Benoit Dherin · Aleksandr Beznosikov -
2021 Poster: Risk-Aware Transfer in Reinforcement Learning using Successor Features »
Michael Gimelfarb · Andre Barreto · Scott Sanner · Chi-Guhn Lee -
2021 Poster: Representer Point Selection via Local Jacobian Expansion for Post-hoc Classifier Explanation of Deep Neural Networks and Ensemble Models »
Yi Sui · Ga Wu · Scott Sanner -
2020 : Contributed Talk: Planning from Pixels using Inverse Dynamics Models »
Keiran Paster · Sheila McIlraith · Jimmy Ba -
2019 : Poster and Coffee Break 2 »
Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall -
2019 : Poster Spotlight 2 »
Aaron Sidford · Mengdi Wang · Lin Yang · Yinyu Ye · Zuyue Fu · Zhuoran Yang · Yongxin Chen · Zhaoran Wang · Ofir Nachum · Bo Dai · Ilya Kostrikov · Dale Schuurmans · Ziyang Tang · Yihao Feng · Lihong Li · Denny Zhou · Qiang Liu · Rodrigo Toro Icarte · Ethan Waldie · Toryn Klassen · Rick Valenzano · Margarita Castro · Simon Du · Sham Kakade · Ruosong Wang · Minshuo Chen · Tianyi Liu · Xingguo Li · Zhaoran Wang · Tuo Zhao · Philip Amortila · Doina Precup · Prakash Panangaden · Marc Bellemare -
2019 : Poster Spotlights B (13 posters) »
Alberto Camacho · Chris Percy · Vaishak Belle · Beliz Gunel · Toryn Klassen · Tillman Weyde · Mohamed Ghalwash · Siddhant Arora · León Illanes · Jonathan Raiman · Qing Wang · Alexander Lew · So Yeon Min -
2019 Poster: Learning Reward Machines for Partially Observable Reinforcement Learning »
Rodrigo Toro Icarte · Ethan Waldie · Toryn Klassen · Rick Valenzano · Margarita Castro · Sheila McIlraith -
2019 Spotlight: Learning Reward Machines for Partially Observable Reinforcement Learning »
Rodrigo Toro Icarte · Ethan Waldie · Toryn Klassen · Rick Valenzano · Margarita Castro · Sheila McIlraith -
2018 : Poster Session »
Carl Trimbach · Mennatullah Siam · Rodrigo Toro Icarte · Zhongtian Dai · Sheila McIlraith · Matthew Rahtz · Robert Sheline · Christopher MacLellan · Carolin Lawrence · Stefan Riezler · Dylan Hadfield-Menell · Fang-I Hsiao -
2018 : Teaching Multiple Tasks to an RL Agent using LTL »
Rodrigo Toro Icarte · Sheila McIlraith -
2012 Poster: Symbolic Dynamic Programming for Continuous State and Observation POMDPs »
Zahra Zamani · Scott Sanner · Pascal Poupart · Kristian Kersting -
2011 Workshop: Choice Models and Preference Learning »
Jean-Marc Andreoli · Cedric Archambeau · Guillaume Bouchard · Shengbo Guo · Kristian Kersting · Scott Sanner · Martin Szummer · Paolo Viappiani · Onno Zoeter -
2010 Poster: Gaussian Process Preference Elicitation »
Edwin Bonilla · Shengbo Guo · Scott Sanner