Noisy Symbolic Abstractions for Deep RL: A case study with Reward Machines
Andrew Li · Zizhao Chen · Pashootan Vaezipoor · Toryn Klassen · Rodrigo Toro Icarte · Sheila McIlraith
Event URL: https://openreview.net/forum?id=LYYhPFpcv95 »

Natural and formal languages provide an effective mechanism for humans to specify instructions and reward functions. We investigate how to generate policies via RL when reward functions are specified in a symbolic language captured by Reward Machines, an increasingly popular automaton-inspired structure. We are interested in the case where the mapping from environment state to the symbolic Reward Machine vocabulary is noisy. We formulate policy learning in Reward Machines with noisy symbolic abstractions as a special class of POMDP optimization problem and investigate several methods for addressing it, building on both existing and new techniques; the new techniques focus on predicting the Reward Machine state rather than on grounding individual symbols. We analyze these methods and evaluate them experimentally under varying degrees of uncertainty in the correct interpretation of the symbolic vocabulary. We verify the strength of our approach and the limitations of existing methods via an empirical investigation on both illustrative toy domains and partially observable deep RL domains.
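To make the setting concrete, the following is a minimal sketch of a Reward Machine whose input symbols are only known through noisy probabilities. It is not the authors' method: the task ("see a, then see b"), the names `TRANSITIONS`, `rm_step`, and `belief_update`, and the probabilities are all hypothetical, and the update naively assumes that the per-step symbol estimates are independent across time steps, which need not hold in practice.

```python
from typing import Dict, Tuple

# Hypothetical two-step task: observe symbol "a", then symbol "b".
# Each entry maps (rm_state, symbol) -> (next_rm_state, reward).
TRANSITIONS: Dict[Tuple[str, str], Tuple[str, float]] = {
    ("u0", "a"): ("u1", 0.0),
    ("u1", "b"): ("done", 1.0),
}
RM_STATES = ("u0", "u1", "done")


def rm_step(state: str, symbol: str) -> Tuple[str, float]:
    """Advance the Reward Machine; unlisted pairs self-loop with zero reward."""
    return TRANSITIONS.get((state, symbol), (state, 0.0))


def belief_update(
    belief: Dict[str, float], symbol_probs: Dict[str, float]
) -> Dict[str, float]:
    """Push a distribution over RM states through one noisy symbol estimate.

    Assumption (illustrative only): symbol_probs is a distribution over
    symbols for this step, independent of earlier steps.
    """
    new_belief = {u: 0.0 for u in RM_STATES}
    for u, p_u in belief.items():
        for symbol, p_symbol in symbol_probs.items():
            next_u, _ = rm_step(u, symbol)
            new_belief[next_u] += p_u * p_symbol
    return new_belief


# Start certain that the machine is in its initial state.
belief = {"u0": 1.0, "u1": 0.0, "done": 0.0}
# Step 1: a hypothetical detector reports "a" with probability 0.8.
belief = belief_update(belief, {"a": 0.8, "b": 0.0, "none": 0.2})
# Step 2: the detector reports "b" with probability 0.5.
belief = belief_update(belief, {"a": 0.0, "b": 0.5, "none": 0.5})
print(belief)
```

A policy could then condition on this belief over Reward Machine states instead of on a single, possibly wrong, state estimate. Tracking the machine state directly, rather than committing to a hard decision about each symbol, is the kind of distinction the abstract draws, though the specific update shown here is only an illustration.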

Author Information

Andrew Li (University of Toronto)

I am a second-year PhD student in Computer Science at the University of Toronto and the Vector Institute for Artificial Intelligence, supervised by Sheila McIlraith. My research interests lie at the intersection of Machine Learning (particularly Reinforcement Learning), AI Planning, and Knowledge Representation & Reasoning. I aim to develop AI which learns over a long lifetime by acquiring knowledge from its interactions with the world, abstracting knowledge into generalizable concepts, and reasoning at a high level to robustly handle new situations.

Zizhao Chen (University of Toronto)
Pashootan Vaezipoor (University of Toronto)

I work at the intersection of Machine Learning and Symbolic AI. I am currently working on improving SAT solvers via Reinforcement Learning.

Toryn Klassen (University of Toronto)
Rodrigo Toro Icarte (University of Toronto and Vector Institute)

I am a Ph.D. student in the knowledge representation group at the University of Toronto. I am also a member of the Canadian Artificial Intelligence Association and the Vector Institute. My supervisor is Sheila McIlraith. I did my undergrad in Computer Engineering and MSc in Computer Science at Pontificia Universidad Catolica de Chile (PUC). My master's degree was co-supervised by Alvaro Soto and Jorge Baier. While I was at PUC, I taught the undergraduate course "Introduction to Computer Programming Languages."

Sheila McIlraith (University of Toronto and Vector Institute)

More from the Same Authors

  • 2020 : Poster #6 »
    Sheila McIlraith
  • 2020 : Poster #18 »
    Pashootan Vaezipoor
  • 2021 : Avoiding Negative Side Effects by Considering Others »
    Parand Alizadeh Alamdari · Toryn Klassen · Rodrigo Toro Icarte · Sheila McIlraith
  • 2021 : BLAST: Latent Dynamics Models from Bootstrapping »
    Keiran Paster · Lev McKinney · Sheila McIlraith · Jimmy Ba
  • 2022 : Return Augmentation gives Supervised RL Temporal Compositionality »
    Keiran Paster · Silviu Pitis · Sheila McIlraith · Jimmy Ba
  • 2022 : Epistemic Side Effects & Avoiding Them (Sometimes) »
    Toryn Klassen · Parand Alizadeh Alamdari · Sheila McIlraith
  • 2022 Poster: Learning to Follow Instructions in Text-Based Games »
    Mathieu Tuli · Andrew Li · Pashootan Vaezipoor · Toryn Klassen · Scott Sanner · Sheila McIlraith
  • 2022 Poster: You Can’t Count on Luck: Why Decision Transformers and RvS Fail in Stochastic Environments »
    Keiran Paster · Sheila McIlraith · Jimmy Ba
  • 2020 : Contributed Talk: Planning from Pixels using Inverse Dynamics Models »
    Keiran Paster · Sheila McIlraith · Jimmy Ba
  • 2019 : Poster and Coffee Break 2 »
    Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall
  • 2019 : Poster Spotlight 2 »
    Aaron Sidford · Mengdi Wang · Lin Yang · Yinyu Ye · Zuyue Fu · Zhuoran Yang · Yongxin Chen · Zhaoran Wang · Ofir Nachum · Bo Dai · Ilya Kostrikov · Dale Schuurmans · Ziyang Tang · Yihao Feng · Lihong Li · Denny Zhou · Qiang Liu · Rodrigo Toro Icarte · Ethan Waldie · Toryn Klassen · Rick Valenzano · Margarita Castro · Simon Du · Sham Kakade · Ruosong Wang · Minshuo Chen · Tianyi Liu · Xingguo Li · Zhaoran Wang · Tuo Zhao · Philip Amortila · Doina Precup · Prakash Panangaden · Marc Bellemare
  • 2019 : Break / Poster Session 1 »
    Antonia Marcu · Yao-Yuan Yang · Pascale Gourdeau · Chen Zhu · Thodoris Lykouris · Jianfeng Chi · Mark Kozdoba · Arjun Nitin Bhagoji · Xiaoxia Wu · Jay Nandy · Michael T Smith · Bingyang Wen · Yuege Xie · Konstantinos Pitas · Suprosanna Shit · Maksym Andriushchenko · Dingli Yu · Gaël Letarte · Misha Khodak · Hussein Mozannar · Chara Podimata · James Foulds · Yizhen Wang · Huishuai Zhang · Ondrej Kuzelka · Alexander Levine · Nan Lu · Zakaria Mhammedi · Paul Viallard · Diana Cai · Lovedeep Gondara · James Lucas · Yasaman Mahdaviyeh · Aristide Baratin · Rishi Bommasani · Alessandro Barp · Andrew Ilyas · Kaiwen Wu · Jens Behrmann · Omar Rivasplata · Amir Nazemi · Aditi Raghunathan · Will Stephenson · Sahil Singla · Akhil Gupta · YooJung Choi · Yannic Kilcher · Clare Lyle · Edoardo Manino · Andrew Bennett · Zhi Xu · Niladri Chatterji · Emre Barut · Flavien Prost · Rodrigo Toro Icarte · Arno Blaas · Chulhee Yun · Sahin Lale · YiDing Jiang · Tharun Kumar Reddy Medini · Ashkan Rezaei · Alexander Meinke · Stephen Mell · Gary Kazantsev · Shivam Garg · Aradhana Sinha · Vishnu Lokhande · Geovani Rizk · Han Zhao · Aditya Kumar Akash · Jikai Hou · Ali Ghodsi · Matthias Hein · Tyler Sypherd · Yichen Yang · Anastasia Pentina · Pierre Gillot · Antoine Ledent · Guy Gur-Ari · Noah MacAulay · Tianzong Zhang
  • 2019 : Poster Spotlights B (13 posters) »
    Alberto Camacho · Chris Percy · Vaishak Belle · Beliz Gunel · Toryn Klassen · Tillman Weyde · Mohamed Ghalwash · Siddhant Arora · León Illanes · Jonathan Raiman · Qing Wang · Alexander Lew · So Yeon Min
  • 2019 Poster: Learning Reward Machines for Partially Observable Reinforcement Learning »
    Rodrigo Toro Icarte · Ethan Waldie · Toryn Klassen · Rick Valenzano · Margarita Castro · Sheila McIlraith
  • 2019 Spotlight: Learning Reward Machines for Partially Observable Reinforcement Learning »
    Rodrigo Toro Icarte · Ethan Waldie · Toryn Klassen · Rick Valenzano · Margarita Castro · Sheila McIlraith
  • 2018 : Poster Session »
    Carl Trimbach · Mennatullah Siam · Rodrigo Toro Icarte · Zhongtian Dai · Sheila McIlraith · Matthew Rahtz · Robert Sheline · Christopher MacLellan · Carolin Lawrence · Stefan Riezler · Dylan Hadfield-Menell · Fang-I Hsiao
  • 2018 : Teaching Multiple Tasks to an RL Agent using LTL »
    Rodrigo Toro Icarte · Sheila McIlraith