Timezone: »
Spotlight
Repeated Inverse Reinforcement Learning
Kareem Amin · Nan Jiang · Satinder Singh
Wed Dec 06 05:15 PM -- 05:20 PM (PST) @ Hall A
We introduce a novel repeated Inverse Reinforcement Learning problem: the agent has to act on behalf of a human in a sequence of tasks and wishes to minimize the number of tasks that it surprises the human by acting suboptimally with respect to how the human would have acted. Each time the human is surprised the agent is provided a demonstration of the desired behavior by the human. We formalize this problem, including how the sequence of tasks is chosen, in a few different ways and provide some foundational results.
Author Information
Kareem Amin (Google Research)
Nan Jiang (University of Illinois at Urbana-Champaign)
Satinder Singh (University of Michigan)
Related Events (a corresponding poster, oral, or spotlight)
-
2017 Poster: Repeated Inverse Reinforcement Learning »
Thu. Dec 7th 02:30 -- 06:30 AM Room Pacific Ballroom #199
More from the Same Authors
-
2021 : Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning »
Cameron Voloshin · Hoang Le · Nan Jiang · Yisong Yue -
2022 Poster: Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret »
Jiawei Huang · Li Zhao · Tao Qin · Wei Chen · Nan Jiang · Tie-Yan Liu -
2022 : Trajectory-based Explainability Framework for Offline RL »
Shripad Deshmukh · Arpan Dasgupta · Chirag Agarwal · Nan Jiang · Balaji Krishnamurthy · Georgios Theocharous · Jayakumar Subramanian -
2022 : AMORE: A Model-based Framework for Improving Arbitrary Baseline Policies with Offline Data »
Tengyang Xie · Mohak Bhardwaj · Nan Jiang · Ching-An Cheng -
2022 Spotlight: Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret »
Jiawei Huang · Li Zhao · Tao Qin · Wei Chen · Nan Jiang · Tie-Yan Liu -
2022 Spotlight: Lightning Talks 4A-1 »
Jiawei Huang · Su Jia · Abdurakhmon Sadiev · Ruomin Huang · Yuanyu Wan · Denizalp Goktas · Jiechao Guan · Andrew Li · Wei-Wei Tu · Li Zhao · Amy Greenwald · Jiawei Huang · Dmitry Kovalev · Yong Liu · Wenjie Liu · Peter Richtarik · Lijun Zhang · Zhiwu Lu · R Ravi · Tao Qin · Wei Chen · Hu Ding · Nan Jiang · Tie-Yan Liu -
2022 Poster: Beyond the Return: Off-policy Function Estimation under User-specified Error-measuring Distributions »
Audrey Huang · Nan Jiang -
2022 Poster: Interaction-Grounded Learning with Action-Inclusive Feedback »
Tengyang Xie · Akanksha Saran · Dylan J Foster · Lekan Molu · Ida Momennejad · Nan Jiang · Paul Mineiro · John Langford -
2022 Poster: A Few Expert Queries Suffices for Sample-Efficient RL with Resets and Linear Value Approximation »
Philip Amortila · Nan Jiang · Dhruv Madeka · Dean Foster -
2022 Poster: On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL »
Jinglin Chen · Aditya Modi · Akshay Krishnamurthy · Nan Jiang · Alekh Agarwal -
2021 : Retrospective Panel »
Sergey Levine · Nando de Freitas · Emma Brunskill · Finale Doshi-Velez · Nan Jiang · Rishabh Agarwal -
2021 Workshop: Offline Reinforcement Learning »
Rishabh Agarwal · Aviral Kumar · George Tucker · Justin Fu · Nan Jiang · Doina Precup · Aviral Kumar -
2021 Poster: Towards Hyperparameter-free Policy Selection for Offline Reinforcement Learning »
Siyuan Zhang · Nan Jiang -
2021 Poster: Bellman-consistent Pessimism for Offline Reinforcement Learning »
Tengyang Xie · Ching-An Cheng · Nan Jiang · Paul Mineiro · Alekh Agarwal -
2021 Poster: Learning with Labeling Induced Abstentions »
Kareem Amin · Giulia DeSalvo · Afshin Rostamizadeh -
2021 Poster: Learning with User-Level Privacy »
Daniel Levy · Ziteng Sun · Kareem Amin · Satyen Kale · Alex Kulesza · Mehryar Mohri · Ananda Theertha Suresh -
2021 Oral: Bellman-consistent Pessimism for Offline Reinforcement Learning »
Tengyang Xie · Ching-An Cheng · Nan Jiang · Paul Mineiro · Alekh Agarwal -
2021 Poster: Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning »
Tengyang Xie · Nan Jiang · Huan Wang · Caiming Xiong · Yu Bai -
2020 : Towards Reliable Validation and Evaluation for Offline RL »
Nan Jiang -
2020 : Panel »
Emma Brunskill · Nan Jiang · Nando de Freitas · Finale Doshi-Velez · Sergey Levine · John Langford · Lihong Li · George Tucker · Rishabh Agarwal · Aviral Kumar -
2020 Workshop: Deep Reinforcement Learning »
Pieter Abbeel · Chelsea Finn · Joelle Pineau · David Silver · Satinder Singh · Coline Devin · Misha Laskin · Kimin Lee · Janarthanan Rajendran · Vivek Veeriah -
2020 Poster: Minimax Value Interval for Off-Policy Evaluation and Policy Optimization »
Nan Jiang · Jiawei Huang -
2019 : Poster and Coffee Break 2 »
Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall -
2019 : Pan-Private Uniformity Testing »
Kareem Amin · Matthew Joseph -
2019 Workshop: Deep Reinforcement Learning »
Pieter Abbeel · Chelsea Finn · Joelle Pineau · David Silver · Satinder Singh · Joshua Achiam · Carlos Florensa · Christopher Grimm · Haoran Tang · Vivek Veeriah -
2019 Poster: Differentially Private Covariance Estimation »
Kareem Amin · Travis Dick · Alex Kulesza · Andres Munoz Medina · Sergei Vassilvitskii -
2019 Poster: Discovery of Useful Questions as Auxiliary Tasks »
Vivek Veeriah · Matteo Hessel · Zhongwen Xu · Janarthanan Rajendran · Richard L Lewis · Junhyuk Oh · Hado van Hasselt · David Silver · Satinder Singh -
2019 Poster: No-Press Diplomacy: Modeling Multi-Agent Gameplay »
Philip Paquette · Yuchen Lu · SETON STEVEN BOCCO · Max Smith · Satya O.-G. · Jonathan K. Kummerfeld · Joelle Pineau · Satinder Singh · Aaron Courville -
2019 Poster: Provably Efficient Q-Learning with Low Switching Cost »
Yu Bai · Tengyang Xie · Nan Jiang · Yu-Xiang Wang -
2018 Workshop: Deep Reinforcement Learning »
Pieter Abbeel · David Silver · Satinder Singh · Joelle Pineau · Joshua Achiam · Rein Houthooft · Aravind Srinivas -
2018 Poster: On Learning Intrinsic Rewards for Policy Gradient Methods »
Zeyu Zheng · Junhyuk Oh · Satinder Singh -
2018 Poster: On Oracle-Efficient PAC RL with Rich Observations »
Christoph Dann · Nan Jiang · Akshay Krishnamurthy · Alekh Agarwal · John Langford · Robert Schapire -
2018 Spotlight: On Oracle-Efficient PAC RL with Rich Observations »
Christoph Dann · Nan Jiang · Akshay Krishnamurthy · Alekh Agarwal · John Langford · Robert Schapire -
2018 Poster: Completing State Representations using Spectral Learning »
Nan Jiang · Alex Kulesza · Satinder Singh -
2017 : Afternoon Panel discussion »
Brian Skyrms · Satinder Singh · Jacob Andreas -
2017 : "Language Emergence as Boundedly Optimal Control" »
Satinder Singh -
2017 : Minimax-Regret Querying on Side Effects in Factored Markov Decision Processes »
Satinder Singh -
2017 : Invited Talk - Satindar Singh »
Satinder Singh -
2017 Symposium: Deep Reinforcement Learning »
Pieter Abbeel · Yan Duan · David Silver · Satinder Singh · Junhyuk Oh · Rein Houthooft -
2017 Poster: Value Prediction Network »
Junhyuk Oh · Satinder Singh · Honglak Lee -
2016 Workshop: Deep Reinforcement Learning »
David Silver · Satinder Singh · Pieter Abbeel · Peter Chen -
2015 Workshop: Deep Reinforcement Learning »
Pieter Abbeel · John Schulman · Satinder Singh · David Silver -
2015 Poster: Action-Conditional Video Prediction using Deep Networks in Atari Games »
Junhyuk Oh · Xiaoxiao Guo · Honglak Lee · Richard L Lewis · Satinder Singh -
2015 Spotlight: Action-Conditional Video Prediction using Deep Networks in Atari Games »
Junhyuk Oh · Xiaoxiao Guo · Honglak Lee · Richard L Lewis · Satinder Singh -
2014 Workshop: From Bad Models to Good Policies (Sequential Decision Making under Uncertainty) »
Odalric-Ambrym Maillard · Timothy A Mann · Shie Mannor · Jeremie Mary · Laurent Orseau · Thomas Dietterich · Ronald Ortner · Peter Grünwald · Joelle Pineau · Raphael Fonteneau · Georgios Theocharous · Esteban D Arcaute · Christos Dimitrakakis · Nan Jiang · Doina Precup · Pierre-Luc Bacon · Marek Petrik · Aviv Tamar -
2014 Poster: Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning »
Xiaoxiao Guo · Satinder Singh · Honglak Lee · Richard L Lewis · Xiaoshi Wang -
2013 Poster: Reward Mapping for Transfer in Long-Lived Agents »
Xiaoxiao Guo · Satinder Singh · Richard L Lewis -
2013 Session: Oral Session 9 »
Satinder Singh -
2010 Poster: Reward Design via Online Gradient Ascent »
Jonathan D Sorg · Satinder Singh · Richard L Lewis -
2008 Poster: Simple Local Models for Complex Dynamical Systems »
Erik Talvitie · Satinder Singh -
2008 Oral: Simple Local Models for Complex Dynamical Systems »
Erik Talvitie · Satinder Singh -
2007 Oral: Exponential Family Predictive Representations of State »
David Wingate · Satinder Singh -
2007 Poster: Exponential Family Predictive Representations of State »
David Wingate · Satinder Singh