Inverse Reinforcement Learning (IRL) is an effective approach for recovering a reward function that explains an expert's behavior from a set of observed demonstrations. This paper presents a novel model-free IRL approach that, unlike most existing IRL algorithms, does not require specifying a function space in which to search for the expert's reward function. Leveraging the fact that the policy gradient must be zero for any optimal policy, the algorithm generates a set of basis functions that span the subspace of reward functions that make the policy gradient vanish. Within this subspace, using a second-order criterion, we search for the reward function that most penalizes deviations from the expert's policy. After introducing our approach for finite domains, we extend it to continuous ones. The proposed approach is empirically compared to other IRL methods in the (finite) Taxi domain and in the (continuous) Linear Quadratic Gaussian (LQG) and Car on the Hill environments.
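The first-order step described above can be illustrated with a minimal sketch for a finite domain. It is not the authors' implementation. It relies only on the policy gradient theorem, under which the gradient is linear in the action-value function: it equals the sum over state-action pairs of occupancy times the score (the gradient of log pi) times Q. The function name `null_space_basis`, the array names, and the random placeholder data are all assumptions made for illustration. The sketch recovers a basis of action-value functions with vanishing gradient; mapping these to reward functions and applying the second-order criterion are left out.

```python
import numpy as np

def null_space_basis(grad_log_pi, occupancy, tol=1e-10):
    """Basis of action-value vectors q for which the policy gradient is zero.

    grad_log_pi: (n_sa, n_params) array, one score vector per state-action pair.
    occupancy:   (n_sa,) array of state-action occupancy weights.
    Hypothetical helper, not taken from the paper.
    """
    # Policy gradient as a linear map of q: A @ q, with A = grad_log_pi^T diag(occupancy).
    A = grad_log_pi.T * occupancy
    # Right singular vectors beyond the numerical rank span the null space of A.
    _, s, vt = np.linalg.svd(A, full_matrices=True)
    rank = int(np.sum(s > tol * s.max())) if s.size else 0
    return vt[rank:].T  # columns are basis vectors, shape (n_sa, n_sa - rank)

# Placeholder data (assumption): 6 state-action pairs, 2 policy parameters.
rng = np.random.default_rng(0)
n_sa, n_params = 6, 2
grad_log_pi = rng.normal(size=(n_sa, n_params))
occupancy = rng.random(n_sa) + 0.1
occupancy /= occupancy.sum()

basis = null_space_basis(grad_log_pi, occupancy)
# Every basis column should yield a (numerically) zero policy gradient.
gradient = (grad_log_pi.T * occupancy) @ basis
```

In this toy setting the subspace has dimension `n_sa - n_params` whenever the weighted score matrix has full rank, which is why some further criterion is needed to pick a single reward within it.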
Author Information
Alberto Maria Metelli (Politecnico di Milano)
Matteo Pirotta (Facebook AI Research)
Marcello Restelli (Politecnico di Milano)
More from the Same Authors
-
2021 Spotlight: Subgaussian and Differentiable Importance Sampling for Off-Policy Evaluation and Learning »
Alberto Maria Metelli · Alessio Russo · Marcello Restelli -
2021 : Policy Optimization via Optimal Policy Evaluation »
Alberto Maria Metelli · Samuele Meta · Marcello Restelli -
2022 : Multi-Armed Bandit Problem with Temporally-Partitioned Rewards »
Giulia Romano · Andrea Agostini · Francesco Trovò · Nicola Gatti · Marcello Restelli -
2022 : Provably Efficient Causal Model-Based Reinforcement Learning for Environment-Agnostic Generalization »
Mirco Mutti · Riccardo De Santi · Emanuele Rossi · Juan Calderon · Michael Bronstein · Marcello Restelli -
2023 Poster: Distributional Policy Evaluation: a Maximum Entropy approach to Representation Learning »
Riccardo Zamboni · Alberto Maria Metelli · Marcello Restelli -
2023 Poster: Truncating Trajectories in Monte Carlo Policy Evaluation: an Adaptive Approach »
Riccardo Poiani · Nicole Nobili · Alberto Maria Metelli · Marcello Restelli -
2022 Poster: Multi-Fidelity Best-Arm Identification »
Riccardo Poiani · Alberto Maria Metelli · Marcello Restelli -
2022 Poster: Challenging Common Assumptions in Convex Reinforcement Learning »
Mirco Mutti · Riccardo De Santi · Piersilvio De Bartolomeis · Marcello Restelli -
2022 Poster: Off-Policy Evaluation with Deficient Support Using Side Information »
Nicolò Felicioni · Maurizio Ferrari Dacrema · Marcello Restelli · Paolo Cremonesi -
2021 Poster: Learning in Non-Cooperative Configurable Markov Decision Processes »
Giorgia Ramponi · Alberto Maria Metelli · Alessandro Concetti · Marcello Restelli -
2021 Poster: Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection »
Matteo Papini · Andrea Tirinzoni · Aldo Pacchiano · Marcello Restelli · Alessandro Lazaric · Matteo Pirotta -
2021 Poster: Subgaussian and Differentiable Importance Sampling for Off-Policy Evaluation and Learning »
Alberto Maria Metelli · Alessio Russo · Marcello Restelli -
2020 Poster: An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits »
Andrea Tirinzoni · Matteo Pirotta · Marcello Restelli · Alessandro Lazaric -
2020 Poster: Inverse Reinforcement Learning from a Gradient-based Learner »
Giorgia Ramponi · Gianluca Drappo · Marcello Restelli -
2020 Session: Orals & Spotlights Track 31: Reinforcement Learning »
Dotan Di Castro · Marcello Restelli -
2019 Poster: Propagating Uncertainty in Reinforcement Learning via Wasserstein Barycenters »
Alberto Maria Metelli · Amarildo Likmeta · Marcello Restelli -
2018 Poster: Policy Optimization via Importance Sampling »
Alberto Maria Metelli · Matteo Papini · Francesco Faccio · Marcello Restelli -
2018 Poster: Transfer of Value Functions via Variational Methods »
Andrea Tirinzoni · Rafael Rodriguez Sanchez · Marcello Restelli -
2018 Oral: Policy Optimization via Importance Sampling »
Alberto Maria Metelli · Matteo Papini · Francesco Faccio · Marcello Restelli -
2018 Poster: Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes »
Ronan Fruit · Matteo Pirotta · Alessandro Lazaric -
2018 Spotlight: Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes »
Ronan Fruit · Matteo Pirotta · Alessandro Lazaric -
2017 Poster: Regret Minimization in MDPs with Options without Prior Knowledge »
Ronan Fruit · Matteo Pirotta · Alessandro Lazaric · Emma Brunskill -
2017 Poster: Adaptive Batch Size for Safe Policy Gradients »
Matteo Papini · Matteo Pirotta · Marcello Restelli -
2017 Spotlight: Regret Minimization in MDPs with Options without Prior Knowledge »
Ronan Fruit · Matteo Pirotta · Alessandro Lazaric · Emma Brunskill -
2014 Poster: Sparse Multi-Task Reinforcement Learning »
Daniele Calandriello · Alessandro Lazaric · Marcello Restelli -
2013 Poster: Adaptive Step-Size for Policy Gradient Methods »
Matteo Pirotta · Marcello Restelli · Luca Bascetta -
2011 Poster: Transfer from Multiple MDPs »
Alessandro Lazaric · Marcello Restelli -
2007 Spotlight: Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods »
Alessandro Lazaric · Marcello Restelli · Andrea Bonarini -
2007 Poster: Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods »
Alessandro Lazaric · Marcello Restelli · Andrea Bonarini