
Workshop: Workshop on Machine Learning Safety

Spectral Robustness Analysis of Deep Imitation Learning

Ezgi Korkmaz


Deep reinforcement learning algorithms have enabled learning functioning policies in MDPs with complex state representations. Following these advancements, deep reinforcement learning policies have been deployed in many diverse settings. However, a line of research has argued that in certain settings building a reward function can be more complicated than learning it. Hence, several studies have proposed methods to learn a reward function by observing trajectories of a functioning policy (i.e., inverse reinforcement learning). Following this line of research, several studies have proposed to directly learn a functioning policy solely by observing trajectories of an expert (i.e., imitation learning). In this paper, we propose a novel method to analyze the spectral robustness of deep neural policies. We conduct several experiments in the Arcade Learning Environment and demonstrate that simple vanilla-trained deep reinforcement learning policies are more robust than deep inverse reinforcement learning policies. We believe that our method provides a comprehensive analysis of policy robustness and can help in understanding the fundamental properties of different training techniques.
