Deconfounded Imitation Learning
Risto Vuorio · Pim de Haan · Johann Brehmer · Hanno Ackermann · Daniel Dijkman · Taco Cohen
Event URL: https://openreview.net/forum?id=hgNn3n5pRKC

Standard imitation learning can fail when the expert demonstrators have different sensory inputs than the imitating agent. This partial observability gives rise to hidden confounders in the causal graph, which cause imitation to fail. We break down the space of confounded imitation learning problems and identify three settings with different data requirements in which the correct imitation policy can be identified. We then introduce an algorithm for deconfounded imitation learning, which trains an inference model jointly with a latent-conditional policy. At test time, the agent alternates between updating its belief over the latent and acting under that belief. We show in theory and practice that this algorithm converges to the correct interventional policy, solves the confounding issue, and can, under certain assumptions, achieve asymptotically optimal imitation performance.
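
The test-time loop described in the abstract (update the belief over the latent, then act under that belief) can be illustrated with a small toy sketch. This is not the authors' implementation: the two-valued latent, the reward and policy tables, and the names `update_belief`, `act_under_belief`, and `run_episode` are all hypothetical, and an exact Bayes update stands in for the learned inference model.

```python
import numpy as np

# Hypothetical toy problem: a hidden latent z in {0, 1} decides which of two
# actions tends to be rewarded. All numbers below are illustrative assumptions.
REWARD_PROB = np.array([[0.8, 0.2],   # REWARD_PROB[z, a] = P(reward = 1 | z, a)
                        [0.2, 0.8]])
POLICY_PROB = np.array([[0.9, 0.1],   # POLICY_PROB[z, a] = pi(a | z)
                        [0.1, 0.9]])


def update_belief(belief, action, reward):
    """Exact Bayes update of the belief over z after seeing (action, reward).

    Assumption: this replaces the learned inference model from the abstract.
    """
    p_reward = REWARD_PROB[:, action]
    likelihood = p_reward if reward else 1.0 - p_reward
    posterior = belief * likelihood
    return posterior / posterior.sum()


def act_under_belief(belief, rng):
    """Sample an action from the latent-conditional policy averaged over the belief."""
    action_probs = belief @ POLICY_PROB
    return int(rng.choice(len(action_probs), p=action_probs))


def run_episode(true_latent, n_steps=50, seed=0):
    """Alternate between acting under the belief and updating the belief."""
    rng = np.random.default_rng(seed)
    belief = np.array([0.5, 0.5])  # uniform prior over the latent
    for _ in range(n_steps):
        action = act_under_belief(belief, rng)
        reward = rng.random() < REWARD_PROB[true_latent, action]
        belief = update_belief(belief, action, reward)
    return belief


if __name__ == "__main__":
    print(run_episode(true_latent=1))
```

In this toy setup both actions are informative about the latent, so the belief is expected to concentrate on the true latent as evidence accumulates, and the marginalized policy then approaches the policy conditioned on that latent.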

Author Information

Risto Vuorio (University of Oxford)

I'm a PhD student in WhiRL at University of Oxford. I'm interested in reinforcement learning and meta-learning.

Pim de Haan (University of Amsterdam, Qualcomm AI Research)
Johann Brehmer (Qualcomm AI Research)
Hanno Ackermann (Qualcomm Inc)
Daniel Dijkman (University of Amsterdam)
Taco Cohen (Qualcomm AI Research)

Taco Cohen is a machine learning research scientist at Qualcomm AI Research in Amsterdam and a PhD student at the University of Amsterdam, supervised by Prof. Max Welling. He was a co-founder of Scyfer, a company focussed on active deep learning, acquired by Qualcomm in 2017. He holds a BSc in theoretical computer science from Utrecht University and an MSc in artificial intelligence from the University of Amsterdam (both cum laude). His research is focussed on understanding and improving deep representation learning, in particular learning of equivariant and disentangled representations, data-efficient deep learning, learning on non-Euclidean domains, and applications of group representation theory and non-commutative harmonic analysis, as well as deep-learning-based source compression. He has done internships at Google DeepMind (working with Geoff Hinton) and OpenAI. He received the 2014 University of Amsterdam thesis prize, a Google PhD Fellowship, and the ICLR 2018 best paper award for “Spherical CNNs”, and was named one of 35 innovators under 35 in Europe by MIT in 2018.
