
On the Practical Consistency of Meta-Reinforcement Learning Algorithms
Zheng Xiong · Luisa Zintgraf · Jacob Beck · Risto Vuorio · Shimon Whiteson
Event URL: https://openreview.net/forum?id=xwQgKphwhFA

Consistency, the theoretical property of a meta-learning algorithm being able to adapt to any task at test time under its default settings (and various assumptions), is frequently cited as desirable in the literature. An open question is whether and how theoretical consistency translates into practice, compared to inconsistent algorithms. In this paper, we empirically investigate this question on a set of representative meta-RL algorithms. We find that theoretically consistent algorithms can indeed usually adapt to out-of-distribution (OOD) tasks, while inconsistent ones cannot, although consistent algorithms can still fail in practice for reasons such as poor exploration. We further find that theoretically inconsistent algorithms can be made consistent by continuing to train on the OOD tasks, after which they adapt as well as or better than consistent ones. We conclude that theoretical consistency is indeed a desirable property, albeit not as advantageous in practice as often assumed.
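The distinction the abstract draws can be illustrated with a toy sketch (not from the paper, and not meta-RL): a "consistent" adapter keeps taking gradient steps on the new task's loss at test time, so it converges on any task, while an "inconsistent" adapter applies a fixed update rule tuned to the training distribution and can fail on out-of-distribution tasks. All task definitions and step sizes here are made up for illustration.

```python
# Toy illustration of consistency: a task is a 1-D target t, with loss (theta - t)^2.

def loss(theta, target):
    return (theta - target) ** 2

def consistent_adapt(target, theta=0.0, lr=0.1, steps=200):
    # "Consistent": continue gradient descent on the test task's own loss,
    # so adaptation works for any target, in- or out-of-distribution.
    for _ in range(steps):
        grad = 2 * (theta - target)
        theta -= lr * grad
    return theta

def inconsistent_adapt(target, theta=0.0, fixed_step=1.0):
    # "Inconsistent": a fixed update tuned to in-distribution tasks
    # (targets near +1); it ignores the task actually seen at test time.
    return theta + fixed_step

in_dist, ood = 1.0, 5.0
print(loss(consistent_adapt(in_dist), in_dist))    # ~0: adapts in-distribution
print(loss(consistent_adapt(ood), ood))            # ~0: still adapts OOD
print(loss(inconsistent_adapt(in_dist), in_dist))  # 0.0: fine in-distribution
print(loss(inconsistent_adapt(ood), ood))          # 16.0: fails OOD
```

Continuing to train the "inconsistent" adapter on the OOD task (i.e., switching it to gradient updates at test time) recovers consistency, which mirrors the paper's finding that inconsistent algorithms can be made consistent by further training on OOD tasks.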

Author Information

Zheng Xiong (University of Oxford)
Luisa Zintgraf (University of Oxford)
Jacob Beck (Brown University)
Risto Vuorio (University of Oxford)

I'm a PhD student in WhiRL at University of Oxford. I'm interested in reinforcement learning and meta-learning.

Shimon Whiteson (University of Oxford)
