How can artificial agents learn non-reinforced preferences to continuously adapt their behaviour to a changing environment? We decompose this question into two challenges: (i) encoding diverse memories and (ii) selectively attending to these for preference formation. Our proposed non-reinforced preference learning mechanism using selective attention, Nore, addresses both by leveraging the agent’s world model to collect a diverse set of experiences, which are interleaved with imagined roll-outs to encode memories. These memories are selectively attended to, using attention and gating blocks, to update the agent’s preferences. We validate Nore in a modified OpenAI Gym FrozenLake environment (without any external signal), with and without volatility, under a fixed model of the environment, and compare its behaviour to Pepper, a Hebbian preference learning mechanism. We demonstrate that Nore provides a straightforward framework to induce exploratory preferences in the absence of an external signal.
Author Information
Noor Sajid (University College London)
Panagiotis Tigas (University of Oxford)
Zafeirios Fountas (Huawei Technologies)
Qinghai Guo (Huawei Technologies)
Alexey Zakharov (Huawei Technologies)
Lancelot Da Costa (Imperial College London)
More from the Same Authors
- 2020 : Spatial Assembly: Generative Architecture With Reinforcement Learning, Self Play and Tree Search
  Panagiotis Tigas
- 2020 : Paper 40: Real2sim: Automatic Generation of Open Street Map Towns For Autonomous Driving Benchmarks
  Panagiotis Tigas · Yarin Gal
- 2021 : Shifts: A Dataset of Real Distributional Shift Across Multiple Large-Scale Tasks
  Andrey Malinin · Neil Band · Yarin Gal · Mark Gales · Alexander Ganshin · German Chesnokov · Alexey Noskov · Andrey Ploskonosov · Liudmila Prokhorenkova · Ivan Provilkov · Vatsal Raina · Vyas Raina · Denis Roginskiy · Mariya Shmatova · Panagiotis Tigas · Boris Yangel
- 2022 : Leveraging Episodic Memory to Improve World Models for Reinforcement Learning
  Julian Coda-Forno · Changmin Yu · Qinghai Guo · Zafeirios Fountas · Neil Burgess
- 2022 : Panel Discussion: The Bandwagon Revisited
  Michael Woodford · Noor Sajid · Chris Sims · Jessica Flack · Ryan Cotterell
- 2022 : Information-based exploration under active inference
  Noor Sajid
- 2022 Poster: Interventions, Where and How? Experimental Design for Causal Models at Scale
  Panagiotis Tigas · Yashas Annadani · Andrew Jesson · Bernhard Schölkopf · Yarin Gal · Stefan Bauer
- 2021 : Shifts Challenge: Robustness and Uncertainty under Real-World Distributional Shift + Q&A
  Andrey Malinin · Neil Band · German Chesnokov · Yarin Gal · Alexander Ganshin · Mark Gales · Alexey Noskov · Liudmila Prokhorenkova · Mariya Shmatova · Vyas Raina · Vatsal Raina · Panagiotis Tigas · Boris Yangel
- 2021 Poster: Causal-BALD: Deep Bayesian Active Learning of Outcomes to Infer Treatment-Effects from Observational Data
  Andrew Jesson · Panagiotis Tigas · Joost van Amersfoort · Andreas Kirsch · Uri Shalit · Yarin Gal
- 2020 Poster: Deep active inference agents using Monte-Carlo methods
  Zafeirios Fountas · Noor Sajid · Pedro Mediano · Karl Friston