Workshop: Memory in Artificial and Real Intelligence (MemARI)

Leveraging Episodic Memory to Improve World Models for Reinforcement Learning

Julian Coda-Forno · Changmin Yu · Qinghai Guo · Zafeirios Fountas · Neil Burgess

Keywords: [ World Models ] [ Reinforcement Learning ] [ Episodic Memory ]


Poor sample efficiency plagues the practical applicability of deep reinforcement learning (RL) algorithms, especially compared to biological intelligence. To close the gap, previous work has proposed augmenting the RL framework with an analogue of biological episodic memory, leading to the emerging field of "episodic control". Episodic memory refers to the ability to recollect individual events independently of the slower process of learning accumulated statistics, and evidence suggests that humans can use episodic memory for planning. Existing attempts to integrate episodic memory components into RL agents have mostly focused on the model-free domain, leaving scope for investigating their role in model-based settings. Here we propose the Episodic Memory Module (EMM), which aids the learning of world-model transitions rather than the value functions targeted by standard episodic RL. The EMM stores latent state transitions that have high prediction error under the model as memories, and uses linearly interpolated memories when the model shows high epistemic uncertainty. Memories are dynamically forgotten with a timescale reflecting their continuing surprise and uncertainty. Implemented in combination with existing world-model agents, the EMM produces a significant boost in performance over baseline agents on complex Atari games such as Montezuma's Revenge. Our results indicate that the EMM can temporarily fill in gaps while a world model is being learned, giving significant advantages in complex environments where such learning is slow.
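
The write, recall, and forgetting behaviour described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the thresholds, the interpolation scheme, or the forgetting rule, so everything named here is an assumption made for illustration. In particular, the class and method names, the fixed thresholds, the inverse-distance weighting over the k nearest memories, and the geometric decay of a "strength" value initialised to the prediction error are all hypothetical choices.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    key: list          # latent state (and action) features for the transition
    next_state: list   # next latent state that was actually observed
    strength: float    # assumed forgetting variable; starts at the prediction error


class EpisodicMemoryModule:
    """Illustrative sketch only; all hyperparameters are hypothetical."""

    def __init__(self, error_threshold=0.5, uncertainty_threshold=0.5,
                 decay=0.9, min_strength=0.05, k=2):
        self.error_threshold = error_threshold
        self.uncertainty_threshold = uncertainty_threshold
        self.decay = decay
        self.min_strength = min_strength
        self.k = k
        self.memories = []

    def write(self, key, next_state, prediction_error):
        # Store only transitions the world model predicted poorly.
        if prediction_error > self.error_threshold:
            self.memories.append(
                Memory(list(key), list(next_state), prediction_error))
            return True
        return False

    def step(self):
        # Assumed forgetting rule: strength decays geometrically, so more
        # surprising memories survive for longer before being dropped.
        for m in self.memories:
            m.strength *= self.decay
        self.memories = [m for m in self.memories
                         if m.strength >= self.min_strength]

    def recall(self, key):
        # Linear interpolation of the k nearest stored next-states,
        # weighted by inverse Euclidean distance to the query key.
        if not self.memories:
            return None
        nearest = sorted(self.memories,
                         key=lambda m: math.dist(m.key, key))[:self.k]
        weights = [1.0 / (math.dist(m.key, key) + 1e-8) for m in nearest]
        total = sum(weights)
        dim = len(nearest[0].next_state)
        return [sum(w * m.next_state[i] for w, m in zip(weights, nearest)) / total
                for i in range(dim)]

    def predict(self, key, model_prediction, uncertainty):
        # Fall back on memory only when the model is uncertain and
        # at least one memory is available; otherwise trust the model.
        if uncertainty > self.uncertainty_threshold:
            recalled = self.recall(key)
            if recalled is not None:
                return recalled
        return list(model_prediction)
```

In this sketch, `predict` would be called wherever the world model produces a next-latent-state estimate, with `uncertainty` supplied by whatever epistemic-uncertainty estimate the agent has available (for example, ensemble disagreement, which is itself an assumption here). Once the model's own predictions improve, fewer transitions exceed the write threshold and existing memories decay away, which mirrors the "temporary gap-filling" role described in the abstract.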
