Diffusion models have emerged as powerful generative models in the text-to-image domain. This paper studies their application as observation-to-action models for imitating human behaviour in sequential environments. Human behaviour is stochastic and multimodal, with structured correlations between action dimensions. Meanwhile, standard modelling choices in behaviour cloning are limited in their expressiveness and may introduce bias into the cloned policy. We begin by pointing out the limitations of these choices. We then propose that diffusion models are an excellent fit for imitating human behaviour, since they learn an expressive distribution over the joint action space. We introduce several innovations to make diffusion models suitable for sequential environments: designing suitable architectures, investigating the role of guidance, and developing reliable sampling strategies. Experimentally, diffusion models closely match human demonstrations in a simulated robotic control task and a modern 3D gaming environment.
Author Information
Tim Pearce (Microsoft Research)
Tabish Rashid (University of Oxford)
Anssi Kanervisto (Microsoft Research)
David Bignell (Research, Microsoft)
Mingfei Sun (Microsoft Research)
Raluca Georgescu (Microsoft)
Sergio Valcarcel Macua (Microsoft Research)
Shan Zheng Tan (Research, Microsoft)
Ida Momennejad (Microsoft Research)
Katja Hofmann (Microsoft Research)
Dr. Katja Hofmann is a Principal Researcher at the [Game Intelligence](http://aka.ms/gameintelligence/) group at [Microsoft Research Cambridge, UK](https://www.microsoft.com/en-us/research/lab/microsoft-research-cambridge/). There, she leads a research team that focuses on reinforcement learning with applications in modern video games. She and her team strongly believe that modern video games will drive a transformation of how we interact with AI technology. One of the projects developed by her team is [Project Malmo](https://www.microsoft.com/en-us/research/project/project-malmo/), which uses the popular game Minecraft as an experimentation platform for developing intelligent technology. Katja's long-term goal is to develop AI systems that learn to collaborate with people, to empower their users and help solve complex real-world problems. Before joining Microsoft Research, Katja completed her PhD in Computer Science as part of the [ILPS](https://ilps.science.uva.nl/) group at the [University of Amsterdam](https://www.uva.nl/en). She worked with Maarten de Rijke and Shimon Whiteson on interactive machine learning algorithms for search engines.
Sam Devlin (Microsoft Research)
More from the Same Authors
-
2021 : General Characterization of Agents by States they Visit »
Anssi Kanervisto · Ville Hautamäki -
2021 : Coalitional Bargaining via Reinforcement Learning: An Application to Collaborative Vehicle Routing »
Stephen Mak · Liming Xu · Tim Pearce · Michael Ostroumov · Alexandra Brintrup -
2021 : Counter-Strike Deathmatch with Large-Scale Behavioural Cloning »
Tim Pearce · Jun Zhu -
2022 : Fifteen-minute Competition Overview Video »
Byron Galbraith · Anssi Kanervisto · Steven Wang · Stephanie Milani · Sharada Mohanty · Rohin Shah · Karolis Ramanauskas · Brandon Houghton -
2022 : Contextual Squeeze-and-Excitation »
Massimiliano Patacchiola · John Bronskill · Aliaksandra Shysheya · Katja Hofmann · Sebastian Nowozin · Richard Turner -
2022 : Replay Buffer With Local Forgetting for Adaptive Deep Model-Based Reinforcement Learning »
Ali Rahimi-Kalahroudi · Janarthanan Rajendran · Ida Momennejad · Harm Van Seijen · Sarath Chandar -
2023 Poster: Evaluating Cognitive Maps in Large Language Models: No Emergent Planning »
Ida Momennejad · Felipe Vieira Frujeri · Hosein Hasanbeig · Hamid Palangi · Nebojsa Jojic · Robert Ness · Jonathan Larson -
2023 Poster: BEDD: The MineRL BASALT Evaluation and Demonstrations Dataset for Training and Benchmarking Agents that Solve Fuzzy Tasks »
Stephanie Milani · Anssi Kanervisto · Karolis Ramanauskas · Sander Schulhoff · Brandon Houghton · Rohin Shah -
2023 Poster: SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning »
Benjamin Ellis · Jonathan Cook · Skander Moalla · Mikayel Samvelyan · Mingfei Sun · Anuj Mahajan · Jakob Foerster · Shimon Whiteson -
2023 Oral: BEDD: The MineRL BASALT Evaluation and Demonstrations Dataset for Training and Benchmarking Agents that Solve Fuzzy Tasks »
Stephanie Milani · Anssi Kanervisto · Karolis Ramanauskas · Sander Schulhoff · Brandon Houghton · Rohin Shah -
2022 Panel: Panel 3B-1: Censored Quantile Regression… & Deconfounded Representation Similarity… »
Tim Pearce · Tianyu Cui -
2022 Competition: The MineRL BASALT Competition on Fine-tuning from Human Feedback »
Anssi Kanervisto · Stephanie Milani · Karolis Ramanauskas · Byron Galbraith · Steven Wang · Brandon Houghton · Sharada Mohanty · Rohin Shah -
2022 : Panel Discussion: Opportunities and Challenges »
Kenneth Norman · Janice Chen · Samuel J Gershman · Albert Gu · Sepp Hochreiter · Ida Momennejad · Hava Siegelmann · Sainbayar Sukhbaatar -
2022 : Ida Momennejad: "Neuro-inspired Memory in Reinforcement Learning: State of the art, Challenges, and Opportunities" »
Ida Momennejad -
2022 : Attention in Task-sets, Planning, and the Prefrontal Cortex »
Ida Momennejad -
2022 Poster: Interaction-Grounded Learning with Action-Inclusive Feedback »
Tengyang Xie · Akanksha Saran · Dylan J Foster · Lekan Molu · Ida Momennejad · Nan Jiang · Paul Mineiro · John Langford -
2022 Poster: Uni[MASK]: Unified Inference in Sequential Decision Problems »
Micah Carroll · Orr Paradise · Jessy Lin · Raluca Georgescu · Mingfei Sun · David Bignell · Stephanie Milani · Katja Hofmann · Matthew Hausknecht · Anca Dragan · Sam Devlin -
2022 Poster: Contextual Squeeze-and-Excitation for Efficient Few-Shot Image Classification »
Massimiliano Patacchiola · John Bronskill · Aliaksandra Shysheya · Katja Hofmann · Sebastian Nowozin · Richard Turner -
2022 Poster: Censored Quantile Regression Neural Networks for Distribution-Free Survival Analysis »
Tim Pearce · Jong-Hyeon Jeong · Yichen Jia · Jun Zhu -
2021 : Towards RL applications in video games and with human users »
Katja Hofmann -
2021 : Methods: Understanding Human-like Behavior in Video Game Navigation »
Evelyn Zuniga · Stephanie Milani · Katja Hofmann -
2021 : IGLU: Interactive Grounded Language Understanding in a Collaborative Environment + Q&A »
Julia Kiseleva · Ziming Li · Mohammad Aliannejadi · Maartje Anne ter Hoeve · Mikhail Burtsev · Alexey Skrynnik · Artem Zholus · Aleksandr Panov · Katja Hofmann · Kavya Srinet · Arthur Szlam · Michel Galley · Ahmed Awadallah -
2021 Poster: FACMAC: Factored Multi-Agent Centralised Policy Gradients »
Bei Peng · Tabish Rashid · Christian Schroeder de Witt · Pierre-Alexandre Kamienny · Philip Torr · Wendelin Boehmer · Shimon Whiteson -
2021 Poster: Regularized Softmax Deep Multi-Agent Q-Learning »
Ling Pan · Tabish Rashid · Bei Peng · Longbo Huang · Shimon Whiteson -
2021 Poster: Grounding Spatio-Temporal Language with Transformers »
Tristan Karch · Laetitia Teodorescu · Katja Hofmann · Clément Moulin-Frier · Pierre-Yves Oudeyer -
2021 Poster: Memory Efficient Meta-Learning with Large Images »
John Bronskill · Daniela Massiceti · Massimiliano Patacchiola · Katja Hofmann · Sebastian Nowozin · Richard Turner -
2020 Workshop: Competition Track Saturday »
Hugo Jair Escalante · Katja Hofmann -
2020 Workshop: Competition Track Friday »
Hugo Jair Escalante · Katja Hofmann -
2020 : Opening - Competition Track Session »
Katja Hofmann · Hugo Jair Escalante -
2020 Poster: Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning »
Tabish Rashid · Gregory Farquhar · Bei Peng · Shimon Whiteson -
2020 : Discussion Panel: Hugo Larochelle, Finale Doshi-Velez, Devi Parikh, Marc Deisenroth, Julien Mairal, Katja Hofmann, Phillip Isola, and Michael Bowling »
Hugo Larochelle · Finale Doshi-Velez · Marc Deisenroth · Devi Parikh · Julien Mairal · Katja Hofmann · Phillip Isola · Michael Bowling -
2019 : Multi-Task Reinforcement Learning and Generalization »
Katja Hofmann -
2019 : Poster and Coffee Break 2 »
Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · Siddhartha Satpathi · Xueqing Liu · Andreu Vall -
2019 : The MineRL competition »
Misa Ogura · Joe Booth · Sophia Sun · Nicholay Topin · Brandon Houghton · William Guss · Stephanie Milani · Oriol Vinyals · Katja Hofmann · Jia Kim · Karolis Ramanauskas · Florian Laurent · Daichi Nishio · Anssi Kanervisto · Alexey Skrynnik · Artemij Amiranashvili · Christian Scheller · Kaixin Wang · Yanick Schraner -
2019 Poster: Generalization in Reinforcement Learning with Selective Noise Injection and Information Bottleneck »
Maximilian Igl · Kamil Ciosek · Yingzhen Li · Sebastian Tschiatschek · Cheng Zhang · Sam Devlin · Katja Hofmann -
2019 Poster: MAVEN: Multi-Agent Variational Exploration »
Anuj Mahajan · Tabish Rashid · Mikayel Samvelyan · Shimon Whiteson -
2019 Poster: Better Exploration with Optimistic Actor Critic »
Kamil Ciosek · Quan Vuong · Robert Loftin · Katja Hofmann -
2019 Spotlight: Better Exploration with Optimistic Actor Critic »
Kamil Ciosek · Quan Vuong · Robert Loftin · Katja Hofmann -
2019 Poster: Successor Uncertainties: Exploration and Uncertainty in Temporal Difference Learning »
David Janz · Jiri Hron · Przemysław Mazur · Katja Hofmann · José Miguel Hernández-Lobato · Sebastian Tschiatschek -
2019 Tutorial: Reinforcement Learning: Past, Present, and Future Perspectives »
Katja Hofmann -
2018 : How Players Speak to an Intelligent Game Character Using Natural Language Messages »
Katja Hofmann -
2018 : Poster Session 1 »
Stefan Gadatsch · Danil Kuzin · Navneet Kumar · Patrick Dallaire · Tom Ryder · Remus-Petru Pop · Nathan Hunt · Adam Kortylewski · Sophie Burkhardt · Mahmoud Elnaggar · Dieterich Lawson · Yifeng Li · Jongha (Jon) Ryu · Juhan Bae · Micha Livne · Tim Pearce · Mariia Vladimirova · Jason Ramapuram · Jiaming Zeng · Xinyu Hu · Jiawei He · Danielle Maddix · Arunesh Mittal · Albert Shaw · Tuan Anh Le · Alexander Sagel · Lisha Chen · Victor Gallego · Mahdi Karami · Zihao Zhang · Tal Kachman · Noah Weber · Matt Benatan · Kumar K Sricharan · Vincent Cartillier · Ivan Ovinnikov · Buu Phan · Mahmoud Hossam · Liu Ziyin · Valerii Kharitonov · Eugene Golikov · Qiang Zhang · Jae Myung Kim · Sebastian Farquhar · Jishnu Mukhoti · Xu Hu · Gregory Gundersen · Lavanya Sita Tekumalla · Paris Perdikaris · Ershad Banijamali · Siddhartha Jain · Ge Liu · Martin Gottwald · Katy Blumer · Sukmin Yun · Ranganath Krishnan · Roman Novak · Yilun Du · Yu Gong · Beliz Gokkaya · Jessica Ai · Daniel Duckworth · Johannes von Oswald · Christian Henning · Louis-Philippe Morency · Ali Ghodsi · Mahesh Subedar · Jean-Pascal Pfister · Rémi Lebret · Chao Ma · Aleksander Wieczorek · Laurence Perreault Levasseur -
2017 : Panel: "How can we characterise the landscape of intelligent systems and locate human-like intelligence in it?" »
Josh Tenenbaum · Gary Marcus · Katja Hofmann -
2017 : Katja Hofmann: 'Video games and the road to collaborative AI' »
Katja Hofmann -
2016 Demonstration: Project Malmo - Minecraft for AI Research »
Katja Hofmann · Matthew A Johnson · Fernando Diaz · Alekh Agarwal · Tim Hutton · David Bignell · Evelyne Viegas