We introduce an automatic curriculum algorithm, Variational Automatic Curriculum Learning (VACL), for solving challenging goal-conditioned cooperative multi-agent reinforcement learning problems. We motivate our curriculum learning paradigm through a variational perspective, where the learning objective can be decomposed into two terms: task learning on the current curriculum, and curriculum update to a new task distribution. Local optimization over the second term suggests that the curriculum should gradually expand the training tasks from easy to hard. Our VACL algorithm implements this variational paradigm with two practical components, task expansion and entity curriculum, which produce a series of training tasks over both the task configurations and the number of entities in the task. Experimental results show that VACL solves a collection of sparse-reward problems with a large number of agents. In particular, using a single desktop machine, VACL achieves a 98% coverage rate with 100 agents in the simple-spread benchmark and reproduces the ramp-use behavior originally shown in OpenAI’s hide-and-seek project.
Author Information
Jiayu Chen (Tsinghua University)
Yuanxin Zhang (Tsinghua University)
Yuanfan Xu (Tsinghua University)
Huimin Ma (Tsinghua University)
Huazhong Yang
Jiaming Song (Stanford University)
I am a first-year Ph.D. student at Stanford University. I think about problems in machine learning and deep learning under the supervision of Stefano Ermon. I did my undergrad at Tsinghua University, where I was lucky enough to collaborate with Jun Zhu and Lawrence Carin on scalable Bayesian machine learning.
Yu Wang (Tsinghua University)
Yu Wang received his B.S. degree in 2002 and his Ph.D. degree (with honor) in 2007 from Tsinghua University, Beijing. He is currently a Tenured Associate Professor with the Department of Electronic Engineering, Tsinghua University. His research interests include brain-inspired computing, application-specific hardware computing, parallel circuit analysis, and power/reliability-aware system design methodology. Dr. Wang has authored and coauthored over 150 papers in refereed journals and conferences. He received Best Paper Awards at FPGA 2017 and ISVLSI 2012 and the Best Poster Award at HEART 2012, along with 8 Best Paper nominations. He received the IBM X10 Faculty Award in 2010. He served as TPC chair for ICFPT 2011 and Finance Chair of ISLPED 2012-2016, and as a program committee member for leading conferences in these areas, including top EDA conferences such as DAC, DATE, ICCAD, and ASP-DAC, and top FPGA conferences such as FPGA and FPT. He currently serves as Co-EIC for the SIGDA E-Newsletter and as Associate Editor for IEEE Transactions on CAD and the Journal of Circuits, Systems, and Computers. He also serves as guest editor for Integration, the VLSI Journal and IEEE Transactions on Multi-Scale Computing Systems. He is a recipient of the NSFC Excellent Young Scholar award and is now serving as an ACM distinguished speaker. He is an IEEE/ACM senior member.
Yi Wu (OpenAI)
More from the Same Authors
-
2020 : Paper 46: Disagreement-Regularized Imitation of Complex Multi-Agent Interactions »
Jiaming Song · Stefano Ermon -
2021 Spotlight: IQ-Learn: Inverse soft-Q Learning for Imitation »
Divyansh Garg · Shuvam Chakraborty · Chris Cundy · Jiaming Song · Stefano Ermon -
2021 : Native Chinese Reader: A Dataset Towards Native-Level Chinese Machine Reading Comprehension »
Shusheng Xu · Yichen Liu · Xiaoyu Yi · Siyuan Zhou · Huizi Li · Yi Wu -
2021 : Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization »
Zihan Zhou · Wei Fu · Bingliang Zhang · Yi Wu -
2021 : Learning Efficient Multi-Agent Cooperative Visual Exploration »
Chao Yu · Jiaxuan Gao · Huazhong Yang · Yu Wang · Yi Wu -
2021 : Likelihood-free Density Ratio Acquisition Functions are not Equivalent to Expected Improvements »
Jiaming Song · Stefano Ermon -
2021 : Maximum Entropy Population Based Training for Zero-Shot Human-AI Coordination »
Rui Zhao · Jinming Song · Hu Haifeng · Yang Gao · Yi Wu · Zhongqian Sun · Wei Yang -
2022 : JPEG Artifact Correction using Denoising Diffusion Restoration Models »
Bahjat Kawar · Jiaming Song · Stefano Ermon · Michael Elad -
2023 Workshop: NeurIPS 2023 Workshop on Diffusion Models »
Bahjat Kawar · Valentin De Bortoli · Charlotte Bunne · James Thornton · Jiaming Song · Jong Chul Ye · Chenlin Meng -
2022 Spotlight: TA-GATES: An Encoding Scheme for Neural Network Architectures »
Xuefei Ning · Zixuan Zhou · Junbo Zhao · Tianchen Zhao · Yiping Deng · Changcheng Tang · Shuang Liang · Huazhong Yang · Yu Wang -
2022 Spotlight: Lightning Talks 4B-1 »
Alexandra Senderovich · Zhijie Deng · Navid Ansari · Xuefei Ning · Yasmin Salehi · Xiang Huang · Chenyang Wu · Kelsey Allen · Jiaqi Han · Nikita Balagansky · Tatiana Lopez-Guevara · Tianci Li · Zhanhong Ye · Zixuan Zhou · Feng Zhou · Ekaterina Bulatova · Daniil Gavrilov · Wenbing Huang · Dennis Giannacopoulos · Hans-peter Seidel · Anton Obukhov · Kimberly Stachenfeld · Hongsheng Liu · Jun Zhu · Junbo Zhao · Hengbo Ma · Nima Vahidi Ferdowsi · Zongzhang Zhang · Vahid Babaei · Jiachen Li · Alvaro Sanchez Gonzalez · Yang Yu · Shi Ji · Maxim Rakhuba · Tianchen Zhao · Yiping Deng · Peter Battaglia · Josh Tenenbaum · Zidong Wang · Chuang Gan · Changcheng Tang · Jessica Hamrick · Kang Yang · Tobias Pfaff · Yang Li · Shuang Liang · Min Wang · Huazhong Yang · Haotian CHU · Yu Wang · Fan Yu · Bei Hua · Lei Chen · Bin Dong -
2022 Poster: Concrete Score Matching: Generalized Score Matching for Discrete Data »
Chenlin Meng · Kristy Choi · Jiaming Song · Stefano Ermon -
2022 Poster: LISA: Learning Interpretable Skill Abstractions from Language »
Divyansh Garg · Skanda Vaidyanath · Kuno Kim · Jiaming Song · Stefano Ermon -
2022 Poster: Denoising Diffusion Restoration Models »
Bahjat Kawar · Michael Elad · Stefano Ermon · Jiaming Song -
2022 Poster: The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games »
Chao Yu · Akash Velu · Eugene Vinitsky · Jiaxuan Gao · Yu Wang · Alexandre Bayen · Yi Wu -
2022 Poster: TA-GATES: An Encoding Scheme for Neural Network Architectures »
Xuefei Ning · Zixuan Zhou · Junbo Zhao · Tianchen Zhao · Yiping Deng · Changcheng Tang · Shuang Liang · Huazhong Yang · Yu Wang -
2021 Poster: Imitation with Neural Density Models »
Kuno Kim · Akshat Jindal · Yang Song · Jiaming Song · Yanan Sui · Stefano Ermon -
2021 Poster: D2C: Diffusion-Decoding Models for Few-Shot Conditional Generation »
Abhishek Sinha · Jiaming Song · Chenlin Meng · Stefano Ermon -
2021 Poster: Evaluating Efficient Performance Estimators of Neural Architectures »
Xuefei Ning · Changcheng Tang · Wenshuo Li · Zixuan Zhou · Shuang Liang · Huazhong Yang · Yu Wang -
2021 Poster: Pseudo-Spherical Contrastive Divergence »
Lantao Yu · Jiaming Song · Yang Song · Stefano Ermon -
2021 Poster: IQ-Learn: Inverse soft-Q Learning for Imitation »
Divyansh Garg · Shuvam Chakraborty · Chris Cundy · Jiaming Song · Stefano Ermon -
2021 Poster: NovelD: A Simple yet Effective Exploration Criterion »
Tianjun Zhang · Huazhe Xu · Xiaolong Wang · Yi Wu · Kurt Keutzer · Joseph Gonzalez · Yuandong Tian -
2021 Poster: CSDI: Conditional Score-based Diffusion Models for Probabilistic Time Series Imputation »
Yusuke Tashiro · Jiaming Song · Yang Song · Stefano Ermon -
2020 Poster: Belief Propagation Neural Networks »
Jonathan Kuck · Shuvam Chakraborty · Hao Tang · Rachel Luo · Jiaming Song · Ashish Sabharwal · Stefano Ermon -
2020 Poster: Autoregressive Score Matching »
Chenlin Meng · Lantao Yu · Yang Song · Jiaming Song · Stefano Ermon -
2020 Poster: Multi-label Contrastive Predictive Coding »
Jiaming Song · Stefano Ermon -
2020 Oral: Multi-label Contrastive Predictive Coding »
Jiaming Song · Stefano Ermon -
2019 : Poster session »
Sebastian Farquhar · Erik Daxberger · Andreas Look · Matt Benatan · Ruiyi Zhang · Marton Havasi · Fredrik Gustafsson · James A Brofos · Nabeel Seedat · Micha Livne · Ivan Ustyuzhaninov · Adam Cobb · Felix D McGregor · Patrick McClure · Tim R. Davidson · Gaurush Hiranandani · Sanjeev Arora · Masha Itkina · Didrik Nielsen · William Harvey · Matias Valdenegro-Toro · Stefano Peluchetti · Riccardo Moriconi · Tianyu Cui · Vaclav Smidl · Taylan Cemgil · Jack Fitzsimons · He Zhao · Mariana Vargas Vieyra · Apratim Bhattacharyya · Rahul Sharma · Geoffroy Dubourg-Felonneau · Jonathan Warrell · Slava Voloshynovskiy · Mihaela Rosca · Jiaming Song · Andrew Ross · Homa Fashandi · Ruiqi Gao · Hooshmand Shokri Razaghi · Joshua Chang · Zhenzhong Xiao · Vanessa Boehm · Giorgio Giannone · Ranganath Krishnan · Joe Davison · Arsenii Ashukha · Jeremiah Liu · Sicong (Sheldon) Huang · Evgenii Nikishin · Sunho Park · Nilesh Ahuja · Mahesh Subedar · Artyom Gadetsky · Jhosimar Arias Figueroa · Tim G. J. Rudner · Waseem Aslam · Adrián Csiszárik · John Moberg · Ali Hebbal · Kathrin Grosse · Pekka Marttinen · Bang An · Hlynur Jónsson · Samuel Kessler · Abhishek Kumar · Mikhail Figurnov · Omesh Tickoo · Steindor Saemundsson · Ari Heljakka · Dániel Varga · Niklas Heim · Simone Rossi · Max Laves · Waseem Gharbieh · Nicholas Roberts · Luis Armando Pérez Rey · Matthew Willetts · Prithvijit Chakrabarty · Sumedh Ghaisas · Carl Shneider · Wray Buntine · Kamil Adamczewski · Xavier Gitiaux · Suwen Lin · Hao Fu · Gunnar Rätsch · Aidan Gomez · Erik Bodin · Dinh Phung · Lennart Svensson · Juliano Tusi Amaral Laganá Pinto · Milad Alizadeh · Jianzhun Du · Kevin Murphy · Beatrix Benkő · Shashaank Vattikuti · Jonathan Gordon · Christopher Kanan · Sontje Ihler · Darin Graham · Michael Teng · Louis Kirsch · Tomas Pevny · Taras Holotyak -
2019 Workshop: Information Theory and Machine Learning »
Shengjia Zhao · Jiaming Song · Yanjun Han · Kristy Choi · Pratyusha Kalluri · Ben Poole · Alex Dimakis · Jiantao Jiao · Tsachy Weissman · Stefano Ermon -
2019 Poster: Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting »
Aditya Grover · Jiaming Song · Ashish Kapoor · Kenneth Tran · Alekh Agarwal · Eric Horvitz · Stefano Ermon -
2018 : Adversarial Vision Challenge: Towards More Effective Black-Box Adversarial Training »
Xuefei Ning · Wenshuo Li · Yu Wang -
2018 : Coffee Break and Poster Session I »
Pim de Haan · Bin Wang · Dequan Wang · Aadil Hayat · Ibrahim Sobh · Muhammad Asif Rana · Thibault Buhet · Nicholas Rhinehart · Arjun Sharma · Alex Bewley · Michael Kelly · Lionel Blondé · Ozgur S. Oguz · Vaibhav Viswanathan · Jeroen Vanbaar · Konrad Żołna · Negar Rostamzadeh · Rowan McAllister · Sanjay Thakur · Alexandros Kalousis · Chelsea Sidrane · Sujoy Paul · Daphne Chen · Michal Garmulewicz · Henryk Michalewski · Coline Devin · Hongyu Ren · Jiaming Song · Wen Sun · Hanzhang Hu · Wulong Liu · Emilie Wirbel -
2018 Poster: Multi-Agent Generative Adversarial Imitation Learning »
Jiaming Song · Hongyu Ren · Dorsa Sadigh · Stefano Ermon -
2018 Poster: Bias and Generalization in Deep Generative Models: An Empirical Study »
Shengjia Zhao · Hongyu Ren · Arianna Yuan · Jiaming Song · Noah Goodman · Stefano Ermon -
2018 Spotlight: Bias and Generalization in Deep Generative Models: An Empirical Study »
Shengjia Zhao · Hongyu Ren · Arianna Yuan · Jiaming Song · Noah Goodman · Stefano Ermon -
2017 Poster: A-NICE-MC: Adversarial Training for MCMC »
Jiaming Song · Shengjia Zhao · Stefano Ermon -
2017 Poster: InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations »
Yunzhu Li · Jiaming Song · Stefano Ermon -
2015 Poster: 3D Object Proposals for Accurate Object Class Detection »
Xiaozhi Chen · Kaustav Kundu · Yukun Zhu · Andrew G Berneshawi · Huimin Ma · Sanja Fidler · Raquel Urtasun