In recent years, we have witnessed tremendous progress in deep reinforcement learning (RL) for tasks such as Go, Chess, video games, and robot control. Nevertheless, other combinatorial domains, such as AI planning, still pose considerable challenges for RL approaches. The key difficulty in those domains is that a positive reward signal becomes exponentially rare as the minimal solution length increases, so an RL approach loses its training signal. There has been promising recent progress using a curriculum-driven learning approach designed to solve a single hard instance. We present a novel automated curriculum approach that dynamically selects from a pool of unlabeled training instances of varying task complexity, guided by our difficulty quantum momentum strategy. We show how the smoothness of the task hardness impacts the final learning results. In particular, as the size of the instance pool increases, the "hardness gap" decreases, which facilitates a smoother automated curriculum-based learning process. Our automated curriculum approach dramatically improves upon the previous approaches. We show our results on Sokoban, a traditional PSPACE-complete planning problem that presents a great challenge even for specialized solvers. Our RL agent can solve hard instances that are far out of reach for any previous state-of-the-art Sokoban solver. In particular, our approach can uncover plans that require hundreds of steps, while the best previous search methods would take many years of computing time to solve such instances. In addition, we show that we can further boost the RL performance by coupling our automated curriculum approach with a curiosity-driven search strategy and a graph neural net representation.
Author Information
Dieqiao Feng (Cornell University)
Carla Gomes (Cornell University)
Bart Selman (Cornell University)
More from the Same Authors
-
2021 : Gaussian Mixture Variational Autoencoder with Contrastive Learning for Multi-Label Classification »
Junwen Bai · Shufeng Kong · Carla Gomes -
2021 : Resolving Super Fine-Resolution SIF via Coarsely-Supervised U-Net Regression »
Joshua Fan · Di Chen · Jiaming Wen · Ying Sun · Carla Gomes -
2021 : A GNN-RNN Approach for Harnessing Geospatial and Temporal Information: Application to Crop Yield Prediction »
Joshua Fan · Junwen Bai · Zhiyun Li · Ariel Ortiz-Bobea · Carla Gomes -
2022 : Xtal2DoS: Attention-based Crystal to Sequence Learning for Density of States Prediction »
Junwen Bai · Yuanqi Du · Yingheng Wang · Shufeng Kong · John Gregoire · Carla Gomes -
2022 : Structure-based Drug Design with Equivariant Diffusion Models »
Arne Schneuing · Yuanqi Du · Charles Harris · Arian Jamasb · Ilia Igashov · Weitao Du · Tom Blundell · Pietro Lió · Carla Gomes · Max Welling · Michael Bronstein · Bruno Correia -
2022 Workshop: AI for Science: Progress and Promises »
Yi Ding · Yuanqi Du · Tianfan Fu · Hanchen Wang · Anima Anandkumar · Yoshua Bengio · Anthony Gitter · Carla Gomes · Aviv Regev · Max Welling · Marinka Zitnik -
2022 Poster: Left Heavy Tails and the Effectiveness of the Policy and Value Networks in DNN-based best-first search for Sokoban Planning »
Dieqiao Feng · Carla Gomes · Bart Selman -
2021 : Cooperative Multi-Agent Fairness and Equivariant Policies »
Niko Grupen · Bart Selman · Daniel Lee -
2021 Poster: Towards Deeper Deep Reinforcement Learning with Spectral Normalization »
Nils Bjorck · Carla Gomes · Kilian Weinberger -
2021 Poster: Contrastively Disentangled Sequential Variational Autoencoder »
Junwen Bai · Weiran Wang · Carla Gomes -
2019 : AI and Sustainable Development »
Fei Fang · Carla Gomes · Miguel Luengo-Oroz · Thomas Dietterich · Julien Cornebise -
2019 : Carla Gomes (Cornell) »
Carla Gomes -
2019 : Climate Change: A Grand Challenge for ML »
Yoshua Bengio · Carla Gomes · Andrew Ng · Jeff Dean · Lester Mackey -
2019 : Computational Sustainability: Computing for a Better World and a Sustainable Future »
Carla Gomes -
2018 Poster: Understanding Batch Normalization »
Johan Bjorck · Carla Gomes · Bart Selman · Kilian Weinberger -
2016 Poster: Solving Marginal MAP Problems with NP Oracles and Parity Constraints »
Yexiang Xue · Zhiyuan Li · Stefano Ermon · Carla Gomes · Bart Selman -
2013 Workshop: Machine Learning for Sustainability »
Edwin Bonilla · Thomas Dietterich · Theodoros Damoulas · Andreas Krause · Daniel Sheldon · Iadine Chades · J. Zico Kolter · Bistra Dilkina · Carla Gomes · Hugo P Simao -
2013 Poster: Embed and Project: Discrete Sampling with Universal Hashing »
Stefano Ermon · Carla Gomes · Ashish Sabharwal · Bart Selman -
2012 Poster: Density Propagation and Improved Bounds on the Partition Function »
Stefano Ermon · Carla Gomes · Ashish Sabharwal · Bart Selman -
2011 Poster: Accelerated Adaptive Markov Chain for Partition Function Computation »
Stefano Ermon · Carla Gomes · Ashish Sabharwal · Bart Selman -
2011 Spotlight: Accelerated Adaptive Markov Chain for Partition Function Computation »
Stefano Ermon · Carla Gomes · Ashish Sabharwal · Bart Selman -
2008 Poster: Counting Solution Clusters Using Belief Propagation »
Lukas Kroc · Ashish Sabharwal · Bart Selman -
2006 Poster: Near-Uniform Sampling of Combinatorial Spaces Using XOR Constraints »
Carla Gomes · Ashish Sabharwal · Bart Selman