Policies produced by deep reinforcement learning are typically characterised by their learning curves, but they remain poorly understood in many other respects. ReLU-based policies result in a partitioning of the input space into piecewise linear regions. We seek to understand how observed region counts and their densities evolve during deep reinforcement learning using empirical results that span a range of continuous control tasks and policy network dimensions. Intuitively, we may expect that during training, the region density increases in the areas that are frequently visited by the policy, thereby affording fine-grained control. We use recent theoretical and empirical results for the linear regions induced by neural networks in supervised learning settings for grounding and comparison of our results. Empirically, we find that the region density increases only moderately throughout training, as measured along fixed trajectories coming from the final policy. However, the trajectories themselves also increase in length during training, and thus the region densities decrease as seen from the perspective of the current trajectory. Our findings suggest that the complexity of deep reinforcement learning policies does not principally emerge from a significant growth in the complexity of functions observed on-and-around trajectories of the policy.
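To make the notion of region density along a trajectory concrete, the following is a minimal sketch, not the authors' implementation. It assumes a small fully connected ReLU policy stored as a list of `(W, b)` pairs, and it approximates the number of linear-region boundaries crossed by densely sampling each trajectory segment and counting changes in the hidden-unit activation pattern. All function names (`init_mlp`, `activation_pattern`, `count_region_transitions`, `region_density`) and the sampling resolution are hypothetical choices made for illustration.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Random weights for a hypothetical ReLU MLP; sizes = [in, h1, ..., out]."""
    return [(rng.standard_normal((m, n)) / np.sqrt(m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def activation_pattern(params, x):
    """On/off state of every hidden ReLU for input x.

    Inputs that share the same pattern lie in the same linear region.
    """
    bits = []
    h = x
    for W, b in params[:-1]:  # the last layer is linear, so it has no ReLU
        pre = h @ W + b
        bits.append(pre > 0)
        h = np.maximum(pre, 0.0)
    return np.concatenate(bits)

def count_region_transitions(params, trajectory, samples_per_segment=200):
    """Approximate count of region boundaries crossed along a polyline.

    Assumption: sampling is fine enough that at most one boundary lies
    between consecutive samples; otherwise the count is a lower bound.
    """
    transitions = 0
    prev = activation_pattern(params, trajectory[0])
    ts = np.linspace(0.0, 1.0, samples_per_segment + 1)[1:]
    for a, b in zip(trajectory[:-1], trajectory[1:]):
        for t in ts:
            cur = activation_pattern(params, (1.0 - t) * a + t * b)
            if not np.array_equal(cur, prev):
                transitions += 1
            prev = cur
    return transitions

def region_density(params, trajectory, samples_per_segment=200):
    """Transitions per unit length of the trajectory in input space."""
    length = np.linalg.norm(np.diff(trajectory, axis=0), axis=1).sum()
    return count_region_transitions(params, trajectory, samples_per_segment) / length

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    policy = init_mlp([3, 16, 16, 2], rng)       # toy policy network
    states = rng.standard_normal((10, 3))        # stand-in for visited states
    print(region_density(policy, states))
```

Under this sketch, the two effects described in the abstract can be separated: evaluating successive policy checkpoints on one fixed trajectory isolates changes in the network, whereas evaluating each checkpoint on its own trajectory also reflects the change in trajectory length, which enters through the denominator.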
Author Information
Setareh Cohan (University of British Columbia)
I am a Ph.D. student at the University of British Columbia supervised by Dr. Michiel van de Panne. My research area is machine learning with a focus on reinforcement learning. Prior to that, I was an M.Sc. student at the University of British Columbia supervised by Dr. Jim Little and co-supervised by Dr. Leonid Sigal, where my research focused on machine learning and computer vision. I attended NeurIPS in person in 2019 as a volunteer, and I attended the virtual version of the conference in both 2020 and 2021. NeurIPS has always been my favorite conference to attend, as I enjoy both the topics it covers and its format. I'm now super excited that my very first publication has been accepted at NeurIPS 2022 and that I get a chance to attend in person!
Nam Hee Kim (University of British Columbia)
David Rolnick (McGill / Mila)
Michiel van de Panne (University of British Columbia)
More from the Same Authors
-
2021 Spotlight: Techniques for Symbol Grounding with SATNet »
Sever Topan · David Rolnick · Xujie Si -
2021 : ClimART: A Benchmark Dataset for Emulating Atmospheric Radiative Transfer in Weather and Climate Models »
Salva Rühling Cachay · Venkatesh Ramesh · Jason Cole · Howard Barker · David Rolnick -
2021 : ClimART: A Benchmark Dataset for Emulating Atmospheric Radiative Transfer in Weather and Climate Models »
Salva Rühling Cachay · Venkatesh Ramesh · Jason N. S. Cole · Howard Barker · David Rolnick -
2021 : Detecting Abandoned Oil Wells Using Machine Learning and Semantic Segmentation »
Michelle Lin · David Rolnick -
2022 : Physics-Constrained Deep Learning for Climate Downscaling »
Paula Harder · Qidong Yang · Venkatesh Ramesh · Prasanna Sattigeri · Alex Hernandez-Garcia · Campbell Watson · Daniela Szwarcman · David Rolnick -
2022 : Generating physically-consistent high-resolution climate data with hard-constrained neural networks »
Paula Harder · Qidong Yang · Venkatesh Ramesh · Prasanna Sattigeri · Alex Hernandez-Garcia · Campbell Watson · Daniela Szwarcman · David Rolnick -
2022 : PhAST: Physics-Aware, Scalable, and Task-specific GNNs for accelerated catalyst design »
Alexandre Duval · Victor Schmidt · Alex Hernandez-Garcia · Santiago Miret · Yoshua Bengio · David Rolnick -
2021 : Curriculum-based Learning: An Effective Approach for Acquiring Dynamic Skills »
Michiel van de Panne -
2021 Poster: Techniques for Symbol Grounding with SATNet »
Sever Topan · David Rolnick · Xujie Si -
2020 Workshop: Tackling Climate Change with ML »
David Dao · Evan Sherwin · Priya Donti · Lauren Kuntz · Lynn Kaack · Yumna Yusuf · David Rolnick · Catherine Nakalembe · Claire Monteleoni · Yoshua Bengio -
2007 Demonstration: Robust Biped Locomotion Using Simple Low-dimensional Control Policies »
Michiel van de Panne · Kang Yin · Stelian Coros · Kevin Loken