While high-return policies can be learned on a wide range of systems through reinforcement learning, actual deployment of the resulting policies is often hindered by their sensitivity to future changes in the environment. Adversarial training has shown some promise in producing policies that retain better performance under environment shifts, but existing approaches only consider robustness to specific kinds of perturbations that must be specified a priori. As possible changes in future dynamics are typically unknown in practice, we instead seek a policy that is robust to a variety of realistic changes only encountered at test time. Towards this goal, we propose a new adversarial variant of soft actor-critic, which produces policies on MuJoCo continuous control tasks that are simultaneously more robust across various environment shifts, such as changes to friction and body mass.
Author Information
Samuel Stanton (New York University)
Sam is a Ph.D. student in the NYU Center for Data Science and an NDSEG Fellow (class of 2018), working with Professor Andrew Wilson. His current research focuses on the incorporation of probabilistic state transition models in reinforcement learning algorithms. Model-based RL agents generalize from past experience very effectively, allowing the agent to evaluate policies with fewer environment interactions than their model-free counterparts. Improving the data-efficiency of RL agents is crucial for real-world applications in fields like robotics, logistics, and finance. Sam holds a Master’s degree in Operations Research from Cornell University, where he started working with Professor Wilson as a first-year Ph.D. student. Sam transferred from the Cornell doctoral program to continue his research agenda at NYU with his advisor. Prior to his studies at Cornell, Sam earned a Bachelor’s degree in Mathematics from the University of Colorado Denver, graduating summa cum laude. In addition to his dissertation research, Sam is interested in modern art and philosophy, especially epistemology and ethics. When he is not occupied with research, Sam enjoys volleyball, rock climbing, surfing, and snowboarding.
Rasool Fakoor (Amazon Web Services)
Jonas Mueller (Amazon Web Services)
Andrew Gordon Wilson (New York University)
Alexander Smola (Amazon)
More from the Same Authors
-
2021 : Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks »
Curtis Northcutt · Anish Athalye · Jonas Mueller -
2021 : Benchmarking Multimodal AutoML for Tabular Data with Text Fields »
Xingjian Shi · Jonas Mueller · Nick Erickson · Mu Li · Alexander Smola -
2021 Workshop: Bayesian Deep Learning »
Yarin Gal · Yingzhen Li · Sebastian Farquhar · Christos Louizos · Eric Nalisnick · Andrew Gordon Wilson · Zoubin Ghahramani · Kevin Murphy · Max Welling -
2021 : Evaluating Approximate Inference in Bayesian Deep Learning + Q&A »
Andrew Gordon Wilson · Pavel Izmailov · Matthew Hoffman · Yarin Gal · Yingzhen Li · Melanie F. Pradier · Sharad Vikram · Andrew Foong · Sanae Lotfi · Sebastian Farquhar -
2021 Poster: Continuous Doubly Constrained Batch Reinforcement Learning »
Rasool Fakoor · Jonas Mueller · Kavosh Asadi · Pratik Chaudhari · Alexander Smola -
2021 Poster: Deep Extended Hazard Models for Survival Analysis »
Qixian Zhong · Jonas Mueller · Jane-Ling Wang -
2021 Poster: Does Knowledge Distillation Really Work? »
Samuel Stanton · Pavel Izmailov · Polina Kirichenko · Alexander Alemi · Andrew Wilson -
2021 Poster: Conditioning Sparse Variational Gaussian Processes for Online Decision-making »
Wesley Maddox · Samuel Stanton · Andrew Wilson -
2021 Poster: Overinterpretation reveals image classification model pathologies »
Brandon Carter · Siddhartha Jain · Jonas Mueller · David Gifford -
2020 Poster: Fast, Accurate, and Simple Models for Tabular Data via Augmented Distillation »
Rasool Fakoor · Jonas Mueller · Nick Erickson · Pratik Chaudhari · Alexander Smola -
2019 : Poster Session »
Rishav Chourasia · Yichong Xu · Corinna Cortes · Chien-Yi Chang · Yoshihiro Nagano · So Yeon Min · Benedikt Boecking · Phi Vu Tran · Seyed Kamyar Seyed Ghasemipour · Qianggang Ding · Shouvik Mani · Vikram Voleti · Rasool Fakoor · Miao Xu · Kenneth Marino · Lisa Lee · Volker Tresp · Jean-Francois Kagy · Marvin Zhang · Barnabas Poczos · Dinesh Khandelwal · Adrien Bardes · Evan Shelhamer · Jiacheng Zhu · Ziming Li · Xiaoyan Li · Dmitrii Krasheninnikov · Ruohan Wang · Mayoore Jaiswal · Emad Barsoum · Suvansh Sanjeev · Theeraphol Wattanavekin · Qizhe Xie · Sifan Wu · Yuki Yoshida · David Kanaa · Sina Khoshfetrat Pakazad · Mehdi Maasoumy -
2019 Poster: Exact Gaussian Processes on a Million Data Points »
Ke Alexander Wang · Geoff Pleiss · Jacob Gardner · Stephen Tyree · Kilian Weinberger · Andrew Gordon Wilson -
2019 Poster: Function-Space Distributions over Kernels »
Gregory Benton · Wesley Maddox · Jayson Salkey · Julio Albinati · Andrew Gordon Wilson -
2019 Poster: A Simple Baseline for Bayesian Uncertainty in Deep Learning »
Wesley Maddox · Pavel Izmailov · Timur Garipov · Dmitry Vetrov · Andrew Gordon Wilson -
2017 : Poster Session Speech: source separation, enhancement, recognition, synthesis »
Shuayb Zarar · Rasool Fakoor · SRI HARSHA DUMPALA · Minje Kim · Paris Smaragdis · Mohit Dubey · Jong Hwan Ko · Sakriani Sakti · Yuxuan Wang · Lijiang Guo · Garrett T Kenyon · Andros Tjandra · Tycho Tax · Younggun Lee -
2016 : Contributed Talk 1: Learning Optimal Interventions »
Jonas Mueller -
2015 Poster: Principal Differences Analysis: Interpretable Characterization of Differences between Distributions »
Jonas Mueller · Tommi Jaakkola