While high-return policies can be learned on a wide range of systems through reinforcement learning, actual deployment of the resulting policies is often hindered by their sensitivity to future changes in the environment. Adversarial training has shown some promise in producing policies that retain better performance under environment shifts, but existing approaches only consider robustness to specific kinds of perturbations that must be specified a priori. As possible changes in future dynamics are typically unknown in practice, we instead seek a policy that is robust to a variety of realistic changes only encountered at test time. Towards this goal, we propose a new adversarial variant of soft actor-critic, which produces policies on MuJoCo continuous control tasks that are simultaneously more robust across various environment shifts, such as changes to friction and body mass.
Author Information
Samuel Stanton (New York University)
ML Scientist at Genentech Early Research and Development (gRED). Building ML systems for scientific discovery in biotech.
Rasool Fakoor (Amazon Web Services)
Jonas Mueller (Amazon Web Services)
Andrew Gordon Wilson (New York University)
Alexander Smola (Amazon)
More from the Same Authors
-
2021 : Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks »
Curtis Northcutt · Anish Athalye · Jonas Mueller -
2021 : Benchmarking Multimodal AutoML for Tabular Data with Text Fields »
Xingjian Shi · Jonas Mueller · Nick Erickson · Mu Li · Alexander Smola -
2022 : PropertyDAG: Multi-objective Bayesian optimization of partially ordered, mixed-variable properties for biological sequence design »
Ji Won Park · Samuel Stanton · Saeed Saremi · Andrew Watkins · Stephen Ra · Vladimir Gligorijevic · Kyunghyun Cho · Richard Bonneau -
2022 : Utilizing supervised models to infer consensus labels and their quality from data with multiple annotators »
Hui Wen Goh · Ulyana Tkachenko · Jonas Mueller -
2022 : On Representation Learning Under Class Imbalance »
Ravid Shwartz-Ziv · Micah Goldblum · Yucen Li · C. Bayan Bruss · Andrew Gordon Wilson -
2022 : Andrew Gordon Wilson: When Bayesian Orthodoxy Can Go Wrong: Model Selection and Out-of-Distribution Generalization »
Andrew Gordon Wilson -
2022 Poster: Adaptive Interest for Emphatic Reinforcement Learning »
Martin Klissarov · Rasool Fakoor · Jonas Mueller · Kavosh Asadi · Taesup Kim · Alexander Smola -
2022 Poster: Faster Deep Reinforcement Learning with Slower Online Network »
Kavosh Asadi · Rasool Fakoor · Omer Gottesman · Taesup Kim · Michael Littman · Alexander Smola -
2021 Workshop: Bayesian Deep Learning »
Yarin Gal · Yingzhen Li · Sebastian Farquhar · Christos Louizos · Eric Nalisnick · Andrew Gordon Wilson · Zoubin Ghahramani · Kevin Murphy · Max Welling -
2021 : Evaluating Approximate Inference in Bayesian Deep Learning + Q&A »
Andrew Gordon Wilson · Pavel Izmailov · Matthew Hoffman · Yarin Gal · Yingzhen Li · Melanie F. Pradier · Sharad Vikram · Andrew Foong · Sanae Lotfi · Sebastian Farquhar -
2021 Poster: Continuous Doubly Constrained Batch Reinforcement Learning »
Rasool Fakoor · Jonas Mueller · Kavosh Asadi · Pratik Chaudhari · Alexander Smola -
2021 Poster: Deep Extended Hazard Models for Survival Analysis »
Qixian Zhong · Jonas Mueller · Jane-Ling Wang -
2021 Poster: Does Knowledge Distillation Really Work? »
Samuel Stanton · Pavel Izmailov · Polina Kirichenko · Alexander Alemi · Andrew Wilson -
2021 Poster: Conditioning Sparse Variational Gaussian Processes for Online Decision-making »
Wesley Maddox · Samuel Stanton · Andrew Wilson -
2021 Poster: Overinterpretation reveals image classification model pathologies »
Brandon Carter · Siddhartha Jain · Jonas Mueller · David Gifford -
2020 Poster: Fast, Accurate, and Simple Models for Tabular Data via Augmented Distillation »
Rasool Fakoor · Jonas Mueller · Nick Erickson · Pratik Chaudhari · Alexander Smola -
2019 : Poster Session »
Rishav Chourasia · Yichong Xu · Corinna Cortes · Chien-Yi Chang · Yoshihiro Nagano · So Yeon Min · Benedikt Boecking · Phi Vu Tran · Kamyar Ghasemipour · Qianggang Ding · Shouvik Mani · Vikram Voleti · Rasool Fakoor · Miao Xu · Kenneth Marino · Lisa Lee · Volker Tresp · Jean-Francois Kagy · Marvin Zhang · Barnabas Poczos · Dinesh Khandelwal · Adrien Bardes · Evan Shelhamer · Jiacheng Zhu · Ziming Li · Xiaoyan Li · Dmitrii Krasheninnikov · Ruohan Wang · Mayoore Jaiswal · Emad Barsoum · Suvansh Sanjeev · Theeraphol Wattanavekin · Qizhe Xie · Sifan Wu · Yuki Yoshida · David Kanaa · Sina Khoshfetrat Pakazad · Mehdi Maasoumy -
2019 Poster: Exact Gaussian Processes on a Million Data Points »
Ke Alexander Wang · Geoff Pleiss · Jacob Gardner · Stephen Tyree · Kilian Weinberger · Andrew Gordon Wilson -
2019 Poster: Function-Space Distributions over Kernels »
Gregory Benton · Wesley Maddox · Jayson Salkey · Julio Albinati · Andrew Gordon Wilson -
2019 Poster: A Simple Baseline for Bayesian Uncertainty in Deep Learning »
Wesley Maddox · Pavel Izmailov · Timur Garipov · Dmitry Vetrov · Andrew Gordon Wilson -
2017 : Poster Session Speech: source separation, enhancement, recognition, synthesis »
Shuayb Zarar · Rasool Fakoor · Sri Harsha Dumpala · Minje Kim · Paris Smaragdis · Mohit Dubey · Jong Hwan Ko · Sakriani Sakti · Yuxuan Wang · Lijiang Guo · Garrett T Kenyon · Andros Tjandra · Tycho Tax · Younggun Lee -
2016 : Contributed Talk 1: Learning Optimal Interventions »
Jonas Mueller -
2015 Poster: Principal Differences Analysis: Interpretable Characterization of Differences between Distributions »
Jonas Mueller · Tommi Jaakkola