Timezone: »
Poster
A Composable Specification Language for Reinforcement Learning Tasks
Kishor Jothimurugan · Rajeev Alur · Osbert Bastani
Thu Dec 12 10:45 AM -- 12:45 PM (PST) @ East Exhibition Hall B + C #221
Reinforcement learning is a promising approach for learning control policies for robot tasks. However, specifying complex tasks (e.g., with multiple objectives and safety constraints) can be challenging, since the user must design a reward function that encodes the entire task. Furthermore, the user often needs to manually shape the reward to ensure convergence of the learning algorithm. We propose a language for specifying complex control tasks, along with an algorithm that compiles specifications in our language into a reward function and automatically performs reward shaping. We implement our approach in a tool called SPECTRL, and show that it outperforms several state-of-the-art baselines.
Author Information
Kishor Jothimurugan (University of Pennsylvania)
Rajeev Alur (University of Pennsylvania)
Osbert Bastani (University of Pennysylvania)
More from the Same Authors
-
2020 : Paper 50: Diverse Sampling for Flow-Based Trajectory Forecasting »
Jason Yecheng Ma · Jeevana Priya Inala · Dinesh Jayaraman · Osbert Bastani -
2021 Spotlight: Program Synthesis Guided Reinforcement Learning for Partially Observed Environments »
Yichen Yang · Jeevana Priya Inala · Osbert Bastani · Yewen Pu · Armando Solar-Lezama · Martin Rinard -
2021 : Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning »
Jason Yecheng Ma · Andrew Shen · Osbert Bastani · Dinesh Jayaraman -
2021 : Specification-Guided Learning of Nash Equilibria with High Social Welfare »
Kishor Jothimurugan · Suguman Bansal · Osbert Bastani · Rajeev Alur -
2021 : PAC Synthesis of Machine Learning Programs »
Osbert Bastani -
2021 : Synthesizing Video Trajectory Queries »
Stephen Mell · Favyen Bastani · Stephan Zdancewic · Osbert Bastani -
2021 : Improving Human Decision-Making with Machine Learning »
Hamsa Bastani · Osbert Bastani · Park Sinchaisri -
2021 : Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning »
Jason Yecheng Ma · Andrew Shen · Osbert Bastani · Dinesh Jayaraman -
2021 : Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning »
Jason Yecheng Ma · Andrew Shen · Osbert Bastani · Dinesh Jayaraman -
2022 : Bandits for Online Calibration: An Application to Content Moderation on Social Media Platforms »
Vashist Avadhanula · Omar Abdul Baki · Hamsa Bastani · Osbert Bastani · Caner Gocmen · Daniel Haimovich · Darren Hwang · Dmytro Karamshuk · Thomas Leeper · Jiayuan Ma · Gregory macnamara · Jake Mullet · Christopher Palow · Sung Park · Varun S Rajagopal · Kevin Schaeffer · Parikshit Shah · Deeksha Sinha · Nicolas Stier-Moses · Ben Xu -
2022 : Bandits for Online Calibration: An Application to Content Moderation on Social Media Platforms »
Vashist Avadhanula · Omar Abdul Baki · Hamsa Bastani · Osbert Bastani · Caner Gocmen · Daniel Haimovich · Darren Hwang · Dmytro Karamshuk · Thomas Leeper · Jiayuan Ma · Gregory macnamara · Jake Mullet · Christopher Palow · Sung Park · Varun S Rajagopal · Kevin Schaeffer · Parikshit Shah · Deeksha Sinha · Nicolas Stier-Moses · Ben Xu -
2022 : Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training »
Jason Yecheng Ma · Shagun Sodhani · Dinesh Jayaraman · Osbert Bastani · Vikash Kumar · Amy Zhang -
2022 : VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training »
Jason Yecheng Ma · Shagun Sodhani · Dinesh Jayaraman · Osbert Bastani · Vikash Kumar · Amy Zhang -
2022 : Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training »
Jason Yecheng Ma · Shagun Sodhani · Dinesh Jayaraman · Osbert Bastani · Vikash Kumar · Amy Zhang -
2022 : Bandits for Online Calibration: An Application to Content Moderation on Social Media Platforms »
Vashist Avadhanula · Omar Abdul Baki · Hamsa Bastani · Osbert Bastani · Caner Gocmen · Daniel Haimovich · Darren Hwang · Dmytro Karamshuk · Thomas Leeper · Jiayuan Ma · Gregory macnamara · Jake Mullet · Christopher Palow · Sung Park · Varun S Rajagopal · Kevin Schaeffer · Parikshit Shah · Deeksha Sinha · Nicolas Stier-Moses · Ben Xu -
2022 : Policy Aware Model Learning via Transition Occupancy Matching »
Jason Yecheng Ma · Kausik Sivakumar · Osbert Bastani · Dinesh Jayaraman -
2022 : Robust Option Learning for Adversarial Generalization »
Kishor Jothimurugan · Steve Hsu · Osbert Bastani · Rajeev Alur -
2022 : VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training »
Jason Yecheng Ma · Shagun Sodhani · Dinesh Jayaraman · Osbert Bastani · Vikash Kumar · Amy Zhang -
2022 : Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training »
Jason Yecheng Ma · Shagun Sodhani · Dinesh Jayaraman · Osbert Bastani · Vikash Kumar · Amy Zhang -
2022 : Bandits for Online Calibration: An Application to Content Moderation on Social Media Platforms »
Vashist Avadhanula · Omar Abdul Baki · Hamsa Bastani · Osbert Bastani · Caner Gocmen · Daniel Haimovich · Darren Hwang · Dmytro Karamshuk · Thomas Leeper · Jiayuan Ma · Gregory macnamara · Jake Mullet · Christopher Palow · Sung Park · Varun S Rajagopal · Kevin Schaeffer · Parikshit Shah · Deeksha Sinha · Nicolas Stier-Moses · Ben Xu -
2022 Poster: PAC Prediction Sets for Meta-Learning »
Sangdon Park · Edgar Dobriban · Insup Lee · Osbert Bastani -
2022 Poster: Offline Goal-Conditioned Reinforcement Learning via $f$-Advantage Regression »
Jason Yecheng Ma · Jason Yan · Dinesh Jayaraman · Osbert Bastani -
2022 Poster: Neurosymbolic Deep Generative Models for Sequence Data with Relational Constraints »
Halley Young · Maxwell Du · Osbert Bastani -
2022 Poster: Regret Bounds for Risk-Sensitive Reinforcement Learning »
Osbert Bastani · Jason Yecheng Ma · Estelle Shen · Wanqiao Xu -
2022 Poster: Practical Adversarial Multivalid Conformal Prediction »
Osbert Bastani · Varun Gupta · Christopher Jung · Georgy Noarov · Ramya Ramalingam · Aaron Roth -
2021 Poster: Conservative Offline Distributional Reinforcement Learning »
Jason Yecheng Ma · Dinesh Jayaraman · Osbert Bastani -
2021 Poster: Compositional Reinforcement Learning from Logical Specifications »
Kishor Jothimurugan · Suguman Bansal · Osbert Bastani · Rajeev Alur -
2021 Poster: Program Synthesis Guided Reinforcement Learning for Partially Observed Environments »
Yichen Yang · Jeevana Priya Inala · Osbert Bastani · Yewen Pu · Armando Solar-Lezama · Martin Rinard -
2021 Poster: Learning Models for Actionable Recourse »
Alexis Ross · Himabindu Lakkaraju · Osbert Bastani -
2020 Poster: Neurosymbolic Transformers for Multi-Agent Communication »
Jeevana Priya Inala · Yichen Yang · James Paulos · Yewen Pu · Osbert Bastani · Vijay Kumar · Martin Rinard · Armando Solar-Lezama -
2018 Poster: Verifiable Reinforcement Learning via Policy Extraction »
Osbert Bastani · Yewen Pu · Armando Solar-Lezama