In recent years, high-confidence reinforcement learning algorithms have enjoyed success in application areas with high-quality models and plentiful data, but robotics remains a challenging domain for scaling up such approaches. Furthermore, very little work has been done on the even more difficult problem of safe imitation learning, in which the demonstrator's reward function is not known. This talk focuses on three recent developments in this emerging area of research: (1) a theory of safe imitation learning; (2) scalable reward inference in the absence of models; (3) efficient off-policy policy evaluation. The proposed algorithms offer a blend of safety and practicality, taking a significant step toward safe robot learning with modest amounts of real-world data.
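As a rough illustration of the off-policy policy evaluation problem mentioned above, the sketch below estimates the value of an evaluation policy from trajectories collected under a different behavior policy using ordinary per-trajectory importance sampling. This is a standard textbook baseline, not the specific estimators discussed in the talk, and the policy and trajectory representations are hypothetical.

```python
def importance_sampling_ope(trajectories, pi_e, pi_b, gamma=0.99):
    """Estimate the value of evaluation policy pi_e from trajectories
    gathered under behavior policy pi_b (ordinary importance sampling).

    trajectories: list of trajectories, each a list of (state, action, reward)
    pi_e, pi_b:   dicts mapping (state, action) -> action probability
    gamma:        discount factor
    """
    estimates = []
    for traj in trajectories:
        weight = 1.0   # cumulative importance ratio for this trajectory
        ret = 0.0      # discounted return of this trajectory
        for t, (s, a, r) in enumerate(traj):
            weight *= pi_e[(s, a)] / pi_b[(s, a)]
            ret += (gamma ** t) * r
        estimates.append(weight * ret)
    return sum(estimates) / len(estimates)
```

The estimate is unbiased but can have high variance when the two policies differ substantially, which is one motivation for the more efficient estimators the talk refers to.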
Author Information
Scott Niekum (UT Austin)
More from the Same Authors
- 2022 : Language-guided Task Adaptation for Imitation Learning »
  Prasoon Goyal · Raymond Mooney · Scott Niekum
- 2022 : A Ranking Game for Imitation Learning »
  Harshit Sushil Sikchi · Akanksha Saran · Wonjoon Goo · Scott Niekum
- 2022 Workshop: All Things Attention: Bridging Different Perspectives on Attention »
  Abhijat Biswas · Akanksha Saran · Khimya Khetarpal · Reuben Aronson · Ruohan Zhang · Grace Lindsay · Scott Niekum
- 2021 Poster: Adversarial Intrinsic Motivation for Reinforcement Learning »
  Ishan Durugkar · Mauricio Tec · Scott Niekum · Peter Stone
- 2021 Poster: SOPE: Spectrum of Off-Policy Estimators »
  Christina Yuan · Yash Chandak · Stephen Giguere · Philip Thomas · Scott Niekum
- 2021 Poster: Universal Off-Policy Evaluation »
  Yash Chandak · Scott Niekum · Bruno da Silva · Erik Learned-Miller · Emma Brunskill · Philip Thomas
- 2020 Poster: Bayesian Robust Optimization for Imitation Learning »
  Daniel S. Brown · Scott Niekum · Marek Petrik
- 2015 Poster: Policy Evaluation Using the Ω-Return »
  Philip Thomas · Scott Niekum · Georgios Theocharous · George Konidaris