Timezone: »
Methods to learn under algorithmic triage have predominantly focused on supervised learning settings where each decision, or prediction, is independent of each other. Under algorithmic triage, a supervised learning model predicts a fraction of the instances and humans predict the remaining ones. In this work, we take a first step towards developing reinforcement learning models that are optimized to operate under algorithmic triage. To this end, we look at the problem through the framework of options and develop a two-stage actor-critic method to learn reinforcement learning models under triage. The first stage performs offline, off-policy training using human data gathered in an environment where the human has operated on their own. The second stage performs on-policy training to account for the impact that switching may have on the human policy, which may be difficult to anticipate from the above human data. Extensive simulation experiments in a synthetic car driving task show that the machine models and the triage policies trained using our two-stage method effectively complement human policies and outperform those provided by several competitive baselines.
Author Information
Eleni Straitouri (MPI-SWS)
Adish Singla (MPI-SWS)
Vahid Balazadeh Meresht (Sharif University of Technology)
Manuel Rodriguez (Max Planck Institute for Software Systems)
More from the Same Authors
-
2021 : Reward Poisoning in Reinforcement Learning: Attacks Against Unknown Learners in Unknown Environments »
Amin Rakhsha · Xuezhou Zhang · Jerry Zhu · Adish Singla -
2021 : Poster: Fair Clustering Using Antidote Data »
Anshuman Chhabra · Adish Singla · Prasant Mohapatra -
2021 : Reward Poisoning in Reinforcement Learning: Attacks Against Unknown Learners in Unknown Environments »
Amin Rakhsha · Xuezhou Zhang · Jerry Zhu · Adish Singla -
2022 Poster: On Batch Teaching with Sample Complexity Bounded by VCD »
Farnam Mansouri · Hans Simon · Adish Singla · Sandra Zilles -
2022 Poster: Counterfactual Temporal Point Processes »
Kimia Noorbakhsh · Manuel Rodriguez -
2023 Poster: Finding Counterfactually Optimal Action Sequences in Continuous State Spaces »
Stratis Tsirtsis · Manuel Rodriguez -
2023 Poster: Human-Aligned Calibration for AI-Assisted Decision Making »
Nina Corvelo Benz · Manuel Rodriguez -
2023 Workshop: Generative AI for Education (GAIED): Advances, Opportunities, and Challenges »
Paul Denny · Sumit Gulwani · Neil Heffernan · Tanja Käser · Steven Moore · Anna Rafferty · Adish Singla -
2022 Spotlight: Lightning Talks 1A-3 »
Kimia Noorbakhsh · Ronan Perry · Qi Lyu · Jiawei Jiang · Christian Toth · Olivier Jeunen · Xin Liu · Yuan Cheng · Lei Li · Manuel Rodriguez · Julius von Kügelgen · Lars Lorch · Nicolas Donati · Lukas Burkhalter · Xiao Fu · Zhongdao Wang · Songtao Feng · Ciarán Gilligan-Lee · Rishabh Mehrotra · Fangcheng Fu · Jing Yang · Bernhard Schölkopf · Ya-Li Li · Christian Knoll · Maks Ovsjanikov · Andreas Krause · Shengjin Wang · Hong Zhang · Mounia Lalmas · Bolin Ding · Bo Du · Yingbin Liang · Franz Pernkopf · Robert Peharz · Anwar Hithnawi · Julius von Kügelgen · Bo Li · Ce Zhang -
2022 Spotlight: Counterfactual Temporal Point Processes »
Kimia Noorbakhsh · Manuel Rodriguez -
2022 Spotlight: On Batch Teaching with Sample Complexity Bounded by VCD »
Farnam Mansouri · Hans Simon · Adish Singla · Sandra Zilles -
2022 Poster: Envy-free Policy Teaching to Multiple Agents »
Jiarui Gan · R Majumdar · Adish Singla · Goran Radanovic -
2022 Poster: Exploration-Guided Reward Shaping for Reinforcement Learning under Sparse Rewards »
Rati Devidze · Parameswaran Kamalaruban · Adish Singla -
2022 Poster: Provable Defense against Backdoor Policies in Reinforcement Learning »
Shubham Bharti · Xuezhou Zhang · Adish Singla · Jerry Zhu -
2021 : Fair Clustering Using Antidote Data »
Anshuman Chhabra · Adish Singla · Prasant Mohapatra -
2021 Workshop: Human Centered AI »
Michael Muller · Plamen P Angelov · Shion Guha · Marina Kogan · Gina Neff · Nuria Oliver · Manuel Rodriguez · Adrian Weller -
2021 : Fairness Degrading Adversarial Attacks Against Clustering Algorithms »
Anshuman Chhabra · Adish Singla · Prasant Mohapatra -
2021 Poster: Curriculum Design for Teaching via Demonstrations: Theory and Applications »
Gaurav Yengera · Rati Devidze · Parameswaran Kamalaruban · Adish Singla -
2021 Poster: Explicable Reward Design for Reinforcement Learning Agents »
Rati Devidze · Goran Radanovic · Parameswaran Kamalaruban · Adish Singla -
2021 Poster: On Blame Attribution for Accountable Multi-Agent Sequential Decision Making »
Stelios Triantafyllou · Adish Singla · Goran Radanovic -
2021 Poster: Teaching an Active Learner with Contrastive Examples »
Chaoqi Wang · Adish Singla · Yuxin Chen -
2021 Poster: Differentiable Learning Under Triage »
Nastaran Okati · Abir De · Manuel Rodriguez -
2021 Poster: Teaching via Best-Case Counterexamples in the Learning-with-Equivalence-Queries Paradigm »
Akash Kumar · Yuxin Chen · Adish Singla -
2021 Poster: Counterfactual Explanations in Sequential Decision Making Under Uncertainty »
Stratis Tsirtsis · Abir De · Manuel Rodriguez -
2020 Poster: Synthesizing Tasks for Block-based Programming »
Umair Ahmed · Maria Christakis · Aleksandr Efremov · Nigel Fernandez · Ahana Ghosh · Abhik Roychoudhury · Adish Singla -
2020 Poster: Task-agnostic Exploration in Reinforcement Learning »
Xuezhou Zhang · Yuzhe Ma · Adish Singla -
2019 Workshop: Learning with Temporal Point Processes »
Manuel Rodriguez · Le Song · Isabel Valera · Yan Liu · Abir De · Hongyuan Zha -
2019 Workshop: Workshop on Human-Centric Machine Learning »
Plamen P Angelov · Nuria Oliver · Adrian Weller · Manuel Rodriguez · Isabel Valera · Silvia Chiappa · Hoda Heidari · Niki Kilbertus -
2019 Poster: Teaching Multiple Concepts to a Forgetful Learner »
Anette Hunziker · Yuxin Chen · Oisin Mac Aodha · Manuel Gomez Rodriguez · Andreas Krause · Pietro Perona · Yisong Yue · Adish Singla -
2019 Poster: Preference-Based Batch and Sequential Teaching: Towards a Unified View of Models »
Farnam Mansouri · Yuxin Chen · Ara Vartanian · Jerry Zhu · Adish Singla -
2019 Poster: Learner-aware Teaching: Inverse Reinforcement Learning with Preferences and Constraints »
Sebastian Tschiatschek · Ahana Ghosh · Luis Haug · Rati Devidze · Adish Singla -
2018 : Assisted Inverse Reinforcement Learning »
Adish Singla · Rati Devidze -
2018 : Manuel Gomez Rodriguez - Enhancing the Accuracy and Fairness of Human Decision Making »
Manuel Rodriguez -
2018 Poster: Understanding the Role of Adaptivity in Machine Teaching: The Case of Version Space Learners »
Yuxin Chen · Adish Singla · Oisin Mac Aodha · Pietro Perona · Yisong Yue -
2018 Poster: Teaching Inverse Reinforcement Learners via Features and Demonstrations »
Luis Haug · Sebastian Tschiatschek · Adish Singla -
2018 Poster: Enhancing the Accuracy and Fairness of Human Decision Making »
Isabel Valera · Adish Singla · Manuel Gomez Rodriguez -
2017 Poster: From Parity to Preference-based Notions of Fairness in Classification »
Muhammad Bilal Zafar · Isabel Valera · Manuel Rodriguez · Krishna Gummadi · Adrian Weller -
2015 Poster: COEVOLVE: A Joint Point Process Model for Information Diffusion and Network Co-evolution »
Mehrdad Farajtabar · Yichen Wang · Manuel Rodriguez · Shuang Li · Hongyuan Zha · Le Song -
2015 Oral: COEVOLVE: A Joint Point Process Model for Information Diffusion and Network Co-evolution »
Mehrdad Farajtabar · Yichen Wang · Manuel Rodriguez · Shuang Li · Hongyuan Zha · Le Song