Poster
Accountability in Offline Reinforcement Learning: Explaining Decisions with a Corpus of Examples
Hao Sun · Alihan Hüyük · Daniel Jarrett · Mihaela van der Schaar

Tue Dec 12 08:45 AM -- 10:45 AM (PST) @ Great Hall & Hall B1+B2 #1406
Event URL: https://sites.google.com/view/explain-offline-rl

Learning controllers with offline data in decision-making systems is an essential area of research due to its potential to reduce the risk of applications in real-world systems. However, in responsibility-sensitive settings such as healthcare, decision accountability is of paramount importance, yet has not been adequately addressed by the literature. This paper introduces the Accountable Offline Controller (AOC) that employs the offline dataset as the Decision Corpus and performs accountable control based on a tailored selection of examples, referred to as the Corpus Subset. AOC operates effectively in low-data scenarios, can be extended to the strictly offline imitation setting, and displays qualities of both conservation and adaptability. We assess AOC's performance in both simulated and real-world healthcare scenarios, emphasizing its capability to manage offline control tasks with high levels of performance while maintaining accountability.
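
To make the idea of acting from a corpus of examples concrete, here is a minimal sketch. It is not the AOC algorithm from the paper: it swaps in a plain nearest-neighbour lookup, and the tuple layout, the function names `select_corpus_subset` and `accountable_action`, the Euclidean distance, and the toy data are all assumptions made for illustration. It only shows how a decision can be returned together with the examples that support it.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical corpus entry: (state, action, observed return).
Example = Tuple[Sequence[float], int, float]


def select_corpus_subset(
    corpus: List[Example], query_state: Sequence[float], k: int = 3
) -> List[Example]:
    """Return the k corpus examples whose states are closest to the query.

    Assumption: plain Euclidean distance in the raw state space.
    """
    return sorted(corpus, key=lambda ex: math.dist(ex[0], query_state))[:k]


def accountable_action(
    corpus: List[Example], query_state: Sequence[float], k: int = 3
) -> Tuple[int, List[Example]]:
    """Choose the action of the highest-return example in the subset.

    The subset is returned alongside the action so that the decision
    can be traced back to concrete examples in the corpus.
    """
    subset = select_corpus_subset(corpus, query_state, k)
    best = max(subset, key=lambda ex: ex[2])
    return best[1], subset


# Toy corpus, invented for this sketch.
corpus: List[Example] = [
    ((0.0, 0.0), 0, 1.0),
    ((0.1, 0.0), 1, 2.0),
    ((1.0, 1.0), 0, 5.0),
    ((0.0, 0.2), 1, 0.5),
]

action, evidence = accountable_action(corpus, (0.05, 0.05), k=3)
print("action:", action)
for state, act, ret in evidence:
    print("supported by:", state, act, ret)
```

In this toy setup the far-away example with the highest return is excluded from the subset, so the decision rests only on examples near the query state. That is one simple way to read the conservative behaviour mentioned in the abstract.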

Author Information

Hao Sun (University of Cambridge)

I am a penultimate-year PhD student at the University of Cambridge. I believe Reinforcement Learning is a vital component of the solution for achieving AGI. My previous work on DRL is motivated by practical applications like robotics, healthcare, finance, and large language models. My research keywords during the past 4 years include: RL via Supervised Learning (2020-); Goal-Conditioned RL (2020-); Value-Based DRL (2021-); Offline RL (2021-); Optimism in Exploration (2021-); Uncertainty Quantification (2022-); Data-Centric Off-Policy Evaluation (2022-); Interpretable RL (2023-); RL in Language Models (2023-).

Alihan Hüyük (University of Cambridge)
Daniel Jarrett (University of Cambridge)
Mihaela van der Schaar (University of Cambridge)

More from the Same Authors