Finding an effective medical treatment often requires a search by trial and error. Making this search more efficient by minimizing the number of unnecessary trials could lower both costs and patient suffering. We formalize this problem as learning a policy for finding a near-optimal treatment in a minimum number of trials using a causal inference framework. We give a model-based dynamic programming algorithm which learns from observational data while being robust to unmeasured confounding. To reduce time complexity, we suggest a greedy algorithm which bounds the near-optimality constraint. The methods are evaluated on synthetic and real-world healthcare data and compared to model-free reinforcement learning. We find that our methods compare favorably to the model-free baseline while offering a more transparent trade-off between search time and treatment efficacy.
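The core idea of searching for an effective treatment in few trials can be illustrated with a toy greedy policy: try treatments in decreasing order of estimated success probability until one works. This is only a minimal sketch under simplifying assumptions; the paper's actual methods use dynamic programming over causal bounds that are robust to unmeasured confounding, which this illustration omits. The function name, the simulated patient response, and the `threshold` stopping rule are all hypothetical.

```python
import random

def greedy_trial_policy(success_probs, threshold, rng=None):
    """Toy greedy treatment search: try treatments in order of estimated
    success probability, stopping at the first success or when remaining
    options fall below `threshold`. Returns (chosen treatment or None,
    list of treatments tried)."""
    rng = rng or random.Random(0)
    # Rank treatments by estimated probability of a good outcome.
    order = sorted(range(len(success_probs)), key=lambda a: -success_probs[a])
    trials = []
    for a in order:
        trials.append(a)
        # Simulated patient response; in practice this is an observed outcome.
        if rng.random() < success_probs[a]:
            return a, trials
        if success_probs[a] < threshold:
            break  # remaining treatments are unlikely to help; stop searching
    return None, trials
```

For example, with estimated success probabilities `[0.2, 0.9, 0.5]` the policy tries treatment 1 first. Note that a purely greedy ordering can be suboptimal under the paper's near-optimality constraint, which is why the authors bound that constraint rather than ignore it.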
Author Information
Samuel Håkansson (Chalmers University of Technology)
Viktor Lindblom (Chalmers University of Technology)
Omer Gottesman (Harvard University)
Fredrik Johansson (Chalmers University of Technology)
More from the Same Authors
- 2022 Poster: Efficient learning of nonlinear prediction models with time-series privileged information
  Bastian Jung · Fredrik Johansson
- 2021 Poster: Learning Markov State Abstractions for Deep Reinforcement Learning
  Cameron Allen · Neev Parikh · Omer Gottesman · George Konidaris
- 2020: Mini-panel discussion 1 - Bridging the gap between theory and practice
  Aviv Tamar · Emma Brunskill · Jost Tobias Springenberg · Omer Gottesman · Daniel Mankowitz
- 2018 Poster: Representation Balancing MDPs for Off-policy Policy Evaluation
  Yao Liu · Omer Gottesman · Aniruddh Raghu · Matthieu Komorowski · Aldo Faisal · Finale Doshi-Velez · Emma Brunskill