We consider off-policy evaluation and optimization with continuous action spaces. We focus on observational data, where the data collection policy is unknown and must be estimated from the data. We take a semi-parametric approach in which the value function takes a known parametric form in the treatment, but we remain agnostic about how it depends on the observed contexts. We propose a doubly robust off-policy estimate for this setting and show that off-policy optimization based on this estimate is robust to estimation errors in either the policy function or the regression model. We also show that the variance of our off-policy estimate achieves the semi-parametric efficiency bound. Our results continue to apply when the model does not satisfy our semi-parametric form, provided that regret is measured against the best projection of the true value function onto this function space. Our work extends prior approaches to policy optimization from observational data, which considered only discrete actions. We provide an experimental evaluation of our method on a synthetic data example motivated by optimal personalized pricing.
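The abstract does not spell out the estimator, so the following is only a minimal sketch of one way a doubly robust estimate could look under the stated semi-parametric form. It assumes the value function is linear in a known feature map of the treatment, V(a, z) = <theta(z), phi(a)>, and that first-stage estimates of theta(z) and of the conditional second moment E[phi(a) phi(a)^T | z] are already available. The feature map `phi`, the function names, and the array shapes are all hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np


def phi(a):
    """Hypothetical feature map of a scalar treatment: (a, a^2).

    With a as a price, <theta, phi(a)> can represent a revenue curve
    that is quadratic in the price (an illustrative assumption).
    """
    a = np.asarray(a, dtype=float)
    return np.stack([a, a ** 2], axis=-1)


def dr_theta(y, a, theta_hat, sigma_hat):
    """Doubly robust pseudo-estimates of theta(z_i), one row per sample.

    y:         (n,) observed outcomes
    a:         (n,) observed treatments
    theta_hat: (n, d) first-stage regression estimates of theta(z_i)
    sigma_hat: (n, d, d) estimates of E[phi(a) phi(a)^T | z_i], which
               play the role of the estimated data collection policy
    """
    f = phi(a)                                        # (n, d)
    resid = y - np.einsum("nd,nd->n", theta_hat, f)   # regression residuals
    weighted = (f * resid[:, None])[..., None]        # (n, d, 1)
    correction = np.linalg.solve(sigma_hat, weighted)[..., 0]
    return theta_hat + correction


def dr_policy_value(theta_dr, policy_actions):
    """Average estimated value of a candidate policy's actions."""
    values = np.einsum("nd,nd->n", theta_dr, phi(policy_actions))
    return float(np.mean(values))
```

In this sketch the correction term vanishes on average when the regression estimate is accurate, and it repairs a poor regression estimate when the second-moment estimate is accurate, which is the kind of robustness to either nuisance error that the abstract describes. A candidate policy can then be scored by passing its chosen actions for each context to `dr_policy_value`.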
Author Information
Victor Chernozhukov (MIT)
Mert Demirer (MIT)
Greg Lewis (Microsoft Research)
Vasilis Syrgkanis (Microsoft Research)
More from the Same Authors
-
2021 : Double/Debiased Machine Learning for Dynamic Treatment Effects via $g$-Estimation »
Greg Lewis · Vasilis Syrgkanis -
2021 : Estimating the Long-Term Effects of Novel Treatments »
Keith Battocchi · Maggie Hei · Greg Lewis · Miruna Oprescu · Vasilis Syrgkanis -
2023 Poster: Future-Dependent Value-Based Off-Policy Evaluation in POMDPs »
Masatoshi Uehara · Haruka Kiyohara · Andrew Bennett · Victor Chernozhukov · Nan Jiang · Nathan Kallus · Chengchun Shi · Wen Sun -
2021 : Victor Chernozhukov - Omitted Confounder Bias Bounds for Machine Learned Causal Models »
Victor Chernozhukov -
2021 Poster: Double/Debiased Machine Learning for Dynamic Treatment Effects »
Greg Lewis · Vasilis Syrgkanis -
2021 Poster: Asymptotics of the Bootstrap via Stability with Applications to Inference with Model Selection »
Morgane Austern · Vasilis Syrgkanis -
2021 Poster: Estimating the Long-Term Effects of Novel Treatments »
Keith Battocchi · Eleanor Dillon · Maggie Hei · Greg Lewis · Miruna Oprescu · Vasilis Syrgkanis -
2020 Poster: Minimax Estimation of Conditional Moment Models »
Nishanth Dikkala · Greg Lewis · Lester Mackey · Vasilis Syrgkanis -
2019 : Coffee break, posters, and 1-on-1 discussions »
Julius von Kügelgen · David Rohde · Candice Schumann · Grace Charles · Victor Veitch · Vira Semenova · Mert Demirer · Vasilis Syrgkanis · Suraj Nair · Aahlad Puli · Masatoshi Uehara · Aditya Gopalan · Yi Ding · Ignavier Ng · Khashayar Khosravi · Eli Sherman · Shuxi Zeng · Aleksander Wieczorek · Hao Liu · Kyra Gan · Jason Hartford · Miruna Oprescu · Alexander D'Amour · Jörn Boehnke · Yuta Saito · Théophile Griveau-Billion · Chirag Modi · Shyngys Karimov · Jeroen Berrevoets · Logan Graham · Imke Mayer · Dhanya Sridhar · Issa Dahabreh · Alan Mishler · Duncan Wadsworth · Khizar Qureshi · Rahul Ladhania · Gota Morishita · Paul Welle -
2019 Poster: Low-Rank Bandit Methods for High-Dimensional Dynamic Pricing »
Jonas Mueller · Vasilis Syrgkanis · Matt Taddy -
2019 Poster: Machine Learning Estimation of Heterogeneous Treatment Effects with Instruments »
Vasilis Syrgkanis · Victor Lei · Miruna Oprescu · Maggie Hei · Keith Battocchi · Greg Lewis -
2019 Spotlight: Machine Learning Estimation of Heterogeneous Treatment Effects with Instruments »
Vasilis Syrgkanis · Victor Lei · Miruna Oprescu · Maggie Hei · Keith Battocchi · Greg Lewis -
2018 Workshop: Smooth Games Optimization and Machine Learning »
Simon Lacoste-Julien · Ioannis Mitliagkas · Gauthier Gidel · Vasilis Syrgkanis · Eva Tardos · Leon Bottou · Sebastian Nowozin -
2017 Workshop: Learning in the Presence of Strategic Behavior »
Nika Haghtalab · Yishay Mansour · Tim Roughgarden · Vasilis Syrgkanis · Jennifer Wortman Vaughan -
2017 Poster: Welfare Guarantees from Data »
Darrell Hoy · Denis Nekipelov · Vasilis Syrgkanis -
2017 Poster: Robust Optimization for Non-Convex Objectives »
Robert S Chen · Brendan Lucier · Yaron Singer · Vasilis Syrgkanis -
2017 Poster: A Sample Complexity Measure with Applications to Learning Optimal Auctions »
Vasilis Syrgkanis -
2017 Oral: Robust Optimization for Non-Convex Objectives »
Robert S Chen · Brendan Lucier · Yaron Singer · Vasilis Syrgkanis -
2016 Poster: Improved Regret Bounds for Oracle-Based Adversarial Contextual Bandits »
Vasilis Syrgkanis · Haipeng Luo · Akshay Krishnamurthy · Robert Schapire -
2015 Poster: No-Regret Learning in Bayesian Games »
Jason Hartline · Vasilis Syrgkanis · Eva Tardos -
2015 Poster: Fast Convergence of Regularized Learning in Games »
Vasilis Syrgkanis · Alekh Agarwal · Haipeng Luo · Robert Schapire -
2015 Oral: Fast Convergence of Regularized Learning in Games »
Vasilis Syrgkanis · Alekh Agarwal · Haipeng Luo · Robert Schapire