The goal of offline policy evaluation (OPE) is to evaluate target policies using data logged under a different distribution. Because no single method is uniformly best, model selection is important but difficult without online exploration. We propose soft stability weighting (SSW) for adaptively combining offline estimates from ensembles of fitted-Q-evaluation (FQE) and model-based evaluation methods generated by different random initializations of neural networks. Soft stability weighting computes a state-action-conditional weighted average of the median FQE and model-based predictions by normalizing the state-action-conditional standard deviation of each method's ensemble relative to that method's average standard deviation. It therefore compares the relative stability of each ensemble's predictions under perturbations from random initializations, which are drawn from a truncated normal distribution scaled by the input feature size.
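The weighting described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact weighting function, so the softmax over negative normalized standard deviations used here is an assumption, as are the function name `soft_stability_weighting`, the `temperature` parameter, and the array layout (one row per ensemble member, one column per state-action pair). It is not the authors' implementation.

```python
import numpy as np

def soft_stability_weighting(fqe_preds, mb_preds, temperature=1.0):
    """Combine two ensembles of value predictions (hypothetical sketch).

    fqe_preds, mb_preds: arrays of shape (n_members, n_points), one row per
    random initialization and one column per state-action pair.
    Returns the combined estimate and the weight placed on FQE, per point.
    """
    fqe_preds = np.asarray(fqe_preds, dtype=float)
    mb_preds = np.asarray(mb_preds, dtype=float)

    # Median prediction of each method at every state-action pair.
    fqe_med = np.median(fqe_preds, axis=0)
    mb_med = np.median(mb_preds, axis=0)

    # Per-point ensemble spread, normalized by the method's average spread.
    eps = 1e-12  # guards against division by zero
    fqe_std = fqe_preds.std(axis=0)
    mb_std = mb_preds.std(axis=0)
    fqe_rel = fqe_std / (fqe_std.mean() + eps)
    mb_rel = mb_std / (mb_std.mean() + eps)

    # ASSUMPTION: soft weights via a softmax over negative relative spread,
    # so the locally more stable method receives the larger weight.
    logits = np.stack([-fqe_rel, -mb_rel]) / temperature
    logits -= logits.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=0, keepdims=True)

    combined = weights[0] * fqe_med + weights[1] * mb_med
    return combined, weights[0]

# Toy usage: FQE is stable at the first point, the model at the second.
fqe = [[1.0, 0.0], [1.0, 4.0]]
mb = [[3.0, 5.0], [7.0, 5.0]]
estimate, w_fqe = soft_stability_weighting(fqe, mb)
```

Normalizing each method's spread by its own average puts the two ensembles on a comparable scale, so a method is favored where it is unusually stable relative to its typical behavior rather than where its raw variance happens to be smaller.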
Author Information
Briton Park
Xian Wu (University of California, Berkeley)
Bin Yu (UC Berkeley)
Bin Yu is Chancellor’s Professor in the Departments of Statistics and of Electrical Engineering & Computer Sciences at the University of California at Berkeley and a former chair of Statistics at UC Berkeley. Her research focuses on practice, algorithm, and theory of statistical machine learning and causal inference. Her group is engaged in interdisciplinary research with scientists from genomics, neuroscience, and precision medicine. In order to augment empirical evidence for decision-making, they are investigating methods/algorithms (and associated statistical inference problems) such as dictionary learning, non-negative matrix factorization (NMF), EM and deep learning (CNNs and LSTMs), and heterogeneous effect estimation in randomized experiments (X-learner). Their recent algorithms include staNMF for unsupervised learning, iterative Random Forests (iRF) and signed iRF (s-iRF) for discovering predictive and stable high-order interactions in supervised learning, and contextual decomposition (CD) and aggregated contextual decomposition (ACD) for phrase or patch importance extraction from an LSTM or a CNN. She is a member of the U.S. National Academy of Sciences and a Fellow of the American Academy of Arts and Sciences. She was a Guggenheim Fellow in 2006, and the Tukey Memorial Lecturer of the Bernoulli Society in 2012. She was President of IMS (Institute of Mathematical Statistics) in 2013-2014 and the Rietz Lecturer of IMS in 2016. She received the E. L. Scott Award from COPSS (Committee of Presidents of Statistical Societies) in 2018. Moreover, Yu was a founding co-director of the Microsoft Research Asia (MSR) Lab at Peking University and is a member of the scientific advisory board at the UK Alan Turing Institute for data science and AI.
Angela Zhou (University of Southern California)
More from the Same Authors
-
2021 : It's COMPASlicated: The Messy Relationship between RAI Datasets and Algorithmic Fairness Benchmarks »
Michelle Bao · Angela Zhou · Samantha Zottola · Brian Brubach · Sarah Desmarais · Aaron Horowitz · Kristian Lum · Suresh Venkatasubramanian -
2021 : Importance of Representation Learning for Off-Policy Fitted Q-Evaluation »
Xian Wu · Nevena Lazic · Dong Yin · Cosmin Paduraru -
2021 : Stateful Offline Contextual Policy Evaluation and Learning »
Angela Zhou -
2022 : Gradient dynamics of single-neuron autoencoders on orthogonal data »
Nikhil Ghosh · Spencer Frei · Wooseok Ha · Bin Yu -
2022 Panel: Panel 6A-4: Empirical Gateaux Derivatives… & Practical Adversarial Multivalid… »
Georgy Noarov · Angela Zhou -
2022 : Stable Discovery of Interpretable Subgroups via Calibration in Causal Studies »
Bin Yu -
2022 Poster: Off-Policy Evaluation with Policy-Dependent Optimization Response »
Wenshuo Guo · Michael Jordan · Angela Zhou -
2022 Poster: Empirical Gateaux Derivatives for Causal Inference »
Michael Jordan · Yixin Wang · Angela Zhou -
2021 : Data Opportunities: unsolved medical problems and where new data can help »
Bin Yu · Regina Barzilay · Marzyeh Ghassemi · Emma Pierson -
2021 Workshop: Machine Learning Meets Econometrics (MLECON) »
David Bruns-Smith · Arthur Gretton · Limor Gultchin · Niki Kilbertus · Krikamol Muandet · Evan Munro · Angela Zhou -
2021 Poster: Adaptive wavelet distillation from neural networks through interpretations »
Wooseok Ha · Chandan Singh · Francois Lanusse · Srigokul Upadhyayula · Bin Yu -
2020 Workshop: Consequential Decisions in Dynamic Environments »
Niki Kilbertus · Angela Zhou · Ashia Wilson · John Miller · Lily Hu · Lydia T. Liu · Nathan Kallus · Shira Mitchell -
2020 : Spotlight Talk 4: Fairness, Welfare, and Equity in Personalized Pricing »
Nathan Kallus · Angela Zhou -
2020 Poster: Confounding-Robust Policy Evaluation in Infinite-Horizon Reinforcement Learning »
Nathan Kallus · Angela Zhou -
2019 : Coffee Break and Poster Session »
Rameswar Panda · Prasanna Sattigeri · Kush Varshney · Karthikeyan Natesan Ramamurthy · Harvineet Singh · Vishwali Mhasawade · Shalmali Joshi · Laleh Seyyed-Kalantari · Matthew McDermott · Gal Yona · James Atwood · Hansa Srinivasan · Yonatan Halpern · D. Sculley · Behrouz Babaki · Margarida Carvalho · Josie Williams · Narges Razavian · Haoran Zhang · Amy Lu · Irene Y Chen · Xiaojie Mao · Angela Zhou · Nathan Kallus -
2019 : Opening Remarks »
Thorsten Joachims · Nathan Kallus · Michele Santacatterina · Adith Swaminathan · David Sontag · Angela Zhou -
2019 Workshop: “Do the right thing”: machine learning and causal inference for improved decision making »
Michele Santacatterina · Thorsten Joachims · Nathan Kallus · Adith Swaminathan · David Sontag · Angela Zhou -
2019 Poster: The Fairness of Risk Scores Beyond Classification: Bipartite Ranking and the XAUC Metric »
Nathan Kallus · Angela Zhou -
2019 Poster: Assessing Disparate Impact of Personalized Interventions: Identifiability and Bounds »
Nathan Kallus · Angela Zhou -
2019 Poster: A Debiased MDI Feature Importance Measure for Random Forests »
Xiao Li · Yu Wang · Sumanta Basu · Karl Kumbier · Bin Yu -
2019 Invited Talk: Veridical Data Science »
Bin Yu -
2018 Poster: Confounding-Robust Policy Improvement »
Nathan Kallus · Angela Zhou -
2017 : Deep nets meet real neurons: pattern selectivity of V4 through transfer learning and stability analysis »
Bin Yu -
2017 : Invited Talk »
Bin Yu