Workshop: Causal Inference Challenges in Sequential Decision Making: Bridging Theory and Practice
Off-Policy Evaluation with Embedded Spaces
Jaron Jia Rong Lee · David Arbour · Georgios Theocharous
Slate recommendation systems are commonly evaluated prior to deployment using off-policy evaluation methods, whereby data collected under the old logging policy is used to predict the performance of a new target policy. However, in practice most recommendation systems are never observed recommending the vast majority of items. This is a problem because existing methods require the logging policy to assign non-zero probability to every item that the target policy recommends with non-zero probability. To circumvent this issue, we explore the use of item embeddings. By representing queries and slates in an embedding space, we are able to share information and extrapolate behavior to queries and items that have not yet been seen.
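
The support requirement on the logging and target policies can be made concrete with a small sketch. The Python below uses inverse propensity scoring (IPS), a standard off-policy estimator chosen here purely for illustration; it is not taken from the paper. The function names, the single-query setup, and the toy probabilities are all assumptions.

```python
import numpy as np

def has_common_support(target_probs, logging_probs):
    """Return True if the target policy puts mass only on items
    that the logging policy can also recommend."""
    target_probs = np.asarray(target_probs, dtype=float)
    logging_probs = np.asarray(logging_probs, dtype=float)
    return bool(np.all((target_probs == 0) | (logging_probs > 0)))

def ips_estimate(rewards, logged_items, target_probs, logging_probs):
    """Inverse propensity scoring estimate of the target policy's value
    for a single query (hypothetical simplified setup)."""
    rewards = np.asarray(rewards, dtype=float)
    logged_items = np.asarray(logged_items, dtype=int)
    target_probs = np.asarray(target_probs, dtype=float)
    logging_probs = np.asarray(logging_probs, dtype=float)
    # Reweight each logged reward by how much more (or less) likely
    # the target policy is to recommend the logged item.
    weights = target_probs[logged_items] / logging_probs[logged_items]
    return float(np.mean(weights * rewards))

# Toy example: three items; the logging policy never recommends item 2.
logging_probs = [0.5, 0.5, 0.0]
logged_items = [0, 1, 0, 1]
rewards = [1.0, 0.0, 1.0, 0.0]

# This target policy stays within the logging policy's support.
supported_target = [0.75, 0.25, 0.0]
# This one moves half its mass to the never-logged item 2.
unsupported_target = [0.25, 0.25, 0.5]

print(has_common_support(supported_target, logging_probs))
print(has_common_support(unsupported_target, logging_probs))
print(ips_estimate(rewards, logged_items, supported_target, logging_probs))
```

For the unsupported target, no logged sample ever involves item 2, so IPS cannot account for the half of the target policy's mass placed there. This is the gap that sharing information through an embedding space is intended to address.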