Consider the problem of estimating the causal effect of some attribute of a text document; for example: what effect does writing a polite vs. rude email have on response time? To estimate a causal effect from observational data, we need to control for confounding by adjusting for a set of covariates X that includes all common causes of the treatment and outcome. For this adjustment to work, the data must satisfy overlap: the probability of treatment should be bounded away from 0 and 1 for all levels of X. In the text setting, we can attempt to satisfy the requirement of adjusting for all common causes by adjusting for all of the text. However, when the treatment is itself an attribute of the text, this violates overlap. The main goal of this paper is to develop an alternative approach that allows us to adjust for a “part” of the text that is large enough to control for confounding but small enough to avoid overlap violations. We propose a procedure that identifies and discards the part of the text that is predictive of the treatment only. This information is not necessary to control for confounding (it does not affect the outcome) and so can be safely removed. On the other hand, if the removed information was necessary for perfect treatment prediction, then overlap is recovered. We adapt deep models and propose a learning strategy that learns multiple representations with different prediction properties. The procedure explicitly divides a (BERT) embedding of the text into one piece relevant to the outcome and one piece relevant to the treatment only. A regularization term is included to enforce this structure. Early empirical results show that our method effectively detects an appropriate confounding variable and mitigates the overlap issue.
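The structure the abstract describes can be illustrated with a small numeric sketch. The setup below is hypothetical: it stands in frozen random vectors for the BERT embeddings, splits each embedding into an outcome-relevant half `z_y` and a treatment-only half `z_t`, and forms a toy objective in which a penalty term discourages treatment information from leaking into `z_y`. The variable names, the linear heads, and the particular leakage penalty are all illustrative assumptions, not the paper's actual objective.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: Z plays the role of frozen text embeddings,
# t is a binary treatment label, y a continuous outcome.
n, d = 32, 8
Z = rng.normal(size=(n, d))
t = rng.integers(0, 2, size=n)
y = rng.normal(size=n)

# Split each embedding into an outcome-relevant piece and a
# treatment-only piece, as the abstract describes.
z_y, z_t = Z[:, : d // 2], Z[:, d // 2:]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce(p, labels):
    # Binary cross-entropy, averaged over the sample.
    return -np.mean(labels * np.log(p) + (1 - labels) * np.log(1 - p))

# Toy linear heads with random weights (a real model would train these).
w_y = rng.normal(size=d // 2)    # outcome head, reads z_y
w_t = rng.normal(size=d // 2)    # treatment head, reads z_t
w_probe = rng.normal(size=d // 2)  # probe checking treatment leakage in z_y

outcome_loss = np.mean((z_y @ w_y - y) ** 2)
treatment_loss = bce(sigmoid(z_t @ w_t), t)
# Illustrative regularizer: reward z_y for being UNinformative about t,
# i.e. subtract the probe's treatment-prediction loss.
leak_loss = bce(sigmoid(z_y @ w_probe), t)
lam = 1.0
total_loss = outcome_loss + treatment_loss - lam * leak_loss
```

In a trained version, minimizing a loss of this shape would push treatment-predictive information out of `z_y` and into `z_t`; adjustment would then use only `z_y`, restoring overlap while still blocking confounding.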
Author Information
Lin Gui (The University of Chicago)
Victor Veitch (University of Chicago, Google)
More from the Same Authors

2021 Spotlight: Counterfactual Invariance to Spurious Correlations in Text Classification »
Victor Veitch · Alexander D'Amour · Steve Yadlowsky · Jacob Eisenstein 
2021 : Using Embeddings to Estimate Peer Influence on Social Networks »
Irina Cristali · Victor Veitch 
2021 Poster: Counterfactual Invariance to Spurious Correlations in Text Classification »
Victor Veitch · Alexander D'Amour · Steve Yadlowsky · Jacob Eisenstein 
2020 Poster: Sense and Sensitivity Analysis: Simple Post-Hoc Analysis of Bias Due to Unobserved Confounding »
Victor Veitch · Anisha Zaveri 
2020 Spotlight: Sense and Sensitivity Analysis: Simple Post-Hoc Analysis of Bias Due to Unobserved Confounding »
Victor Veitch · Anisha Zaveri 
2019 : Coffee break, posters, and 1-on-1 discussions »
Julius von Kügelgen · David Rohde · Candice Schumann · Grace Charles · Victor Veitch · Vira Semenova · Mert Demirer · Vasilis Syrgkanis · Suraj Nair · Aahlad Puli · Masatoshi Uehara · Aditya Gopalan · Yi Ding · Ignavier Ng · Khashayar Khosravi · Eli Sherman · Shuxi Zeng · Aleksander Wieczorek · Hao Liu · Kyra Gan · Jason Hartford · Miruna Oprescu · Alexander D'Amour · Jörn Boehnke · Yuta Saito · Théophile Griveau-Billion · Chirag Modi · Shyngys Karimov · Jeroen Berrevoets · Logan Graham · Imke Mayer · Dhanya Sridhar · Issa Dahabreh · Alan Mishler · Duncan Wadsworth · Khizar Qureshi · Rahul Ladhania · Gota Morishita · Paul Welle 
2019 Poster: Using Embeddings to Correct for Unobserved Confounding in Networks »
Victor Veitch · Yixin Wang · David Blei 
2019 Poster: Adapting Neural Networks for the Estimation of Treatment Effects »
Claudia Shi · David Blei · Victor Veitch 
2015 : The general class of (sparse) random graphs arising from exchangeable point processes »
Victor Veitch