To estimate adjusted gender wage gaps, economists build models that predict wage from observable data. Although an individual's complete job history may be predictive of wage, economists typically summarize experience with hand-constructed summary statistics about the past. In this work, we estimate the adjusted gender wage gap for an individual's entire job history by learning a low-dimensional representation of their career. We develop a transformer-based representation model that is pretrained on massive, passively collected resume data and then fine-tuned to predict wages on the small, nationally representative survey data that economists use for wage gap estimation. This dimension-reduction approach produces unbiased estimates of the adjusted wage gap as long as each representation corresponds to the same full job history for males and females. We discuss how this condition relates to the sufficiency fairness criterion; although the adjusted wage gap is not a causal quantity, we take inspiration from the high-dimensional confounding literature to assess and mitigate sufficiency. We validate our approach with experiments on semi-synthetic and real-world data. Our method makes more accurate wage predictions than economic baselines. When applied to wage survey data in the United States, our method finds that a substantial portion of the gender wage gap can be attributed to differences in job history, although this proportion varies by year and across sub-populations.
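As a rough illustration of what an adjusted wage gap measures, the sketch below compares a raw gap with a gap adjusted for a single hand-constructed summary of job history. It is not the paper's method: ordinary least squares stands in for the transformer-based representation model, the data are simulated, and every name and number (`adjusted_gap`, `experience`, the coefficients) is a hypothetical choice made for this example.

```python
import numpy as np

def adjusted_gap(log_wage, is_male, features):
    """Fit one linear wage model per group, then average the difference
    between the two models' predictions over the whole sample."""
    X = np.column_stack([np.ones(len(log_wage)), features])
    beta_m, *_ = np.linalg.lstsq(X[is_male], log_wage[is_male], rcond=None)
    beta_f, *_ = np.linalg.lstsq(X[~is_male], log_wage[~is_male], rcond=None)
    return float(np.mean(X @ beta_m - X @ beta_f))

rng = np.random.default_rng(0)
n = 20000
is_male = rng.random(n) < 0.5

# Hypothetical summary of job history: years of experience,
# simulated to be two years higher on average for men.
experience = rng.normal(10.0 + 2.0 * is_male, 3.0)

# Simulated log wage: 0.03 per year of experience, plus a 0.05 gap
# that experience does not explain (both values are assumptions).
log_wage = 2.0 + 0.03 * experience + 0.05 * is_male + rng.normal(0.0, 0.1, n)

raw_gap = float(log_wage[is_male].mean() - log_wage[~is_male].mean())
adj_gap = adjusted_gap(log_wage, is_male, experience)

print(f"raw gap:      {raw_gap:.3f}")
print(f"adjusted gap: {adj_gap:.3f}")
```

In this simulation the raw gap mixes the unexplained component with the part due to differing experience, while the adjusted gap isolates the unexplained component. The abstract's approach replaces the single `experience` column with a learned low-dimensional representation of the full job history.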
Author Information
Keyon Vafa (Columbia University)
Susan Athey (Stanford University)
Susan Athey is The Economics of Technology Professor at Stanford Graduate School of Business. She received her bachelor's degree from Duke University and her Ph.D. from Stanford, and she holds an honorary doctorate from Duke University. She previously taught in the economics departments at MIT, Stanford, and Harvard. In 2007, Professor Athey received the John Bates Clark Medal, awarded by the American Economic Association to “that American economist under the age of forty who is adjudged to have made the most significant contribution to economic thought and knowledge.” She was elected to the National Academy of Sciences in 2012 and to the American Academy of Arts and Sciences in 2008. Professor Athey’s research focuses on the economics of the internet, online advertising, the news media, marketplace design, and the intersection of machine learning and econometrics. She advises governments and businesses on marketplace design and platform economics.
David Blei (Columbia University)
David Blei is a Professor of Statistics and Computer Science at Columbia University, and a member of the Columbia Data Science Institute. His research is in statistical machine learning, involving probabilistic topic models, Bayesian nonparametric methods, and approximate posterior inference algorithms for massive data. He works on a variety of applications, including text, images, music, social networks, user behavior, and scientific data. David has received several awards for his research, including a Sloan Fellowship (2010), Office of Naval Research Young Investigator Award (2011), Presidential Early Career Award for Scientists and Engineers (2011), Blavatnik Faculty Award (2013), and ACM-Infosys Foundation Award (2013). He is a fellow of the ACM.
More from the Same Authors
-
2021 : Modeling Worker Career Trajectories with Neural Sequence Models »
Keyon Vafa -
2021 : Unveiling Mode-connectivity of the ELBO Landscape »
Edith Zhang · David Blei -
2022 : An Invariant Learning Characterization of Controlled Text Generation »
Claudia Shi · Carolina Zheng · Keyon Vafa · Amir Feder · David Blei -
2022 : A Bayesian Causal Inference Approach for Assessing Fairness in Clinical Decision-Making »
Linying Zhang · Lauren Richter · Yixin Wang · Anna Ostropolets · Noemie Elhadad · David Blei · George Hripcsak -
2022 : CAREER: Economic Prediction of Labor Sequence Data Under Distribution Shift »
Keyon Vafa · Emil Palikot · Tianyu Du · Ayush Kanodia · Susan Athey · David Blei -
2021 : David Blei - On the Assumptions of Synthetic Control Methods »
David Blei -
2021 Test Of Time: Online Learning for Latent Dirichlet Allocation »
Matthew Hoffman · Francis Bach · David Blei -
2021 Poster: Posterior Collapse and Latent Variable Non-identifiability »
Yixin Wang · David Blei · John Cunningham -
2020 Workshop: I Can’t Believe It’s Not Better! Bridging the gap between theory and empiricism in probabilistic machine learning »
Jessica Forde · Francisco Ruiz · Melanie Fernandez Pradier · Aaron Schein · Finale Doshi-Velez · Isabel Valera · David Blei · Hanna Wallach -
2020 Poster: Markovian Score Climbing: Variational Inference with KL(p||q) »
Christian Naesseth · Fredrik Lindsten · David Blei -
2019 : Susan Athey »
Susan Athey -
2019 Poster: Discrete Flows: Invertible Generative Models of Discrete Data »
Dustin Tran · Keyon Vafa · Kumar Agrawal · Laurent Dinh · Ben Poole -
2019 Poster: Poisson-Randomized Gamma Dynamical Systems »
Aaron Schein · Scott Linderman · Mingyuan Zhou · David Blei · Hanna Wallach -
2019 Poster: Variational Bayes under Model Misspecification »
Yixin Wang · David Blei -
2019 Poster: Using Embeddings to Correct for Unobserved Confounding in Networks »
Victor Veitch · Yixin Wang · David Blei -
2019 Poster: Adapting Neural Networks for the Estimation of Treatment Effects »
Claudia Shi · David Blei · Victor Veitch -
2018 : Datasets and Benchmarks for Causal Learning »
Csaba Szepesvari · Isabelle Guyon · Nicolai Meinshausen · David Blei · Elias Bareinboim · Bernhard Schölkopf · Pietro Perona -
2018 : The Blessings of Multiple Causes »
David Blei -
2018 Tutorial: Counterfactual Inference »
Susan Athey -
2017 : Panel: On the Foundations and Future of Approximate Inference »
David Blei · Zoubin Ghahramani · Katherine Heller · Tim Salimans · Max Welling · Matthew D. Hoffman -
2017 Workshop: Advances in Approximate Bayesian Inference »
Francisco Ruiz · Stephan Mandt · Cheng Zhang · James McInerney · Dustin Tran · David Blei · Max Welling · Tamara Broderick · Michalis Titsias -
2017 Poster: Hierarchical Implicit Models and Likelihood-Free Variational Inference »
Dustin Tran · Rajesh Ranganath · David Blei -
2017 Poster: Structured Embedding Models for Grouped Data »
Maja Rudolph · Francisco Ruiz · Susan Athey · David Blei -
2017 Poster: Variational Inference via $\chi$ Upper Bound Minimization »
Adji Bousso Dieng · Dustin Tran · Rajesh Ranganath · John Paisley · David Blei -
2017 Poster: Context Selection for Embedding Models »
Liping Liu · Francisco Ruiz · Susan Athey · David Blei -
2016 : Causal Inference for Recommendation Systems »
David Blei -
2016 : Panel Discussion »
Shakir Mohamed · David Blei · Ryan Adams · José Miguel Hernández-Lobato · Ian Goodfellow · Yarin Gal -
2016 : Deep exponential families »
David Blei -
2016 Workshop: Advances in Approximate Bayesian Inference »
Tamara Broderick · Stephan Mandt · James McInerney · Dustin Tran · David Blei · Kevin Murphy · Andrew Gelman · Michael I Jordan -
2016 Poster: Operator Variational Inference »
Rajesh Ranganath · Dustin Tran · Jaan Altosaar · David Blei -
2016 Poster: The Generalized Reparameterization Gradient »
Francisco Ruiz · Michalis Titsias · David Blei -
2016 Poster: Exponential Family Embeddings »
Maja Rudolph · Francisco Ruiz · Stephan Mandt · David Blei -
2016 Tutorial: Variational Inference: Foundations and Modern Methods »
David Blei · Shakir Mohamed · Rajesh Ranganath -
2015 Workshop: Advances in Approximate Bayesian Inference »
Dustin Tran · Tamara Broderick · Stephan Mandt · James McInerney · Shakir Mohamed · Alp Kucukelbir · Matthew D. Hoffman · Neil Lawrence · David Blei -
2015 Poster: The Population Posterior and Bayesian Modeling on Streams »
James McInerney · Rajesh Ranganath · David Blei -
2015 Poster: Automatic Variational Inference in Stan »
Alp Kucukelbir · Rajesh Ranganath · Andrew Gelman · David Blei -
2015 Spotlight: Automatic Variational Inference in Stan »
Alp Kucukelbir · Rajesh Ranganath · Andrew Gelman · David Blei -
2015 Poster: Copula variational inference »
Dustin Tran · David Blei · Edo M Airoldi