Labor economists regularly analyze employment data by fitting predictive models to small, carefully constructed longitudinal survey datasets. Although modern machine learning methods offer promise for such problems, these survey datasets are too small to take advantage of them. In recent years, large datasets of online resumes have also become available, providing data about the career trajectories of millions of individuals. However, the distribution of these large resume datasets differs in meaningful ways from the survey datasets used for economic estimation; standard econometric models cannot take advantage of their scale or make predictions under distribution shift. To this end we develop CAREER, a transformer-based model that uses transfer learning to learn representations of job sequences. CAREER is first fit to large, passively collected resume data and then fine-tuned on samples of the downstream data distribution of interest. We find that CAREER forms accurate predictions of job sequences, achieving state-of-the-art predictive performance on three widely used economics datasets. We also find that CAREER is adept at making predictions under distribution shifts in time.
Author Information
Keyon Vafa (Columbia University)
Emil Palikot (Stanford University)
Tianyu Du (Stanford University)
Ayush Kanodia (Stanford University)
My name is Ayush Kanodia. I’m a PhD student in Computer Science at Stanford University, currently in my third year. I mix Machine Learning and Economics. I am lucky to be advised by Susan Athey.
Susan Athey (Stanford University)
Susan Athey is The Economics of Technology Professor at Stanford Graduate School of Business. She received her bachelor's degree from Duke University and her Ph.D. from Stanford, and she holds an honorary doctorate from Duke University. She previously taught in the economics departments at MIT, Stanford and Harvard. In 2007, Professor Athey received the John Bates Clark Medal, awarded by the American Economic Association to “that American economist under the age of forty who is adjudged to have made the most significant contribution to economic thought and knowledge.” She was elected to the National Academy of Sciences in 2012 and to the American Academy of Arts and Sciences in 2008. Professor Athey’s research focuses on the economics of the internet, online advertising, the news media, marketplace design, and the intersection of machine learning and econometrics. She advises governments and businesses on marketplace design and platform economics.
David Blei (Columbia University)
David Blei is a Professor of Statistics and Computer Science at Columbia University, and a member of the Columbia Data Science Institute. His research is in statistical machine learning, involving probabilistic topic models, Bayesian nonparametric methods, and approximate posterior inference algorithms for massive data. He works on a variety of applications, including text, images, music, social networks, user behavior, and scientific data. David has received several awards for his research, including a Sloan Fellowship (2010), Office of Naval Research Young Investigator Award (2011), Presidential Early Career Award for Scientists and Engineers (2011), Blavatnik Faculty Award (2013), and ACM-Infosys Foundation Award (2013). He is a fellow of the ACM.
Related Events (a corresponding poster, oral, or spotlight)
-
2022 : CAREER: Economic Prediction of Labor Sequence Data Under Distribution Shift »
More from the Same Authors
-
2021 : Modeling Worker Career Trajectories with Neural Sequence Models »
Keyon Vafa -
2021 : Boosting engagement in ed tech with personalized recommendations »
Ayush Kanodia -
2021 : Unveiling Mode-connectivity of the ELBO Landscape »
Edith Zhang · David Blei -
2022 : An Invariant Learning Characterization of Controlled Text Generation »
Claudia Shi · Carolina Zheng · Keyon Vafa · Amir Feder · David Blei -
2022 : A Bayesian Causal Inference Approach for Assessing Fairness in Clinical Decision-Making »
Linying Zhang · Lauren Richter · Yixin Wang · Anna Ostropolets · Noemie Elhadad · David Blei · George Hripcsak -
2022 : Adjusting the Gender Wage Gap with a Low-Dimensional Representation of Job History »
Keyon Vafa · Susan Athey · David Blei -
2021 : David Blei - On the Assumptions of Synthetic Control Methods »
David Blei -
2021 Test Of Time: Online Learning for Latent Dirichlet Allocation »
Matthew Hoffman · Francis Bach · David Blei -
2021 Poster: Posterior Collapse and Latent Variable Non-identifiability »
Yixin Wang · David Blei · John Cunningham -
2020 Workshop: I Can’t Believe It’s Not Better! Bridging the gap between theory and empiricism in probabilistic machine learning »
Jessica Forde · Francisco Ruiz · Melanie Fernandez Pradier · Aaron Schein · Finale Doshi-Velez · Isabel Valera · David Blei · Hanna Wallach -
2020 Poster: Markovian Score Climbing: Variational Inference with KL(p||q) »
Christian Naesseth · Fredrik Lindsten · David Blei -
2019 : Susan Athey »
Susan Athey -
2019 Poster: Discrete Flows: Invertible Generative Models of Discrete Data »
Dustin Tran · Keyon Vafa · Kumar Agrawal · Laurent Dinh · Ben Poole -
2019 Poster: Poisson-Randomized Gamma Dynamical Systems »
Aaron Schein · Scott Linderman · Mingyuan Zhou · David Blei · Hanna Wallach -
2019 Poster: Variational Bayes under Model Misspecification »
Yixin Wang · David Blei -
2019 Poster: Using Embeddings to Correct for Unobserved Confounding in Networks »
Victor Veitch · Yixin Wang · David Blei -
2019 Poster: Adapting Neural Networks for the Estimation of Treatment Effects »
Claudia Shi · David Blei · Victor Veitch -
2018 : Datasets and Benchmarks for Causal Learning »
Csaba Szepesvari · Isabelle Guyon · Nicolai Meinshausen · David Blei · Elias Bareinboim · Bernhard Schölkopf · Pietro Perona -
2018 : The Blessings of Multiple Causes »
David Blei -
2018 Tutorial: Counterfactual Inference »
Susan Athey -
2017 : Panel: On the Foundations and Future of Approximate Inference »
David Blei · Zoubin Ghahramani · Katherine Heller · Tim Salimans · Max Welling · Matthew D. Hoffman -
2017 Workshop: Advances in Approximate Bayesian Inference »
Francisco Ruiz · Stephan Mandt · Cheng Zhang · James McInerney · Dustin Tran · David Blei · Max Welling · Tamara Broderick · Michalis Titsias -
2017 Poster: Hierarchical Implicit Models and Likelihood-Free Variational Inference »
Dustin Tran · Rajesh Ranganath · David Blei -
2017 Poster: Structured Embedding Models for Grouped Data »
Maja Rudolph · Francisco Ruiz · Susan Athey · David Blei -
2017 Poster: Variational Inference via $\chi$ Upper Bound Minimization »
Adji Bousso Dieng · Dustin Tran · Rajesh Ranganath · John Paisley · David Blei -
2017 Poster: Context Selection for Embedding Models »
Liping Liu · Francisco Ruiz · Susan Athey · David Blei -
2016 : Causal Inference for Recommendation Systems »
David Blei -
2016 : Panel Discussion »
Shakir Mohamed · David Blei · Ryan Adams · José Miguel Hernández-Lobato · Ian Goodfellow · Yarin Gal -
2016 : Deep exponential families »
David Blei -
2016 Workshop: Advances in Approximate Bayesian Inference »
Tamara Broderick · Stephan Mandt · James McInerney · Dustin Tran · David Blei · Kevin Murphy · Andrew Gelman · Michael I Jordan -
2016 Poster: Operator Variational Inference »
Rajesh Ranganath · Dustin Tran · Jaan Altosaar · David Blei -
2016 Poster: The Generalized Reparameterization Gradient »
Francisco Ruiz · Michalis Titsias · David Blei -
2016 Poster: Exponential Family Embeddings »
Maja Rudolph · Francisco Ruiz · Stephan Mandt · David Blei -
2016 Tutorial: Variational Inference: Foundations and Modern Methods »
David Blei · Shakir Mohamed · Rajesh Ranganath -
2015 Workshop: Advances in Approximate Bayesian Inference »
Dustin Tran · Tamara Broderick · Stephan Mandt · James McInerney · Shakir Mohamed · Alp Kucukelbir · Matthew D. Hoffman · Neil Lawrence · David Blei -
2015 Poster: The Population Posterior and Bayesian Modeling on Streams »
James McInerney · Rajesh Ranganath · David Blei -
2015 Poster: Automatic Variational Inference in Stan »
Alp Kucukelbir · Rajesh Ranganath · Andrew Gelman · David Blei -
2015 Spotlight: Automatic Variational Inference in Stan »
Alp Kucukelbir · Rajesh Ranganath · Andrew Gelman · David Blei -
2015 Poster: Copula variational inference »
Dustin Tran · David Blei · Edo M Airoldi