
Simultaneous Missing Value Imputation and Structure Learning with Groups
Pablo Morales-Alvarez · Wenbo Gong · Angus Lamb · Simon Woodhead · Simon Peyton Jones · Nick Pawlowski · Miltiadis Allamanis · Cheng Zhang

Thu Dec 01 02:00 PM -- 04:00 PM (PST) @ Hall J #339

Learning structures between groups of variables from data with missing values is an important real-world task, yet difficult to solve. One typical scenario is discovering the structure among topics in the education domain to identify learning pathways. Here, the observations are student performances on questions under each topic, and these observations contain missing values. However, most existing methods focus on learning structures between a few individual variables from complete data. In this work, we propose VISL, a novel scalable structure learning approach that can simultaneously infer structures between groups of variables under missing data and perform missing value imputation with deep learning. In particular, we propose a generative model with a structured latent space and a graph neural network-based architecture, scaling to a large number of variables. Empirically, we conduct extensive experiments on synthetic, semi-synthetic, and real-world education data sets. We show improved performance in both imputation and structure learning accuracy compared to popular and recent approaches.

Author Information

Pablo Morales-Alvarez (University of Granada)
Wenbo Gong (Microsoft)
Angus Lamb (Microsoft Research)
Simon Woodhead (Eedi)

Head of Research and co-founder of Eedi, and host of the Data Science in Education meetup in London. He leads machine learning research at Eedi, and turns this into new product features. With experience leading both product development and research, he has created award-winning edtech solutions with strong data science foundations.

Simon Peyton Jones (Microsoft Research, Cambridge)
Nick Pawlowski (Microsoft Research)
Miltiadis Allamanis (Microsoft Research)
Cheng Zhang (Microsoft Research, Cambridge, UK)

Cheng Zhang is a principal researcher at Microsoft Research Cambridge, UK. She leads the Data Efficient Decision Making (Project Azua) team at Microsoft. Before joining Microsoft, she was with the statistical machine learning group of Disney Research Pittsburgh, located at Carnegie Mellon University. She received her Ph.D. from the KTH Royal Institute of Technology. She is interested in advancing machine learning methods, including variational inference, deep generative models, and sequential decision-making under uncertainty, and in adapting machine learning to socially impactful applications such as education and healthcare. She co-organized the Symposium on Advances in Approximate Bayesian Inference from 2017 to 2019.

More from the Same Authors