
VAEM: a Deep Generative Model for Heterogeneous Mixed Type Data
Chao Ma · Sebastian Tschiatschek · Richard Turner · José Miguel Hernández-Lobato · Cheng Zhang

Thu Dec 10 09:00 AM -- 11:00 AM (PST) @ Poster Session 5 #1365

Deep generative models often perform poorly in real-world applications due to the heterogeneity of natural data sets. Heterogeneity arises from data containing different types of features (categorical, ordinal, continuous, etc.) and from features of the same type having different marginal distributions. We propose an extension of variational autoencoders (VAEs) called VAEM to handle such heterogeneous data. VAEM is a deep generative model that is trained in a two-stage manner, such that the first stage provides a more uniform representation of the data to the second stage, thereby sidestepping the problems caused by heterogeneous data. We provide extensions of VAEM to handle partially observed data, and demonstrate its performance in data generation, missing data prediction, and sequential feature selection tasks. Our results show that VAEM broadens the range of real-world applications where deep generative models can be successfully deployed.
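The two-stage idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation and contains no VAEs: as a stand-in for the first stage, each feature is mapped separately to roughly standard-normal codes through its empirical ranks, and as a stand-in for the second stage, a single correlation between those codes plays the role of the dependency model. The function names (`gaussianize`, `pearson`) and the synthetic `income` and `grade` features are hypothetical and chosen only for illustration.

```python
import math
import random
from statistics import NormalDist


def gaussianize(column):
    """Stage-1 stand-in: map one feature to approximately standard-normal
    codes using its empirical ranks (tied values share their average rank)."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to the end of the run of values tied with column[order[i]].
        while j + 1 < n and column[order[j + 1]] == column[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # r / (n + 1) lies strictly inside (0, 1), so inv_cdf is always defined.
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]


def pearson(a, b):
    """Stage-2 stand-in: linear dependency between two code columns."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical heterogeneous data: a heavy-tailed continuous feature and an
# ordinal feature with four levels derived from it.
rng = random.Random(0)
income = [math.exp(rng.gauss(0.0, 2.0)) for _ in range(200)]
grade = [sum(x > t for t in (0.5, 2.0, 8.0)) for x in income]

# Stage 1: each feature is encoded on its own, giving codes on a common scale.
codes = [gaussianize(income), gaussianize(grade)]

# Stage 2: dependencies are modelled on the uniform codes, not the raw data.
dependency = pearson(codes[0], codes[1])
```

The point of the sketch is the division of labour: the per-feature step absorbs the differing types and marginal distributions, so the second step only sees inputs on a comparable scale. In VAEM itself both steps are learned generative models rather than the fixed transforms used here.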

Author Information

Chao Ma (University of Cambridge)
Sebastian Tschiatschek (Microsoft Research)
Richard Turner (University of Cambridge)
José Miguel Hernández-Lobato (University of Cambridge)
Cheng Zhang (Microsoft Research, Cambridge, UK)

Cheng Zhang is a principal researcher at Microsoft Research Cambridge, UK. She leads the Data Efficient Decision Making (Project Azua) team at Microsoft. Before joining Microsoft, she was with the statistical machine learning group of Disney Research Pittsburgh, located at Carnegie Mellon University. She received her Ph.D. from the KTH Royal Institute of Technology. She is interested in advancing machine learning methods, including variational inference, deep generative models, and sequential decision-making under uncertainty, and in adapting machine learning to socially impactful applications such as education and healthcare. She co-organized the Symposium on Advances in Approximate Bayesian Inference from 2017 to 2019.
