Improving black-box optimization in VAE latent space using decoder uncertainty
Pascal Notin · José Miguel Hernández-Lobato · Yarin Gal

Wed Dec 08 12:30 AM -- 02:00 AM (PST)

Optimization in the latent space of variational autoencoders is a promising approach to generating high-dimensional discrete objects that maximize an expensive black-box property (e.g., drug-likeness in molecular generation, function approximation with arithmetic expressions). However, existing methods lack robustness: they may explore regions of the latent space for which no data was available during training and where the decoder can be unreliable, leading to unrealistic or invalid objects. We propose to leverage the epistemic uncertainty of the decoder to guide the optimization process. This is not trivial, as a naive estimate of uncertainty in the high-dimensional, structured settings we consider would suffer from high estimator variance. To solve this problem, we introduce an importance sampling-based estimator that provides more robust estimates of epistemic uncertainty. Our uncertainty-guided optimization approach requires no modification of the model architecture or the training process. It produces samples with a better trade-off between the black-box objective and the validity of the generated samples, sometimes improving both simultaneously. We illustrate these advantages across several experimental settings: digit generation, arithmetic expression approximation, and molecule generation for drug design.
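The idea of penalizing a candidate by decoder uncertainty can be illustrated with a minimal sketch. It is not the importance sampling-based estimator described above. It assumes, for illustration only, that epistemic uncertainty is measured as the mutual information between a single categorical output and the decoder parameters, estimated naively by averaging over a handful of sampled decoders. The function names (`mutual_information`, `penalized_score`) and the `weight` parameter are hypothetical.

```python
import numpy as np

def entropy(p, axis=-1):
    """Shannon entropy in nats; probabilities are clipped to avoid log(0)."""
    p = np.clip(p, 1e-12, 1.0)
    return -np.sum(p * np.log(p), axis=axis)

def mutual_information(probs):
    """Naive Monte Carlo estimate of epistemic uncertainty (assumed measure).

    probs has shape (M, K): one categorical distribution over K classes
    for each of M sampled decoders, all evaluated at the same latent point.
    Returns H[mean_m p_m] - mean_m H[p_m], which is zero when all sampled
    decoders agree and grows as they disagree.
    """
    probs = np.asarray(probs, dtype=float)
    return entropy(probs.mean(axis=0)) - entropy(probs).mean()

def penalized_score(objective, probs, weight=1.0):
    """Hypothetical acquisition value: objective minus weighted uncertainty."""
    return objective - weight * mutual_information(probs)

# Two sampled decoders that agree vs. two that disagree at a latent point.
agree = np.array([[0.9, 0.1], [0.9, 0.1]])
disagree = np.array([[0.9, 0.1], [0.1, 0.9]])
```

With equal objective values, `penalized_score` ranks the latent point where the sampled decoders agree above the one where they disagree, which is the kind of steering away from unreliable regions the abstract describes. For sequences of many tokens this naive estimate is the one the abstract warns can have high variance.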

Author Information

Pascal Notin (University of Oxford)

I am a Ph.D. candidate in the Oxford Applied and Theoretical Machine Learning Group, part of the Computer Science Department at the University of Oxford, under the supervision of Yarin Gal. My research is in machine learning and is motivated by questions in computational biology and chemistry. From a machine learning standpoint, I have focused primarily on generative modeling, Bayesian deep learning, large-scale training and active learning. This has led to the development of new methods for protein modeling, mutation effects prediction and de novo drug design. I co-created and am the lead organizer of the Machine Learning for Drug Discovery (MLDD) workshop at ICLR, co-organized the GeneDisco challenge, and co-organize the Workshop on Computational Biology (WCB) at ICML. I have several years of applied machine learning experience developing AI solutions, primarily within the healthcare and pharmaceutical industries. Prior to coming to Oxford, I was a Senior Manager at McKinsey & Company in the New York and Paris offices, where I led cross-disciplinary teams on fast-paced analytics engagements. I obtained an M.S. in Operations Research from Columbia University, and a B.S. and M.S. in Applied Mathematics from Ecole Polytechnique. My research is funded by the Engineering and Physical Sciences Research Council and a GSK scholarship.

José Miguel Hernández-Lobato (University of Cambridge)
Yarin Gal (University of Oxford)

Yarin leads the Oxford Applied and Theoretical Machine Learning (OATML) group. He is an Associate Professor of Machine Learning in the Department of Computer Science, University of Oxford. He is also the Tutorial Fellow in Computer Science at Christ Church, Oxford, and a Turing Fellow at the Alan Turing Institute, the UK’s national institute for data science and artificial intelligence. Prior to his move to Oxford he was a Research Fellow in Computer Science at St Catharine’s College at the University of Cambridge. He obtained his PhD from the Cambridge machine learning group, working with Prof Zoubin Ghahramani and funded by the Google Europe Doctoral Fellowship. He made substantial contributions to early work in modern Bayesian deep learning (quantifying uncertainty in deep learning) and developed ML/AI tools that can inform their users when the tools are “guessing at random”. These tools have been deployed widely in industry and academia, with uses in medical applications, robotics, computer vision, astronomy, and the sciences, as well as by NASA. Beyond his academic work, Yarin works with industry on deploying robust ML tools safely and responsibly. He co-chairs the NASA FDL AI committee, and is an advisor to Canadian medical imaging company Imagia, Japanese robotics company Preferred Networks, and numerous startups.
