Workshop: AI for Science: Mind the Gaps

Fragment-Based Sequential Translation for Molecular Optimization

Benson Chen · Xiang Fu · Regina Barzilay · Tommi Jaakkola


The search for novel molecular compounds with desired properties is an important problem in drug discovery. Many existing generative models for molecules operate at the atom level. We instead focus on generating molecular fragments, which are meaningful substructures of molecules. We construct a coherent latent representation for molecular fragments with a variational autoencoder (VAE) that can generate diverse and meaningful fragments. Equipped with this learned fragment vocabulary, we propose Fragment-based Sequential Translation (FaST), which iteratively translates model-discovered molecules into increasingly novel molecules with high property scores. Empirical evaluation shows that FaST achieves significant improvements over state-of-the-art methods on benchmark single-objective and multi-objective molecular optimization tasks.
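
The iterative translation idea can be illustrated with a toy sketch. This is not the FaST implementation: molecules are stood in for by tuples of fragment labels, the fragment vocabulary is a fixed list rather than a VAE latent space, and an exhaustive add/delete enumeration replaces the learned translation policy. All names (`neighbors`, `sequential_translation`, `vocab`, `score`, `threshold`, `max_len`) are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Set, Tuple

# Assumption: a "molecule" is just a tuple of fragment labels (toy stand-in).
Molecule = Tuple[str, ...]


def neighbors(mol: Molecule, vocab: List[str]) -> List[Molecule]:
    """Molecules reachable by appending one fragment or deleting one fragment."""
    added = [mol + (frag,) for frag in vocab]
    # Deletion is only allowed when at least one fragment would remain.
    deleted = [mol[:i] + mol[i + 1:] for i in range(len(mol)) if len(mol) > 1]
    return added + deleted


def sequential_translation(
    start: Molecule,
    vocab: List[str],
    score: Callable[[Molecule], float],
    threshold: float,
    steps: int,
    max_len: int = 4,
) -> Set[Molecule]:
    """Repeatedly translate discovered molecules into new ones.

    Each round expands the molecules found in the previous round, keeps
    candidates that are new and do not lower the score, and finally returns
    every discovered molecule whose score reaches the threshold.
    """
    discovered: Set[Molecule] = {start}
    frontier: List[Molecule] = [start]
    for _ in range(steps):
        next_frontier: List[Molecule] = []
        for mol in frontier:
            for cand in neighbors(mol, vocab):
                if len(cand) > max_len or cand in discovered:
                    continue  # too large, or already seen (not novel)
                if score(cand) >= score(mol):
                    discovered.add(cand)
                    next_frontier.append(cand)
        frontier = next_frontier
    return {m for m in discovered if score(m) >= threshold}


if __name__ == "__main__":
    # Toy property score: number of "A" fragments in the molecule.
    toy_score = lambda m: float(m.count("A"))
    hits = sequential_translation(("B",), ["A", "B"], toy_score, 2.0, steps=3, max_len=3)
    print(sorted(hits))
```

In this sketch the frontier of newly discovered molecules serves as the starting point for the next round, which mirrors the abstract's description of translating model-discovered molecules into further novel ones; a real system would replace the enumeration and the toy score with a learned policy and property predictors.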