The design of functional molecules relies on the representation used: a flexible and informative representation can improve downstream generation tasks. String representations such as SMILES and SELFIES serve as the basis for chemical language models, and the robustness of SELFIES makes it naturally suited for molecular optimization with genetic algorithms. However, while SMILES and SELFIES are atom-level representations, several recent approaches take advantage of the inductive bias of molecular fragments. In this work, we present Group SELFIES, which introduces group tokens that represent functional groups or entire substructures while maintaining robustness. Group tokens give control over which structures are preserved during optimization. Experiments indicate that Group SELFIES improves both distribution learning and the quality of molecules generated by simply sampling random Group SELFIES strings. The code is available at https://anonymous.4open.science/r/group-selfies-4D87/.
Author Information
Austin Cheng (University of Toronto)
Andy Cai (University of Toronto)
Santiago Miret (Intel AI Lab)
Gustavo Malkomes (Intel)
Mariano Phielipp (Intel AI Lab)
Dr. Mariano Phielipp works at the Intel AI Lab within the Intel Artificial Intelligence Products Group. His work includes research and development in deep learning, deep reinforcement learning, machine learning, and artificial intelligence. Since joining Intel, Dr. Phielipp has worked on computer vision, face recognition, face detection, object categorization, recommendation systems, online learning, automatic rule learning, natural language processing, knowledge representation, energy-based algorithms, and other machine learning and AI-related efforts. Dr. Phielipp has also contributed to various disclosure committees, won an Intel division award related to robotics, and has a large number of patents and pending patents. He has published at NeurIPS, ICML, ICLR, AAAI, IROS, IEEE, SPIE, IASTED, and EUROGRAPHICS-IEEE conferences and workshops.
Alan Aspuru-Guzik (University of Toronto)
More from the Same Authors
-
2020 : Safety Aware Reinforcement Learning (SARL) »
Santiago Miret -
2021 : The Reflective Explorer: Online Meta-Exploration from Offline Data in Realistic Robotic Tasks »
Rafael Rafailov · · Tianhe Yu · Avi Singh · Mariano Phielipp · Chelsea Finn -
2021 : Learning Discrete Neural Reaction Class to Improve Retrosynthesis Prediction »
Théophile Gaudin · Animesh Garg · Alan Aspuru-Guzik -
2022 : Offline Policy Comparison with Confidence: Benchmarks and Baselines »
Anurag Koul · Mariano Phielipp · Alan Fern -
2022 : Multi-Objective GFlowNets »
Moksh Jain · Sharath Chandra Raparthy · Alex Hernandez-Garcia · Jarrid Rector-Brooks · Yoshua Bengio · Santiago Miret · Emmanuel Bengio -
2022 : Assessing multi-objective optimization of molecules with genetic algorithms against relevant baselines »
Nathanael Kusanda · Gary Tom · Riley Hickman · AkshatKumar Nigam · Kjell Jorner · Alan Aspuru-Guzik -
2022 : On Multi-information source Constraint Active Search »
Gustavo Malkomes · Bolong Cheng · Santiago Miret -
2022 : PhAST: Physics-Aware, Scalable, and Task-specific GNNs for accelerated catalyst design »
Alexandre Duval · Victor Schmidt · Alex Hernandez-Garcia · Santiago Miret · Yoshua Bengio · David Rolnick -
2022 : Human-in-the-Loop Approaches For Task Guidance In Manufacturing Settings »
Ramesh Manuvinakurike · Santiago Miret · Richard Beckwith · Saurav Sahay · Giuseppe Raffa -
2022 : Hyperparameter Optimization of Graph Neural Networks for the OpenCatalyst Dataset: A Case Study »
Carmelo Gonzales · Eric Lee · Kin Long Kelvin Lee · Joyce Tang · Santiago Miret -
2022 : Conformer Search Using SE3-Transformers and Imitation Learning »
Luca Thiede · Santiago Miret · Krzysztof Sadowski · Haoping Xu · Mariano Phielipp · Alan Aspuru-Guzik -
2022 : Open MatSci ML Toolkit: A Flexible Framework for Machine Learning in Materials Science »
Santiago Miret · Kin Long Kelvin Lee · Carmelo Gonzales · Marcel Nassar · Krzysztof Sadowski -
2022 Workshop: AI for Accelerated Materials Design (AI4Mat) »
Santiago Miret · Marta Skreta · Zamyla Morgan-Chan · Benjamin Sanchez-Lengeling · Shyue Ping Ong · Alan Aspuru-Guzik -
2021 : Neuroevolution-Enhanced Multi-Objective Optimization for Mixed-Precision Quantization »
Santiago Miret · Vui Seng Chua · Mattias Marder · Mariano Phielipp · Nilesh Jain · Somdeb Majumdar -
2020 : Panel »
Alan Aspuru-Guzik · Jennifer Listgarten · Klaus-Robert Müller · Nadine Schneider -
2020 Workshop: Learning Meaningful Representations of Life (LMRL.org) »
Elizabeth Wood · Debora Marks · Ray Jones · Adji Bousso Dieng · Alan Aspuru-Guzik · Anshul Kundaje · Barbara Engelhardt · Chang Liu · Edward Boyden · Kresten Lindorff-Larsen · Mor Nitzan · Smita Krishnaswamy · Wouter Boomsma · Yixin Wang · David Van Valen · Orr Ashenberg -
2020 Poster: Language-Conditioned Imitation Learning for Robot Manipulation Tasks »
Simon Stepputtis · Joseph Campbell · Mariano Phielipp · Stefan Lee · Chitta Baral · Heni Ben Amor -
2020 Spotlight: Language-Conditioned Imitation Learning for Robot Manipulation Tasks »
Simon Stepputtis · Joseph Campbell · Mariano Phielipp · Stefan Lee · Chitta Baral · Heni Ben Amor -
2020 Poster: Instance-based Generalization in Reinforcement Learning »
Martin Bertran · Natalia Martinez · Mariano Phielipp · Guillermo Sapiro -
2019 : Alán Aspuru-Guzik »
Alan Aspuru-Guzik -
2019 : Molecules and Genomes »
David Haussler · Djork-Arné Clevert · Michael Keiser · Alan Aspuru-Guzik · David Duvenaud · David Jones · Jennifer Wei · Alexander D'Amour -
2019 Poster: Goal-conditioned Imitation Learning »
Yiming Ding · Carlos Florensa · Pieter Abbeel · Mariano Phielipp -
2017 : Machine Learning for Molecular Materials Design »
Alan Aspuru-Guzik -
2017 Workshop: Machine Learning for Molecules and Materials »
Kristof Schütt · Klaus-Robert Müller · Anatole von Lilienfeld · José Miguel Hernández-Lobato · Alan Aspuru-Guzik · Bharath Ramsundar · Matt Kusner · Brooks Paige · Stefan Chmiela · Alexandre Tkatchenko · Koji Tsuda