Advances in DNA synthesis and sequencing technologies have enabled a novel paradigm of protein design in which machine learning models trained on experimental data guide exploration of a protein sequence landscape. ML-guided directed evolution (MLDE) has the potential not only to build upon the successes of directed evolution, but also to unlock new strategies that make more efficient use of experimental data and trade off between multiple optimization objectives. Building an MLDE pipeline involves many design choices, ranging from data collection strategies to modeling choices, each of which has a large impact on the downstream effectiveness of designed sequences. The cost of collecting experimental data makes benchmarking these pipelines on real data prohibitively difficult, necessitating the development of synthetic landscapes where MLDE strategies can be tested. In this work, we develop a framework called SLIP (“Synthetic Landscape Inference for Proteins”) for constructing synthetic landscapes with tunable difficulty based on Potts models. SLIP is open-source.
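As a rough illustration of what a Potts-model landscape looks like, the sketch below scores an integer-encoded sequence with the standard Potts energy, E(x) = -Σ_i h_i(x_i) - Σ_{i<j} J_ij(x_i, x_j). It is not SLIP's implementation: the function names `random_potts` and `potts_energy`, the Gaussian random parameters, and the `coupling_scale` knob are all assumptions made for this example. Scaling the pairwise couplings is only one hypothetical way to vary how non-additive, and hence how difficult, a landscape is.

```python
import random


def random_potts(length, alphabet_size, coupling_scale, seed=0):
    """Draw random Potts parameters (hypothetical helper, not part of SLIP).

    Returns per-site fields h[i][a] and pairwise couplings J[(i, j)][a][b]
    for every pair of positions i < j.
    """
    rng = random.Random(seed)
    fields = [
        [rng.gauss(0.0, 1.0) for _ in range(alphabet_size)]
        for _ in range(length)
    ]
    couplings = {}
    for i in range(length):
        for j in range(i + 1, length):
            couplings[(i, j)] = [
                [coupling_scale * rng.gauss(0.0, 1.0) for _ in range(alphabet_size)]
                for _ in range(alphabet_size)
            ]
    return fields, couplings


def potts_energy(seq, fields, couplings):
    """Standard Potts energy of an integer-encoded sequence.

    E(x) = -sum_i h_i(x_i) - sum_{i<j} J_ij(x_i, x_j)
    """
    # Single-site (additive) contribution.
    energy = -sum(fields[i][a] for i, a in enumerate(seq))
    # Pairwise (epistatic) contribution.
    for (i, j), pair_table in couplings.items():
        energy -= pair_table[seq[i]][seq[j]]
    return energy


if __name__ == "__main__":
    # Toy example: 4 positions, 3-letter alphabet (assumed sizes).
    fields, couplings = random_potts(4, 3, coupling_scale=0.5, seed=1)
    print(potts_energy([0, 2, 1, 1], fields, couplings))
```

With `coupling_scale=0.0` the energy reduces to a purely additive function of the sites; larger values make pairwise interactions contribute more. A synthetic fitness could then be taken as the negative energy, again as an assumption of this sketch.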
Author Information
Neil Thomas (UC Berkeley)
Atish Agarwala (Google Research)
David Belanger (Google)
Yun Song (UC Berkeley)
Lucy Colwell (Cambridge University)
More from the Same Authors
-
2021 : End-to-end learning of multiple sequence alignments with differentiable Smith-Waterman »
Samantha Petti · Nicholas Bhattacharya · Roshan Rao · Justas Dauparas · Neil Thomas · Juannan Zhou · Alexander Rush · Peter Koo · Sergey Ovchinnikov -
2022 : A Second-order Regression Model Shows Edge of Stability Behavior »
Fabian Pedregosa · Atish Agarwala · Jeffrey Pennington -
2022 : Predicting conformational landscapes of known and putative fold-switching proteins using AlphaFold2 »
Hannah Wayment-Steele · Sergey Ovchinnikov · Lucy Colwell · Dorothee Kern -
2022 : Poster Session 1 »
Andrew Lowy · Thomas Bonnier · Yiling Xie · Guy Kornowski · Simon Schug · Seungyub Han · Nicolas Loizou · xinwei zhang · Laurent Condat · Tabea E. Röber · Si Yi Meng · Marco Mondelli · Runlong Zhou · Eshaan Nichani · Adrian Goldwaser · Rudrajit Das · Kayhan Behdin · Atish Agarwala · Mukul Gagrani · Gary Cheng · Tian Li · Haoran Sun · Hossein Taheri · Allen Liu · Siqi Zhang · Dmitrii Avdiukhin · Bradley Brown · Miaolan Xie · Junhyung Lyle Kim · Sharan Vaswani · Xinmeng Huang · Ganesh Ramachandra Kini · Angela Yuan · Weiqiang Zheng · Jiajin Li -
2020 : Is Transfer Learning Necessary for Protein Landscape Prediction? »
David Belanger · David Dohan -
2020 Poster: Evaluating Attribution for Graph Neural Networks »
Benjamin Sanchez-Lengeling · Jennifer Wei · Brian Lee · Emily Reif · Peter Wang · Wesley Qian · Kevin McCloskey · Lucy Colwell · Alexander Wiltschko -
2019 Poster: Evaluating Protein Transfer Learning with TAPE »
Roshan Rao · Nicholas Bhattacharya · Neil Thomas · Yan Duan · Peter Chen · John Canny · Pieter Abbeel · Yun Song -
2019 Spotlight: Evaluating Protein Transfer Learning with TAPE »
Roshan Rao · Nicholas Bhattacharya · Neil Thomas · Yan Duan · Peter Chen · John Canny · Pieter Abbeel · Yun Song -
2018 Poster: A Likelihood-Free Inference Framework for Population Genetic Data using Exchangeable Neural Networks »
Jeffrey Chan · Valerio Perrone · Jeffrey Spence · Paul Jenkins · Sara Mathieson · Yun Song -
2018 Spotlight: A Likelihood-Free Inference Framework for Population Genetic Data using Exchangeable Neural Networks »
Jeffrey Chan · Valerio Perrone · Jeffrey Spence · Paul Jenkins · Sara Mathieson · Yun Song