Pre-trained protein language models have demonstrated significant applicability across different protein engineering tasks. A common way to use the latent representations of these pre-trained transformer models is to apply mean pooling across residue positions, reducing the feature dimensionality for downstream tasks such as predicting biophysical properties or other functional behaviours. In this paper we provide a two-fold contribution to machine learning (ML) driven drug design. Firstly, we demonstrate the power of sparsity-promoting penalization of pre-trained transformer models to obtain more robust and accurate melting temperature (Tm) predictions for single-chain variable fragments, achieving a mean absolute error of 0.23 °C. Secondly, we demonstrate the power of framing our prediction problem probabilistically. Specifically, we advocate for adopting probabilistic frameworks, especially in the context of ML-driven drug design.
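The two ingredients named in the abstract, mean pooling of per-residue embeddings and a sparsity-promoting penalty, can be illustrated with a minimal sketch. This is not the authors' implementation: the embedding dimension, the penalty weight `lam`, and the helper names `mean_pool` and `l1_penalized_loss` are illustrative assumptions; the paper's actual model and regularization details are not specified here.

```python
import numpy as np

def mean_pool(residue_embeddings):
    # residue_embeddings: (sequence_length, embedding_dim) array of
    # per-residue latent representations from a pre-trained model.
    # Mean pooling collapses the sequence axis into one fixed-size vector,
    # so sequences of different lengths map to the same feature space.
    return residue_embeddings.mean(axis=0)

def l1_penalized_loss(y_true, y_pred, weights, lam=0.01):
    # Squared-error regression loss plus an L1 penalty on the weights.
    # The L1 term is a standard sparsity-promoting penalty: it drives
    # uninformative coefficients exactly to zero.
    mse = np.mean((y_true - y_pred) ** 2)
    return mse + lam * np.sum(np.abs(weights))

# Toy usage: a hypothetical 120-residue sequence with 1280-dim embeddings.
emb = np.random.default_rng(0).normal(size=(120, 1280))
pooled = mean_pool(emb)          # shape (1280,)
```

The pooled vector would then feed a (penalized) regressor predicting Tm; in a probabilistic framing, the regressor would output a predictive distribution over Tm rather than a point estimate.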
Author Information
Yevgen Zainchkovskyy (Technical University of Denmark)
Jesper Ferkinghoff-Borg (Novo Nordisk A/S)
Anja Bennett
Thomas Egebjerg (Novo Nordisk A/S)
Nikolai Lorenzen
Per Greisen (Novo Nordisk)
Søren Hauberg (Technical University of Denmark)
Carsten Stahlhut (Novo Nordisk A/S)
More from the Same Authors
- 2021: A kernel for continuously relaxed, discrete Bayesian optimization of protein sequences
  Yevgen Zainchkovskyy · Simon Bartels · Søren Hauberg · Jes Frellsen · Wouter Boomsma
- 2021 Meetup: Copenhagen, Denmark
  Søren Hauberg
- 2022: Identifying endogenous peptide receptors by combining structure and transmembrane topology prediction
  Felix Teufel · Jan Christian Refsgaard · Christian Toft Madsen · Carsten Stahlhut · Mads Grønborg · Dennis Madsen · Ole Winther
- 2022: Optimal Latent Transport
  Hrittik Roy · Søren Hauberg
- 2022: Identifying latent distances with Finslerian geometry
  Alison Pouplin · David Eklund · Carl Henrik Ek · Søren Hauberg
- 2022 Poster: Revisiting Active Sets for Gaussian Process Decoders
  Pablo Moreno-Muñoz · Cilie Feldager · Søren Hauberg
- 2022 Poster: Laplacian Autoencoders for Learning Stochastic Representations
  Marco Miani · Frederik Warburg · Pablo Moreno-Muñoz · Nicki Skafte · Søren Hauberg
- 2021 Poster: Bounds all around: training energy-based models with bidirectional bounds
  Cong Geng · Jia Wang · Zhiyong Gao · Jes Frellsen · Søren Hauberg
- 2020: Isometric Gaussian Process Latent Variable Model
  Martin Jørgensen · Søren Hauberg
- 2020 Invited Talk 3: Reparametrization invariance in representation learning
  Søren Hauberg
- 2019 Poster: Reliable training and estimation of variance networks
  Nicki Skafte · Martin Jørgensen · Søren Hauberg
- 2019 Poster: Explicit Disentanglement of Appearance and Perspective in Generative Models
  Nicki Skafte · Søren Hauberg
- 2016 Poster: A Locally Adaptive Normal Distribution
  Georgios Arvanitidis · Lars K Hansen · Søren Hauberg
- 2011 Demonstration: A smartphone 3D functional brain scanner
  Carsten Stahlhut · Arkadiusz Stopczynski · Jakob Eg Larsen · Michael K Petersen · Lars K Hansen