A kernel for continuously relaxed, discrete Bayesian optimization of protein sequences
Yevgen Zainchkovskyy · Simon Bartels · Søren Hauberg · Jes Frellsen · Wouter Boomsma

Protein sequences follow a discrete alphabet, rendering gradient-based techniques a poor choice for optimization-driven protein design. Contemporary approaches instead perform optimization in a continuous latent representation, but unfortunately the representation metric is generally a poor measure of similarity between the represented proteins. This makes (global) Bayesian optimization over such latent representations inefficient, as commonly applied covariance functions are strongly dependent on the representation metric. Here we argue in favor of using the Jensen-Shannon divergence between the represented protein sequences to define a covariance function over the latent representation. Our exploratory experiments indicate that this kernel is worth further investigation.
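
To make the idea concrete, the following is a minimal sketch of a covariance function built from the Jensen-Shannon divergence between decoded sequence distributions. The abstract does not give the exact functional form of the kernel, so everything here is an assumption for illustration: each latent point is assumed to decode to a matrix of per-position categorical distributions over the amino-acid alphabet, the per-position divergences are summed, and the result is passed through an exponential with a hypothetical `lengthscale` parameter. The names `js_divergence` and `js_kernel` are illustrative and not taken from the authors' code.

```python
import numpy as np


def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence (in nats) between categorical
    distributions stored along the last axis of p and q."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    m = 0.5 * (p + q)  # mixture distribution
    # KL(p || m) and KL(q || m); eps guards against log(0).
    kl_pm = np.sum(p * (np.log(p + eps) - np.log(m + eps)), axis=-1)
    kl_qm = np.sum(q * (np.log(q + eps) - np.log(m + eps)), axis=-1)
    return 0.5 * (kl_pm + kl_qm)


def js_kernel(probs_a, probs_b, lengthscale=1.0):
    """Assumed kernel form: exp(-sum_i JSD_i / lengthscale).

    probs_a, probs_b: arrays of shape (sequence_length, alphabet_size),
    e.g. the decoder output for two latent points (hypothetical setup).
    """
    total = np.sum(js_divergence(probs_a, probs_b))
    return np.exp(-total / lengthscale)


# Toy example with a two-letter alphabet and two positions.
a = np.array([[1.0, 0.0], [0.5, 0.5]])
b = np.array([[0.0, 1.0], [0.5, 0.5]])
print(js_kernel(a, a))  # identical distributions give maximal covariance
print(js_kernel(a, b))  # distributions differing at one position give less
```

In this sketch, two latent points have high covariance when they decode to similar sequence distributions, regardless of how far apart they lie in the latent space, which is the property the abstract argues for.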

Author Information

Yevgen Zainchkovskyy (Technical University of Denmark)
Simon Bartels (Copenhagen University)
Søren Hauberg (Technical University of Denmark)
Jes Frellsen (Technical University of Denmark)
Wouter Boomsma (University of Copenhagen)
