Data-driven design is making headway into a number of application areas, including protein, small-molecule, and materials engineering. The design goal is to construct an object with desired properties, such as a protein that binds to a therapeutic target, or a superconducting material with a higher critical temperature than previously observed. To that end, costly experimental measurements are being replaced with calls to high-capacity regression models trained on labeled data, which can be leveraged in an in silico search for design candidates. However, the design goal necessitates moving into regions of the design space beyond where such models were trained. Therefore, one can ask: should the regression model be altered as the design algorithm explores the design space, in the absence of new data? Herein, we answer this question in the affirmative. In particular, we (i) formalize the data-driven design problem as a non-zero-sum game, (ii) develop a principled strategy for retraining the regression model as the design algorithm proceeds, which we refer to as autofocusing, and (iii) demonstrate the promise of autofocusing empirically.
Author Information
Clara Fannjiang (UC Berkeley)
Jennifer Listgarten (UC Berkeley)
More from the Same Authors
- 2022: Designing active and thermostable enzymes with sequence-only predictive models
  Clara Fannjiang · Micah Olivas · Eric Greene · Craig Markin · Bram Wallace · Ben Krause · Margaux Pinney · James Fraser · Polly Fordyce · Ali Madani · Nikhil Naik
- 2020: Invited Talk: Jennifer Listgarten
  Jennifer Listgarten
- 2020: Panel
  Alan Aspuru-Guzik · Jennifer Listgarten · Klaus-Robert Müller · Nadine Schneider
- 2018 Poster: Gaussian Process Prior Variational Autoencoders
  Francesco Paolo Casale · Adrian Dalca · Luca Saglietti · Jennifer Listgarten · Nicolo Fusi