Active search is a learning paradigm for actively identifying as many members of a given class as possible. A critical target scenario is high-throughput screening for scientific discovery, such as drug or materials discovery. In these settings, specialized instruments can often evaluate multiple points simultaneously; however, all existing work on active search focuses on sequential acquisition. We bridge this gap, addressing batch active search from both theoretical and practical perspectives. We first derive the Bayesian optimal policy for this problem, then prove a lower bound on the performance gap between sequential and batch optimal policies: the "cost of parallelization." We also propose novel, efficient batch policies inspired by state-of-the-art sequential policies, and develop an aggressive pruning technique that can dramatically speed up computation. We conduct thorough experiments on data from three application domains (a citation network, materials science, and drug discovery), testing all proposed policies (14 total) with a wide range of batch sizes. Our results demonstrate that the empirical performance gap matches our theoretical bound, that nonmyopic policies usually significantly outperform myopic alternatives, and that diversity is an important consideration for batch policy design.
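To make the batch setting concrete, the following is a minimal sketch of a myopic (greedy) batch baseline: in each round it queries the `batch_size` unlabeled points with the highest estimated probability of being positive, then updates the estimates with the observed labels. It is not one of the policies proposed in the paper. The toy k-nearest-neighbor probability model, the 1-D features, and all names (`posterior`, `greedy_batch_search`, `oracle`, `prior`) are assumptions made only for illustration.

```python
import numpy as np


def posterior(X, observed, k=5, prior=0.1):
    """Toy k-NN estimate of P(y = 1) for every point (illustrative assumption).

    X        : 1-D array of feature values.
    observed : dict mapping point index -> observed binary label.
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    dist = np.abs(X[:, None] - X[None, :])      # pairwise distances
    np.fill_diagonal(dist, np.inf)              # a point is not its own neighbor
    neighbors = np.argsort(dist, axis=1, kind="stable")[:, :k]
    probs = np.empty(n)
    for i in range(n):
        labeled = [j for j in neighbors[i].tolist() if j in observed]
        positives = sum(observed[j] for j in labeled)
        # Smoothed fraction of positive labeled neighbors.
        probs[i] = (prior + positives) / (1.0 + len(labeled))
    return probs


def greedy_batch_search(X, oracle, batch_size, num_batches, k=5, prior=0.1):
    """Query `num_batches` batches, each holding the top-probability unlabeled points."""
    observed = {}
    for _ in range(num_batches):
        probs = posterior(X, observed, k=k, prior=prior)
        for i in observed:                      # never re-query a labeled point
            probs[i] = -np.inf
        batch = np.argsort(-probs, kind="stable")[:batch_size]
        # All labels in a batch arrive together, after the batch is fixed.
        for i in batch.tolist():
            observed[i] = int(oracle(i))
    return observed


if __name__ == "__main__":
    X = np.arange(20.0)
    truth = np.zeros(20, dtype=int)
    truth[8:13] = 1                             # hypothetical positive cluster
    found = greedy_batch_search(X, lambda i: truth[i], batch_size=4, num_batches=3)
    print(sum(found.values()), "positives found in", len(found), "queries")
```

Because the whole batch is chosen before any of its labels are seen, the points within a batch cannot benefit from one another's outcomes. This is the restriction that separates batch from sequential acquisition.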
Author Information
Shali Jiang (Washington University in St. Louis)
Gustavo Malkomes (Washington University in St. Louis)
Matthew Abbott (Washington University in St. Louis)
Benjamin Moseley (Carnegie Mellon University)
Roman Garnett (Washington University in St. Louis)
Related Events (a corresponding poster, oral, or spotlight)
2018 Spotlight: Efficient nonmyopic batch active search
Thu. Dec 6th 08:35 -- 08:40 PM Room 220 CD
More from the Same Authors
2022 : On Multi-information source Constraint Active Search
Gustavo Malkomes · Bolong Cheng · Santiago Miret
2022 : Group SELFIES: A Robust Fragment-Based Molecular String Representation
Austin Cheng · Andy Cai · Santiago Miret · Gustavo Malkomes · Mariano Phielipp · Alan Aspuru-Guzik
2023 Poster: Online List Labeling with Predictions
Samuel McCauley · Benjamin Moseley · Aidin Niaparast · Shikha Singh
2023 Poster: The Behavior and Convergence of Local Bayesian Optimization
Kaiwen Wu · Kyurae Kim · Roman Garnett · Jacob Gardner
2022 : Panel
Roman Garnett · José Miguel Hernández-Lobato · Eytan Bakshy · Syrine Belakaria · Stefanie Jegelka
2022 Poster: Local Bayesian optimization via maximizing probability of descent
Quan Nguyen · Kaiwen Wu · Jacob Gardner · Roman Garnett
2022 Poster: Algorithms with Prediction Portfolios
Michael Dinitz · Sungjin Im · Thomas Lavastida · Benjamin Moseley · Sergei Vassilvitskii
2021 : AI workloads inside databases
Guy Van den Broeck · Alexander Ratner · Benjamin Moseley · Konstantinos Karanasos · Parisa Kordjamshidi · Molham Aref · Arun Kumar
2021 Poster: Robust Online Correlation Clustering
Silvio Lattanzi · Benjamin Moseley · Sergei Vassilvitskii · Yuyan Wang · Rudy Zhou
2021 Oral: Faster Matchings via Learned Duals
Michael Dinitz · Sungjin Im · Thomas Lavastida · Benjamin Moseley · Sergei Vassilvitskii
2021 Poster: Faster Matchings via Learned Duals
Michael Dinitz · Sungjin Im · Thomas Lavastida · Benjamin Moseley · Sergei Vassilvitskii
2020 Poster: Fair Hierarchical Clustering
Sara Ahmadian · Alessandro Epasto · Marina Knittel · Ravi Kumar · Mohammad Mahdian · Benjamin Moseley · Philip Pham · Sergei Vassilvitskii · Yuyan Wang
2020 Poster: Efficient Nonmyopic Bayesian Optimization via One-Shot Multi-Step Trees
Shali Jiang · Daniel Jiang · Maximilian Balandat · Brian Karrer · Jacob Gardner · Roman Garnett
2019 Poster: Backprop with Approximate Activations for Memory-efficient Network Training
Ayan Chakrabarti · Benjamin Moseley
2019 Poster: Cost Effective Active Search
Shali Jiang · Roman Garnett · Benjamin Moseley
2019 Poster: D-VAE: A Variational Autoencoder for Directed Acyclic Graphs
Muhan Zhang · Shali Jiang · Zhicheng Cui · Roman Garnett · Yixin Chen
2018 Poster: Automating Bayesian optimization with Bayesian optimization
Gustavo Malkomes · Roman Garnett
2017 Poster: Approximation Bounds for Hierarchical Clustering: Average Linkage, Bisecting K-means, and Local Search
Benjamin Moseley · Joshua Wang
2017 Oral: Approximation Bounds for Hierarchical Clustering: Average Linkage, Bisecting K-means, and Local Search
Benjamin Moseley · Joshua Wang
2016 Poster: Bayesian optimization for automated model selection
Gustavo Malkomes · Charles Schaff · Roman Garnett
2015 : *Roman Garnett* Bayesian Quadrature: Lessons Learned and Looking Forwards
Roman Garnett
2015 Poster: Fast Distributed k-Center Clustering with Outliers on Massive Data
Gustavo Malkomes · Matt J Kusner · Wenlin Chen · Kilian Q Weinberger · Benjamin Moseley
2015 Poster: Bayesian Active Model Selection with an Application to Automated Audiometry
Jacob Gardner · Gustavo Malkomes · Roman Garnett · Kilian Weinberger · Dennis Barbour · John Cunningham
2014 Poster: Sampling for Inference in Probabilistic Models with Fast Bayesian Quadrature
Tom Gunter · Michael A Osborne · Roman Garnett · Philipp Hennig · Stephen J Roberts
2013 Poster: Σ-Optimality for Active Learning on Gaussian Random Fields
Yifei Ma · Roman Garnett · Jeff Schneider