Timezone: »
Determinantal point processes (DPPs) are a useful probabilistic model for selecting a small diverse subset out of a large collection of items, with applications in summarization, recommendation, stochastic optimization, experimental design and more. Given a kernel function and a subset size k, our goal is to sample k out of n items with probability proportional to the determinant of the kernel matrix induced by the subset (a.k.a. k-DPP). Existing k-DPP sampling algorithms require an expensive preprocessing step which involves multiple passes over all n items, making it infeasible for large datasets. A naïve heuristic addressing this problem is to uniformly subsample a fraction of the data and perform k-DPP sampling only on those items, however this method offers no guarantee that the produced sample will even approximately resemble the target distribution over the original dataset. In this paper, we develop alpha-DPP, an algorithm which adaptively builds a sufficiently large uniform sample of data that is then used to efficiently generate a smaller set of k items, while ensuring that this set is drawn exactly from the target distribution defined on all n items. We show empirically that our algorithm produces a k-DPP sample after observing only a small fraction of all elements, leading to several orders of magnitude faster performance compared to the state-of-the-art. Our implementation of alpha-DPP is provided at https://github.com/guilgautier/DPPy/.
Author Information
Daniele Calandriello (DeepMind)
Michal Derezinski (UC Berkeley)
Michal Valko (DeepMind)
Michal is a machine learning scientist in DeepMind Paris, tenured researcher at Inria, and the lecturer of the master course Graphs in Machine Learning at l'ENS Paris-Saclay. Michal is primarily interested in designing algorithms that would require as little human supervision as possible. This means 1) reducing the “intelligence” that humans need to input into the system and 2) minimizing the data that humans need to spend inspecting, classifying, or “tuning” the algorithms. That is why he is working on methods and settings that are able to deal with minimal feedback, such as deep reinforcement learning, bandit algorithms, or self-supervised learning. Michal is actively working on represenation learning and building worlds models. He is also working on deep (reinforcement) learning algorithm that have some theoretical underpinning. He has also worked on sequential algorithms with structured decisions where exploiting the structure leads to provably faster learning. He received his Ph.D. in 2011 from the University of Pittsburgh under the supervision of Miloš Hauskrecht and after was a postdoc of Rémi Munos before taking a permanent position at Inria in 2012.
Related Events (a corresponding poster, oral, or spotlight)
-
2020 Poster: Sampling from a k-DPP without looking at all items »
Thu. Dec 10th 05:00 -- 07:00 PM Room Poster Session 5 #1660
More from the Same Authors
-
2021 Spotlight: Newton-LESS: Sparsification without Trade-offs for the Sketched Newton Update »
Michal Derezinski · Jonathan Lacotte · Mert Pilanci · Michael Mahoney -
2021 Spotlight: Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret »
Jean Tarbouriech · Runlong Zhou · Simon Du · Matteo Pirotta · Michal Valko · Alessandro Lazaric -
2021 Spotlight: A Provably Efficient Sample Collection Strategy for Reinforcement Learning »
Jean Tarbouriech · Matteo Pirotta · Michal Valko · Alessandro Lazaric -
2021 : One Pass ImageNet »
Clara Huiyi Hu · Ang Li · Daniele Calandriello · Dilan Gorur -
2022 : Curiosity in Hindsight »
Daniel Jarrett · Corentin Tallec · Florent Altché · Thomas Mesnard · Remi Munos · Michal Valko -
2023 Poster: Model-free Posterior Sampling via Learning Rate Randomization »
Daniil Tiapkin · Denis Belomestny · Daniele Calandriello · Eric Moulines · Remi Munos · Alexey Naumov · Pierre Perrault · Michal Valko · Pierre Ménard -
2022 Spotlight: Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees »
Daniil Tiapkin · Denis Belomestny · Daniele Calandriello · Eric Moulines · Remi Munos · Alexey Naumov · Mark Rowland · Michal Valko · Pierre Ménard -
2022 Poster: BYOL-Explore: Exploration by Bootstrapped Prediction »
Zhaohan Guo · Shantanu Thakoor · Miruna Pislar · Bernardo Avila Pires · Florent Altché · Corentin Tallec · Alaa Saade · Daniele Calandriello · Jean-Bastien Grill · Yunhao Tang · Michal Valko · Remi Munos · Mohammad Gheshlaghi Azar · Bilal Piot -
2022 Poster: Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees »
Daniil Tiapkin · Denis Belomestny · Daniele Calandriello · Eric Moulines · Remi Munos · Alexey Naumov · Mark Rowland · Michal Valko · Pierre Ménard -
2021 Oral: Drop, Swap, and Generate: A Self-Supervised Approach for Generating Neural Activity »
Ran Liu · Mehdi Azabou · Max Dabagia · Chi-Heng Lin · Mohammad Gheshlaghi Azar · Keith Hengen · Michal Valko · Eva Dyer -
2021 Poster: ParK: Sound and Efficient Kernel Ridge Regression by Feature Space Partitions »
Luigi Carratino · Stefano Vigogna · Daniele Calandriello · Lorenzo Rosasco -
2021 Poster: Drop, Swap, and Generate: A Self-Supervised Approach for Generating Neural Activity »
Ran Liu · Mehdi Azabou · Max Dabagia · Chi-Heng Lin · Mohammad Gheshlaghi Azar · Keith Hengen · Michal Valko · Eva Dyer -
2021 Poster: Newton-LESS: Sparsification without Trade-offs for the Sketched Newton Update »
Michal Derezinski · Jonathan Lacotte · Mert Pilanci · Michael Mahoney -
2021 Poster: Learning in two-player zero-sum partially observable Markov games with perfect recall »
Tadashi Kozuno · Pierre Ménard · Remi Munos · Michal Valko -
2021 Poster: Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret »
Jean Tarbouriech · Runlong Zhou · Simon Du · Matteo Pirotta · Michal Valko · Alessandro Lazaric -
2021 Poster: A Provably Efficient Sample Collection Strategy for Reinforcement Learning »
Jean Tarbouriech · Matteo Pirotta · Michal Valko · Alessandro Lazaric -
2021 Poster: Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation »
Yunhao Tang · Tadashi Kozuno · Mark Rowland · Remi Munos · Michal Valko -
2020 Poster: Debiasing Distributed Second Order Optimization with Surrogate Sketching and Scaled Regularization »
Michal Derezinski · Burak Bartan · Mert Pilanci · Michael Mahoney -
2020 Poster: Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning »
Jean-Bastien Grill · Florian Strub · Florent Altché · Corentin Tallec · Pierre Richemond · Elena Buchatskaya · Carl Doersch · Bernardo Avila Pires · Daniel (Zhaohan) Guo · Mohammad Gheshlaghi Azar · Bilal Piot · koray kavukcuoglu · Remi Munos · Michal Valko -
2020 Oral: Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning »
Jean-Bastien Grill · Florian Strub · Florent Altché · Corentin Tallec · Pierre Richemond · Elena Buchatskaya · Carl Doersch · Bernardo Avila Pires · Daniel (Zhaohan) Guo · Mohammad Gheshlaghi Azar · Bilal Piot · koray kavukcuoglu · Remi Munos · Michal Valko -
2020 Poster: Exact expressions for double descent and implicit regularization via surrogate random design »
Michal Derezinski · Feynman Liang · Michael Mahoney -
2020 Poster: Improved guarantees and a multiple-descent curve for Column Subset Selection and the Nystrom method »
Michal Derezinski · Rajiv Khanna · Michael Mahoney -
2020 Poster: Precise expressions for random projections: Low-rank approximation and randomized Newton »
Michal Derezinski · Feynman Liang · Zhenyu Liao · Michael Mahoney -
2020 Oral: Improved guarantees and a multiple-descent curve for Column Subset Selection and the Nystrom method »
Michal Derezinski · Rajiv Khanna · Michael Mahoney -
2020 Poster: Statistical Efficiency of Thompson Sampling for Combinatorial Semi-Bandits »
Pierre Perrault · Etienne Boursier · Michal Valko · Vianney Perchet -
2020 Poster: Improved Sample Complexity for Incremental Autonomous Exploration in MDPs »
Jean Tarbouriech · Matteo Pirotta · Michal Valko · Alessandro Lazaric -
2020 Poster: Planning in Markov Decision Processes with Gap-Dependent Sample Complexity »
Anders Jonsson · Emilie Kaufmann · Pierre Menard · Omar Darwiche Domingues · Edouard Leurent · Michal Valko -
2020 Oral: Improved Sample Complexity for Incremental Autonomous Exploration in MDPs »
Jean Tarbouriech · Matteo Pirotta · Michal Valko · Alessandro Lazaric -
2019 Poster: Distributed estimation of the inverse Hessian by determinantal averaging »
Michal Derezinski · Michael Mahoney -
2019 Poster: Exact sampling of determinantal point processes with sublinear time preprocessing »
Michal Derezinski · Daniele Calandriello · Michal Valko -
2019 Poster: Planning in entropy-regularized Markov decision processes and games »
Jean-Bastien Grill · Omar Darwiche Domingues · Pierre Menard · Remi Munos · Michal Valko -
2019 Poster: On two ways to use determinantal point processes for Monte Carlo integration »
Guillaume Gautier · Rémi Bardenet · Michal Valko -
2019 Poster: Multiagent Evaluation under Incomplete Information »
Mark Rowland · Shayegan Omidshafiei · Karl Tuyls · Julien Perolat · Michal Valko · Georgios Piliouras · Remi Munos -
2019 Spotlight: Multiagent Evaluation under Incomplete Information »
Mark Rowland · Shayegan Omidshafiei · Karl Tuyls · Julien Perolat · Michal Valko · Georgios Piliouras · Remi Munos -
2018 Poster: Optimistic optimization of a Brownian »
Jean-Bastien Grill · Michal Valko · Remi Munos -
2018 Poster: On Fast Leverage Score Sampling and Optimal Learning »
Alessandro Rudi · Daniele Calandriello · Luigi Carratino · Lorenzo Rosasco -
2018 Poster: Statistical and Computational Trade-Offs in Kernel K-Means »
Daniele Calandriello · Lorenzo Rosasco -
2018 Spotlight: Statistical and Computational Trade-Offs in Kernel K-Means »
Daniele Calandriello · Lorenzo Rosasco -
2018 Poster: Leveraged volume sampling for linear regression »
Michal Derezinski · Manfred K. Warmuth · Daniel Hsu -
2018 Spotlight: Leveraged volume sampling for linear regression »
Michal Derezinski · Manfred K. Warmuth · Daniel Hsu -
2017 Poster: Online Influence Maximization under Independent Cascade Model with Semi-Bandit Feedback »
Zheng Wen · Branislav Kveton · Michal Valko · Sharan Vaswani -
2017 Poster: Efficient Second-Order Online Kernel Learning with Adaptive Embedding »
Daniele Calandriello · Alessandro Lazaric · Michal Valko -
2017 Poster: Unbiased estimates for linear regression via volume sampling »
Michal Derezinski · Manfred K. Warmuth -
2017 Spotlight: Unbiased estimates for linear regression via volume sampling »
Michal Derezinski · Manfred K. Warmuth -
2016 Poster: Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning »
Jean-Bastien Grill · Michal Valko · Remi Munos -
2016 Oral: Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning »
Jean-Bastien Grill · Michal Valko · Remi Munos -
2015 Poster: Black-box optimization of noisy functions with unknown smoothness »
Jean-Bastien Grill · Michal Valko · Remi Munos · Remi Munos -
2014 Workshop: Second Workshop on Transfer and Multi-Task Learning: Theory meets Practice »
Urun Dogan · Tatiana Tommasi · Yoshua Bengio · Francesco Orabona · Marius Kloft · Andres Munoz · Gunnar Rätsch · Hal Daumé III · Mehryar Mohri · Xuezhi Wang · Daniel Hernández-lobato · Song Liu · Thomas Unterthiner · Pascal Germain · Vinay P Namboodiri · Michael Goetz · Christopher Berlind · Sigurd Spieckermann · Marta Soare · Yujia Li · Vitaly Kuznetsov · Wenzhao Lian · Daniele Calandriello · Emilie Morvant -
2014 Poster: The limits of squared Euclidean distance regularization »
Michal Derezinski · Manfred K. Warmuth -
2014 Poster: Efficient learning by implicit exploration in bandit problems with side observations »
Tomáš Kocák · Gergely Neu · Michal Valko · Remi Munos -
2014 Spotlight: The limits of squared Euclidean distance regularization »
Michal Derezinski · Manfred K. Warmuth -
2014 Poster: Extreme bandits »
Alexandra Carpentier · Michal Valko -
2014 Poster: Online combinatorial optimization with stochastic decision sets and adversarial losses »
Gergely Neu · Michal Valko -
2014 Poster: Sparse Multi-Task Reinforcement Learning »
Daniele Calandriello · Alessandro Lazaric · Marcello Restelli