A version of the dueling bandit problem is addressed in which a Condorcet winner may not exist. Two algorithms are proposed that instead seek to minimize regret with respect to the Copeland winner, which, unlike the Condorcet winner, is guaranteed to exist. The first, Copeland Confidence Bound (CCB), is designed for small numbers of arms, while the second, Scalable Copeland Bandits (SCB), works better for large-scale problems. We provide theoretical results bounding the regret accumulated by CCB and SCB, both substantially improving on existing results. Existing results either offer bounds of the form O(K log T) but require restrictive assumptions, or offer bounds of the form O(K^2 log T) without requiring such assumptions. Our results offer the best of both worlds: O(K log T) bounds without restrictive assumptions.
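To make the distinction between the two winner notions concrete, here is a minimal Python sketch. It is not CCB or SCB; it only computes Copeland scores from a known preference matrix. The matrix `P`, where `P[i][j]` is the probability that arm `i` beats arm `j`, and all function names are illustrative assumptions rather than anything taken from the paper.

```python
from typing import List, Optional

# Hypothetical preference matrix: P[i][j] is the assumed probability that
# arm i beats arm j. Arms 0, 1, 2 form a cycle, so no arm beats all others.
P = [
    [0.5, 0.6, 0.4, 0.7],
    [0.4, 0.5, 0.6, 0.7],
    [0.6, 0.4, 0.5, 0.3],
    [0.3, 0.3, 0.7, 0.5],
]


def copeland_scores(P: List[List[float]]) -> List[int]:
    """For each arm, count the other arms it beats with probability > 0.5."""
    K = len(P)
    return [
        sum(1 for j in range(K) if j != i and P[i][j] > 0.5)
        for i in range(K)
    ]


def copeland_winners(P: List[List[float]]) -> List[int]:
    """Arms with the maximal Copeland score; at least one always exists."""
    scores = copeland_scores(P)
    best = max(scores)
    return [i for i, s in enumerate(scores) if s == best]


def condorcet_winner(P: List[List[float]]) -> Optional[int]:
    """The arm that beats every other arm, or None if there is no such arm."""
    K = len(P)
    for i, s in enumerate(copeland_scores(P)):
        if s == K - 1:
            return i
    return None


scores = copeland_scores(P)
winners = copeland_winners(P)
condorcet = condorcet_winner(P)
```

In this example no arm beats all three of the others, so `condorcet_winner` returns `None`, while `copeland_winners` still returns the arms that beat the most opponents.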
Author Information
Masrour Zoghi (University of Amsterdam)
Zohar Karnin (Yahoo Research)
Shimon Whiteson (University of Oxford)
Maarten de Rijke (University of Amsterdam)
More from the Same Authors
-
2021 Spotlight: Bayesian Bellman Operators »
Mattie Fellows · Kristian Hartikainen · Shimon Whiteson -
2021 : No DICE: An Investigation of the Bias-Variance Tradeoff in Meta-Gradients »
Risto Vuorio · Jacob Beck · Greg Farquhar · Jakob Foerster · Shimon Whiteson -
2021 : On the Practical Consistency of Meta-Reinforcement Learning Algorithms »
Zheng Xiong · Luisa Zintgraf · Jacob Beck · Risto Vuorio · Shimon Whiteson -
2021 : Model based multi-agent reinforcement learning with tensor decompositions »
Pascal van der Vaart · Anuj Mahajan · Shimon Whiteson -
2021 : Reinforcement Learning in Factored Action Spaces using Tensor Decompositions »
Anuj Mahajan · Mikayel Samvelyan · Lei Mao · Viktor Makoviichuk · Animesh Garg · Jean Kossaifi · Shimon Whiteson · Yuke Zhu · Anima Anandkumar -
2021 : Generalized Belief Learning in Multi-Agent Settings »
Darius Muglich · Luisa Zintgraf · Christian Schroeder de Witt · Shimon Whiteson · Jakob Foerster -
2022 Poster: In Defense of the Unitary Scalarization for Deep Multi-Task Learning »
Vitaly Kurin · Alessandro De Palma · Ilya Kostrikov · Shimon Whiteson · Pawan K Mudigonda -
2022 Poster: Truncated Emphatic Temporal Difference Methods for Prediction and Control »
Shangtong Zhang · Shimon Whiteson -
2022 Poster: Equivariant Networks for Zero-Shot Coordination »
Darius Muglich · Christian Schroeder de Witt · Elise van der Pol · Shimon Whiteson · Jakob Foerster -
2021 Poster: FACMAC: Factored Multi-Agent Centralised Policy Gradients »
Bei Peng · Tabish Rashid · Christian Schroeder de Witt · Pierre-Alexandre Kamienny · Philip Torr · Wendelin Boehmer · Shimon Whiteson -
2021 Poster: Bayesian Bellman Operators »
Mattie Fellows · Kristian Hartikainen · Shimon Whiteson -
2021 Poster: Regularized Softmax Deep Multi-Agent Q-Learning »
Ling Pan · Tabish Rashid · Bei Peng · Longbo Huang · Shimon Whiteson -
2021 Poster: Snowflake: Scaling GNNs to high-dimensional continuous control via parameter freezing »
Charles Blake · Vitaly Kurin · Maximilian Igl · Shimon Whiteson -
2020 Poster: Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning »
Tabish Rashid · Gregory Farquhar · Bei Peng · Shimon Whiteson -
2020 Poster: Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver? »
Vitaly Kurin · Saad Godil · Shimon Whiteson · Bryan Catanzaro -
2020 Poster: Learning Retrospective Knowledge with Reverse Reinforcement Learning »
Shangtong Zhang · Vivek Veeriah · Shimon Whiteson -
2020 Demonstration: RetaiL: Open your own grocery store to reduce waste »
Sami Jullien · Sebastian Schelter · Maarten de Rijke -
2019 : Poster and Coffee Break 2 »
Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall -
2019 : Poster Presentations »
Rahul Mehta · Andrew Lampinen · Binghong Chen · Sergio Pascual-Diaz · Jordi Grau-Moya · Aldo Faisal · Jonathan Tompson · Yiren Lu · Khimya Khetarpal · Martin Klissarov · Pierre-Luc Bacon · Doina Precup · Thanard Kurutach · Aviv Tamar · Pieter Abbeel · Jinke He · Maximilian Igl · Shimon Whiteson · Wendelin Boehmer · Raphaël Marinier · Olivier Pietquin · Karol Hausman · Sergey Levine · Chelsea Finn · Tianhe Yu · Lisa Lee · Benjamin Eysenbach · Emilio Parisotto · Eric Xing · Ruslan Salakhutdinov · Hongyu Ren · Anima Anandkumar · Deepak Pathak · Christopher Lu · Trevor Darrell · Alexei Efros · Phillip Isola · Feng Liu · Bo Han · Gang Niu · Masashi Sugiyama · Saurabh Kumar · Janith Petangoda · Johan Ferret · James McClelland · Kara Liu · Animesh Garg · Robert Lange -
2019 : Bayes-Adaptive Deep Reinforcement Learning via Meta-Learning - Invited Talk »
Shimon Whiteson -
2019 Poster: MAVEN: Multi-Agent Variational Exploration »
Anuj Mahajan · Tabish Rashid · Mikayel Samvelyan · Shimon Whiteson -
2019 Poster: Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Gradient Estimators for Reinforcement Learning »
Gregory Farquhar · Shimon Whiteson · Jakob Foerster -
2019 Poster: Multi-Agent Common Knowledge Reinforcement Learning »
Christian Schroeder de Witt · Jakob Foerster · Gregory Farquhar · Philip Torr · Wendelin Boehmer · Shimon Whiteson -
2019 Poster: DAC: The Double Actor-Critic Architecture for Learning Options »
Shangtong Zhang · Shimon Whiteson -
2019 Poster: Fast Efficient Hyperparameter Tuning for Policy Gradient Methods »
Supratik Paul · Vitaly Kurin · Shimon Whiteson -
2019 Poster: VIREL: A Variational Inference Framework for Reinforcement Learning »
Mattie Fellows · Anuj Mahajan · Tim G. J. Rudner · Shimon Whiteson -
2019 Spotlight: VIREL: A Variational Inference Framework for Reinforcement Learning »
Mattie Fellows · Anuj Mahajan · Tim G. J. Rudner · Shimon Whiteson -
2019 Poster: Generalized Off-Policy Actor-Critic »
Shangtong Zhang · Wendelin Boehmer · Shimon Whiteson -
2017 Poster: Dynamic-Depth Context Tree Weighting »
Joao V Messias · Shimon Whiteson -
2016 : Learning to Communicate with Deep Multi-Agent Reinforcement Learning »
Shimon Whiteson -
2016 Poster: Learning to Communicate with Deep Multi-Agent Reinforcement Learning »
Jakob Foerster · Yannis Assael · Nando de Freitas · Shimon Whiteson -
2016 Poster: Multi-armed Bandits: Competing with Optimal Sequences »
Zohar Karnin · Oren Anava -
2016 Poster: Verification Based Solution for Structured MAB Problems »
Zohar Karnin -
2013 Poster: Distributed Exploration in Multi-Armed Bandits »
Eshcar Hillel · Zohar Karnin · Tomer Koren · Ronny Lempel · Oren Somekh -
2013 Spotlight: Distributed Exploration in Multi-Armed Bandits »
Eshcar Hillel · Zohar Karnin · Tomer Koren · Ronny Lempel · Oren Somekh -
2013 Poster: Near-Optimal Entrywise Sampling for Data Matrices »
Dimitris Achlioptas · Zohar Karnin · Edo Liberty