Poster
Censored Semi-Bandits: A Framework for Resource Allocation with Censored Feedback
Arun Verma · Manjesh Kumar Hanawal · Arun Rajkumar · Raman Sankaran

Thu Dec 12 10:45 AM -- 12:45 PM (PST) @ East Exhibition Hall B + C #40

In this paper, we study Censored Semi-Bandits, a novel variant of the semi-bandits problem. The learner is assumed to have a fixed amount of resources, which it allocates to the arms at each time step. The loss observed from an arm is random and depends on the amount of resources allocated to it. More specifically, the loss equals zero if the allocation for the arm exceeds a constant (but unknown) threshold that may depend on the arm. Our goal is to learn a feasible allocation that minimizes the expected loss. The problem is challenging because the loss distribution and threshold value of each arm are unknown. We study this novel setting by establishing its equivalence to Multiple-Play Multi-Armed Bandits (MP-MAB) and Combinatorial Semi-Bandits. Exploiting these equivalences, we derive optimal algorithms for our setting using existing algorithms for MP-MAB and Combinatorial Semi-Bandits. Experiments on synthetically generated data validate the performance guarantees of the proposed algorithms.
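The feedback model described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' algorithm: the function names (`expected_loss`, `sample_losses`), the Bernoulli loss distribution, and all numeric values are hypothetical choices made for illustration. The abstract only states that an arm's loss is zero when its allocation exceeds the arm's threshold, so the treatment of the boundary case (allocation exactly equal to the threshold) is likewise an assumption.

```python
import random


def expected_loss(allocation, thresholds, mean_losses):
    """Expected total loss of an allocation.

    An arm contributes its mean loss only when its allocation does NOT
    exceed its threshold; otherwise its loss is censored to zero.
    """
    return sum(
        mu
        for a, theta, mu in zip(allocation, thresholds, mean_losses)
        if a <= theta
    )


def sample_losses(allocation, thresholds, mean_losses, rng):
    """Draw one round of per-arm losses (Bernoulli losses are an assumption)."""
    losses = []
    for a, theta, mu in zip(allocation, thresholds, mean_losses):
        if a > theta:
            losses.append(0)  # allocation exceeds threshold: loss is zero
        else:
            losses.append(1 if rng.random() < mu else 0)
    return losses


# Hypothetical instance: three arms and a resource budget of 1.5.
thresholds = [0.5, 0.75, 0.25]    # unknown to the learner in the real problem
mean_losses = [0.5, 0.25, 0.125]  # also unknown to the learner
budget = 1.5

# Two feasible allocations that each cover two of the three arms.
cover_high_loss_arms = [0.625, 0.875, 0.0]
cover_low_loss_arms = [0.0, 0.875, 0.625]
assert sum(cover_high_loss_arms) <= budget
assert sum(cover_low_loss_arms) <= budget

rng = random.Random(0)
one_round = sample_losses(cover_high_loss_arms, thresholds, mean_losses, rng)
```

With these made-up numbers, leaving only the arm with the smallest mean loss uncovered gives a lower expected loss than leaving the arm with the largest mean loss uncovered. A learner in this setting would have to discover such an allocation without knowing `thresholds` or `mean_losses`, observing only sampled losses like `one_round`.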

Author Information

Arun Verma (Indian Institute of Technology Bombay)

Postdoctoral Research Fellow at National University of Singapore

Manjesh Kumar Hanawal (Indian Institute of Technology Bombay)

Manjesh K. Hanawal received the M.S. degree in ECE from the Indian Institute of Science, Bengaluru, India, in 2009, and the Ph.D. degree from INRIA, Sophia Antipolis, France, and the University of Avignon, Avignon, France, in 2013. He was a Scientist-B with the Center for Artificial Intelligence and Robotics, DRDO, Bengaluru, India. He was a Post-Doctoral Fellow with Boston University for two years. He is currently an Assistant Professor in industrial engineering and operations research with the Indian Institute of Technology Bombay, Mumbai, India. His research interests include performance evaluation, machine learning, and network economics. He is a recipient of the Inspire Faculty Award from DST and the Early Career Research Award from SERB, Govt. of India.