Type | Title | Authors
Poster | Improved Bayes Regret Bounds for Multi-Task Hierarchical Bayesian Bandit Algorithms | Jiechao Guan · Hui Xiong
Competition (Sat 14:50) | Combinatorial Bandits with Full Bandit Feedback | Vaneet Aggarwal
Workshop | Adaptive Transductive Inference via Sequential Experimental Design with Contextual Retention | Tareq Si Salem
Workshop | Prioritization Strategies for LLM-Designed Restless Bandit Rewards in Public Health | Shresth Verma · Niclas Boehmer · Lingkai Kong · Milind Tambe
Workshop | The Crucial Role of Samplers in Online Direct Preference Optimization | Ruizhe Shi · Runlong Zhou · Simon Du
Workshop | Fast Convergence of Softmax Policy Mirror Ascent for Bandits & Tabular MDPs | Reza Asad · Reza Babanezhad Harikandeh · Issam Hadj Laradji · Nicolas Le Roux · Sharan Vaswani
Workshop | Uncoupled and Convergent Learning in Monotone Games under Bandit Feedback | Jing Dong · Baoxiang Wang · Yaoliang Yu
Workshop | Order-Optimal Regret in Distributed Kernel Bandits using Uniform Sampling with Shared Randomness | Nikola Pavlovic · Sudeep Salgia · Qing Zhao
Workshop | Sharp Analysis for KL-Regularized Contextual Bandits and RLHF | Heyang Zhao · Chenlu Ye · Quanquan Gu · Tong Zhang
Workshop | Balancing Act: Prioritization Strategies for LLM-Designed Restless Bandit Rewards | Shresth Verma · Niclas Boehmer · Lingkai Kong · Milind Tambe
Workshop | A Unified Framework for Speculative Decoding with Multiple Drafters as a Bandit | Taehyeon Kim · Hojung Jung · Se-Young Yun
Workshop | Incentivized Exploration in Two-sided Matching Markets | Dung Ngo · Vamsi Potluru · Manuela Veloso