99 Results
Poster | Fri 16:30 | Thompson Sampling For Combinatorial Bandits: Polynomial Regret and Mismatched Sampling Paradox | Raymond Zhang · Richard Combes
Poster | Fri 16:30 | Soft-Label Integration for Robust Toxicity Classification | Zelei Cheng · Xian Wu · Jiahao Yu · Shuo Han · Xin-Qiang Cai · Xinyu Xing
Poster | Wed 11:00 | Exploratory Retrieval-Augmented Planning For Continual Embodied Instruction Following | Minjong Yoo · Jinwoo Jang · Wei-Jin Park · Honguk Woo
Poster | Fri 16:30 | Preference-based Pure Exploration | Apurv Shukla · Debabrota Basu
Poster | Wed 16:30 | Policy Mirror Descent with Lookahead | Kimon Protopapas · Anas Barakat
Poster | Fri 16:30 | Towards Diverse Device Heterogeneous Federated Learning via Task Arithmetic Knowledge Integration | Mahdi Morafah · Vyacheslav Kungurtsev · Hojin Chang · Chen Chen · Bill Lin
Poster | Wed 11:00 | Graph Learning for Numeric Planning | Dillon Chen · Sylvie Thiebaux
Poster | Fri 11:00 | How Does Variance Shape the Regret in Contextual Bandits? | Zeyu Jia · Jian Qian · Alexander Rakhlin · Chen-Yu Wei
Poster | Wed 16:30 | Deterministic Policies for Constrained Reinforcement Learning in Polynomial Time | Jeremy McMahan
Poster | Wed 11:00 | Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging | Zhenyi Lu · Chenghao Fan · Wei Wei · Xiaoye Qu · Dangyang Chen · Yu Cheng
Poster | Fri 16:30 | ALPINE: Unveiling The Planning Capability of Autoregressive Learning in Language Models | Siwei Wang · Yifei Shen · Shi Feng · Haoran Sun · Shang-Hua Teng · Wei Chen
Poster | Fri 16:30 | What type of inference is planning? | Miguel Lazaro-Gredilla · Li Ku · Kevin Murphy · Dileep George