1374 Results
Poster | Wed 16:30 | BEACON: Benchmark for Comprehensive RNA Tasks and Language Models | Yuchen Ren · Zhiyuan Chen · Lifeng Qiao · Hongtai Jing · Yuchen Cai · Sheng Xu · Peng Ye · Xinzhu Ma · Siqi Sun · Hongliang Yan · Dong Yuan · Wanli Ouyang · Xihui Liu
Poster | | CoIN: A Benchmark of Continual Instruction Tuning for Multimodel Large Language Models | Cheng Chen · Junchen Zhu · Xu Luo · Hengtao Shen · Jingkuan Song · Lianli Gao
Poster | Fri 11:00 | Shopping MMLU: A Massive Multi-Task Online Shopping Benchmark for Large Language Models | Yilun Jin · Zheng Li · Chenwei Zhang · Tianyu Cao · Yifan Gao · Pratik Jayarao · Mao Li · Xin Liu · Ritesh Sarkhel · Xianfeng Tang · Haodong Wang · Zhengyang Wang · Wenju Xu · Jingfeng Yang · Qingyu Yin · Xian Li · Priyanka Nigam · Yi Xu · Kai Chen · Qiang Yang · Meng Jiang · Bing Yin
Poster | Wed 11:00 | Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning | Hao Shao · Shengju Qian · Han Xiao · Guanglu Song · ZHUOFAN ZONG · Letian Wang · Yu Liu · Hongsheng Li
Poster | | FVEL: Interactive Formal Verification Environment with Large Language Models via Theorem Proving | Xiaohan Lin · Qingxing Cao · Yinya Huang · Haiming Wang · Jianqiao Lu · Zhengying Liu · Linqi Song · Xiaodan Liang
Poster | Wed 11:00 | Automating Dataset Updates Towards Reliable and Timely Evaluation of Large Language Models | Jiahao Ying · Yixin Cao · Yushi Bai · QIANRU SUN · Bo Wang · Wei Tang · Zhaojun Ding · Yizhe Yang · Xuanjing Huang · Shuicheng Yan