


67 Results

Workshop
Sat 11:42 GEAR: An Efficient Error Reduction Framework for KV Cache Compression in LLM Inference
· Qingru Zhang · Souvik Kundu · Geonhwa Jeong · Zaoxing Liu · Tushar Krishna · Tuo Zhao
Workshop
Approximate Top-k for Increased Parallelism
Oscar Key · Luka Ribar · Alberto Cattaneo · Luke Hudlass-Galley · Douglas Orr
Workshop
Fused-Layer CNNs for Memory-Efficient Inference on Microcontrollers
Mark Deutel · Frank Hannig · Christopher Mutschler · Jürgen Teich
Workshop
Accelerating Inference of Retrieval-Augmented Generation via Sparse Context Selection
Yun Zhu · Jia-Chen Gu · Caitlin Sikora · Ho Ko · Yinxiao Liu · Chu-Cheng Lin · Lei Shu · Liangchen Luo · Lei Meng · Bang Liu · Jindong Chen
Poster
Fri 16:30 Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation
Wangbo Zhao · Jiasheng Tang · Yizeng Han · Yibing Song · Kai Wang · Gao Huang · Fan Wang · Yang You
Poster
Wed 16:30 Toward Efficient Inference for Mixture of Experts
Haiyang Huang · Newsha Ardalani · Anna Sun · Liu Ke · Shruti Bhosale · Hsien-Hsin Lee · Carole-Jean Wu · Benjamin Lee
Workshop
Optimizing the IFMIF-DONES Particle Accelerator with Differentiable Deep Learning Surrogate Models
Galo Gallardo · Guillermo Rodriguez Llorente · Lucas Magariños · Rodrigo Morant Navascués · Nikita Kkhvatkin Petrovsky · Roberto Gómez-Espinosa Martín