45 Results
Type | Time | Title | Authors
Workshop | | Advancing Heat Demand Forecasting with Attention Mechanisms: Opportunities and Challenges | Adithya Ramachandran · Andreas Maier · Siming Bayer
Poster | Fri 16:30 | Dissecting the Interplay of Attention Paths in a Statistical Mechanics Theory of Transformers | Lorenzo Tiberi · Francesca Mignacco · Kazuki Irie · Haim Sompolinsky
Poster | Wed 11:00 | Linear Transformers are Versatile In-Context Learners | Max Vladymyrov · Johannes von Oswald · Mark Sandler · Rong Ge
Poster | Wed 11:00 | KV Cache is 1 Bit Per Channel: Efficient Large Language Model Inference with Coupled Quantization | Tianyi Zhang · Jonah Yi · Zhaozhuo Xu · Anshumali Shrivastava
Workshop | | Adversarial Training based Domain Adaptation for Cross-Subject Emotion Recognition | Sungpil Woo · MUHAMMAD ZUBAIR · Sunhwan Lim · Daeyoung Kim
Poster | Fri 11:00 | Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding | Zhenyu Zhang · Runjin Chen · Shiwei Liu · Zhewei Yao · Olatunji Ruwase · Beidi Chen · Xiaoxia Wu · Zhangyang "Atlas" Wang
Poster | Wed 11:00 | Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers | Yibo Jiang · Goutham Rajendran · Pradeep Ravikumar · Bryon Aragam
Poster | Thu 16:30 | FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision | Jay Shah · Ganesh Bikshandi · Ying Zhang · Vijay Thakkar · Pradeep Ramani · Tri Dao
Poster | Fri 11:00 | Loki: Low-rank Keys for Efficient Sparse Attention | Prajwal Singhania · Siddharth Singh · Shwai He · Soheil Feizi · Abhinav Bhatele
Poster | Wed 16:30 | Order-Independence Without Fine Tuning | Reid McIlroy-Young · Katrina Brown · Conlan Olson · Linjun Zhang · Cynthia Dwork
Poster | Thu 16:30 | Bridging the Divide: Reconsidering Softmax and Linear Attention | Dongchen Han · Yifan Pu · Zhuofan Xia · Yizeng Han · Xuran Pan · Xiu Li · Jiwen Lu · Shiji Song · Gao Huang
Poster | Wed 16:30 | QT-ViT: Improving Linear Attention in ViT with Quadratic Taylor Expansion | Yixing Xu · Chao Li · Dong Li · Xiao Sheng · Fan Jiang · Lu Tian · Emad Barsoum