45 Results
Poster | Wed 16:30 | Demystify Mamba in Vision: A Linear Attention Perspective | Dongchen Han · Ziyi Wang · Zhuofan Xia · Yizeng Han · Yifan Pu · Chunjiang Ge · Jun Song · Shiji Song · Bo Zheng · Gao Huang
Poster | Thu 16:30 | Transformers need glasses! Information over-squashing in language tasks | Federico Barbero · Andrea Banino · Steven Kapturowski · Dharshan Kumaran · João Madeira Araújo · Oleksandr Vitvitskyi · Razvan Pascanu · Petar Veličković
Poster | Fri 11:00 | Reducing Transformer Key-Value Cache Size with Cross-Layer Attention | William Brandon · Mayank Mishra · Aniruddha Nrusimha · Rameswar Panda · Jonathan Ragan-Kelley
Poster | Thu 11:00 | Nonlocal Attention Operator: Materializing Hidden Knowledge Towards Interpretable Physics Discovery | Yue Yu · Ning Liu · Fei Lu · Tian Gao · Siavash Jafarzadeh · Stewart A Silling
Poster | Wed 11:00 | Exploring Context Window of Large Language Models via Decomposed Positional Vectors | Zican Dong · Junyi Li · Xin Men · Xin Zhao · Bingning Wang · Zhen Tian · weipeng chen · Ji-Rong Wen
Poster | Thu 11:00 | KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization | Coleman Hooper · Sehoon Kim · Hiva Mohammadzadeh · Michael Mahoney · Sophia Shao · Kurt Keutzer · Amir Gholami
Poster | Wed 16:30 | Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length | Xuezhe Ma · Xiaomeng Yang · Wenhan Xiong · Beidi Chen · LILI YU · Hao Zhang · Jonathan May · Luke Zettlemoyer · Omer Levy · Chunting Zhou
Poster | Thu 16:30 | Gated Slot Attention for Efficient Linear-Time Sequence Modeling | Yu Zhang · Songlin Yang · Rui-Jie Zhu · Yue Zhang · Leyang Cui · Yiqiao Wang · Bolun Wang · Freda Shi · Bailin Wang · Wei Bi · Peng Zhou · Guohong Fu
Poster | Thu 16:30 | xLSTM: Extended Long Short-Term Memory | Maximilian Beck · Korbinian Pöppel · Markus Spanring · Andreas Auer · Oleksandra Prudnikova · Michael Kopp · Günter Klambauer · Johannes Brandstetter · Sepp Hochreiter