26 Results
Workshop
|
Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities Vicky Zayats · Peter Chen · Melissa Ferrari · Dirk Padfield |
||
Workshop
|
CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios Luning Wang · Shiyao Li · Xuefei Ning · Zhihang Yuan · Shengen Yan · Guohao Dai · Yu Wang |
||
Poster
|
Efficient Adaptation of Pre-trained Vision Transformer via Householder Transformation Wei Dong · Yuan Sun · Yiting Yang · Xing Zhang · Zhijun Lin · Qingsen Yan · Haokui Zhang · Peng Wang · Yang Yang · Hengtao Shen |
||
Workshop
|
Composite Attention: A Framework for Combining Sequence Mixing Primitives Jake Cunningham · Marc Deisenroth |
||
Workshop
|
Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models Keivan Alizadeh-Vahid · Iman Mirzadeh · Hooman Shahrokhi · Dmitry Belenko · Frank Sun · Minsik Cho · Mohammad Hossein Sekhavat · Moin Nabi · Mehrdad Farajtabar |
||
Poster
|
Thu 11:00 |
HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning Chunlin Tian · Zhan Shi · Zhijiang Guo · Li Li · Cheng-Zhong Xu |
|
Oral
|
Thu 10:40 |
HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning Chunlin Tian · Zhan Shi · Zhijiang Guo · Li Li · Cheng-Zhong Xu |
|
Workshop
|
Beyond Parameter Count: Implicit Bias in Soft Mixture of Experts Youngseog Chung · Dhruv Malik · Jeff Schneider · Yuanzhi Li · Aarti Singh |
||
Workshop
|
EchoAtt: Attend, Copy, then Adjust for More Efficient Large Language Models Hossein Rajabzadeh · Aref Jafari · Aman Sharma · Benyamin Jami · Hyock Ju Kwon · Ali Ghodsi · Boxing Chen · Mehdi Rezagholizadeh |
||
Workshop
|
Sat 8:15 |
The Fourth Workshop on Efficient Natural Language and Speech Processing (ENLSP-IV): Highlighting New Architectures for Future Foundation Models Mehdi Rezagholizadeh · Peyman Passban · Yu Cheng · Soheila Samiee · Yue Dong · Vahid Partovi Nia · Qun Liu · Boxing Chen |
|
Workshop
|
Dense Backpropagation Improves Routing for Sparsely-Gated Mixture-of-Experts Ashwinee Panda · Vatsal Baherwani · Zain Sarwar · Benjamin Thérien · Sambit Sahu · Stephen Rawls · Supriyo Chakraborty · Tom Goldstein |
||
Workshop
|
StructMoE: Structured Mixture of Experts Using Low Rank Experts Zain Sarwar · Ashwinee Panda · Benjamin Thérien · Stephen Rawls · Anirban Das · Kartik Balasubramaniam · Berkcan Kapusuzoglu · Shixiong Zhang · Sambit Sahu · Milind Naphade · Supriyo Chakraborty