69 Results
| Type | Time | Title | Authors |
|---|---|---|---|
| Workshop | | Scaling Smart: Accelerating Large Language Model Pre-Training with Small Model Initialization | Mohammad Samragh · Iman Mirzadeh · Keivan Alizadeh-Vahid · Fartash Faghri · Minsik Cho · Moin Nabi · Devang Naik · Mehrdad Farajtabar |
| Workshop | | Enhanced label noise robustness through early adaptive filtering for the self-supervised speaker verification task | Abderrahim Fathan · Xiaolin Zhu · MD JAHANGIR ALAM |
| Poster | Fri 16:30 | Communication Efficient Distributed Training with Distributed Lion | Bo Liu · Lemeng Wu · Lizhang Chen · Kaizhao Liang · Jiaxu Zhu · Chen Liang · Raghuraman Krishnamoorthi · Qiang Liu |
| Workshop | | KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation | Rambod Azimi · Rishav Rishav · Marek Teichmann · Samira Ebrahimi Kahou |
| Poster | Thu 16:30 | VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections | Roy Miles · Pradyumna Reddy · Ismail Elezi · Jiankang Deng |
| Poster | Thu 16:30 | A Single-Step, Sharpness-Aware Minimization is All You Need to Achieve Efficient and Accurate Sparse Training | Jie Ji · Gen Li · Jingjing Fu · Fatemeh Afghah · Linke Guo · Xiaoyong Yuan · Xiaolong Ma |
| Workshop | | WASH: Train your Ensemble with Communication-Efficient Weight Shuffling, then Average | Louis Fournier · Adel Nabli · Masih Aminbeidokhti · Marco Pedersoli · Eugene Belilovsky · Edouard Oyallon |
| Workshop | | Computational Bottlenecks of Training Small-scale Large Language Models | Saleh Ashkboos · Iman Mirzadeh · Keivan Alizadeh-Vahid · Mohammad Hossein Sekhavat · Moin Nabi · Mehrdad Farajtabar · Fartash Faghri |
| Poster | Fri 11:00 | Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training | Wenyu Du · Tongxu Luo · Zihan Qiu · Zeyu Huang · Yikang Shen · Reynold Cheng · Yike Guo · Jie Fu |
| Workshop | | Pseudo-Asynchronous Local SGD: Robust and Efficient Data-Parallel Training | Hiroki Naganuma · Xinzhi Zhang · Man-Chung Yue · Ioannis Mitliagkas · Russell J. Hewett · Philipp Witte · Yin Tat Lee |
| Poster | Wed 16:30 | Regularized Adaptive Momentum Dual Averaging with an Efficient Inexact Subproblem Solver for Training Structured Neural Network | Zih-Syuan Huang · Ching-pei Lee |
| Workshop | | QuAILoRA: Quantization-Aware Initialization for LoRA | Neal G. Lawton · Aishwarya Padmakumar · Judith Gaspers · Jack FitzGerald · Anoop Kumar · Greg Ver Steeg · Aram Galstyan |