Search All 2024 Events
 

69 Results

Page 5 of 6
Workshop
Scaling Smart: Accelerating Large Language Model Pre-Training with Small Model Initialization
Mohammad Samragh · Iman Mirzadeh · Keivan Alizadeh-Vahid · Fartash Faghri · Minsik Cho · Moin Nabi · Devang Naik · Mehrdad Farajtabar
Workshop
Enhanced label noise robustness through early adaptive filtering for the self-supervised speaker verification task
Abderrahim Fathan · Xiaolin Zhu · MD JAHANGIR ALAM
Poster
Fri 16:30 Communication Efficient Distributed Training with Distributed Lion
Bo Liu · Lemeng Wu · Lizhang Chen · Kaizhao Liang · Jiaxu Zhu · Chen Liang · Raghuraman Krishnamoorthi · Qiang Liu
Workshop
KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation
Rambod Azimi · Rishav Rishav · Marek Teichmann · Samira Ebrahimi Kahou
Poster
Thu 16:30 VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections
Roy Miles · Pradyumna Reddy · Ismail Elezi · Jiankang Deng
Poster
Thu 16:30 A Single-Step, Sharpness-Aware Minimization is All You Need to Achieve Efficient and Accurate Sparse Training
Jie Ji · Gen Li · Jingjing Fu · Fatemeh Afghah · Linke Guo · Xiaoyong Yuan · Xiaolong Ma
Workshop
WASH: Train your Ensemble with Communication-Efficient Weight Shuffling, then Average
Louis Fournier · Adel Nabli · Masih Aminbeidokhti · Marco Pedersoli · Eugene Belilovsky · Edouard Oyallon
Workshop
Computational Bottlenecks of Training Small-scale Large Language Models
Saleh Ashkboos · Iman Mirzadeh · Keivan Alizadeh-Vahid · Mohammad Hossein Sekhavat · Moin Nabi · Mehrdad Farajtabar · Fartash Faghri
Poster
Fri 11:00 Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training
Wenyu Du · Tongxu Luo · Zihan Qiu · Zeyu Huang · Yikang Shen · Reynold Cheng · Yike Guo · Jie Fu
Workshop
Pseudo-Asynchronous Local SGD: Robust and Efficient Data-Parallel Training
Hiroki Naganuma · Xinzhi Zhang · Man-Chung Yue · Ioannis Mitliagkas · Russell J. Hewett · Philipp Witte · Yin Tat Lee
Poster
Wed 16:30 Regularized Adaptive Momentum Dual Averaging with an Efficient Inexact Subproblem Solver for Training Structured Neural Network
Zih-Syuan Huang · Ching-pei Lee
Workshop
QuAILoRA: Quantization-Aware Initialization for LoRA
Neal G. Lawton · Aishwarya Padmakumar · Judith Gaspers · Jack FitzGerald · Anoop Kumar · Greg Ver Steeg · Aram Galstyan