

Order Matters in the Presence of Dataset Imbalance for Multilingual Learning

Dami Choi · Derrick Xin · Hamid Dadkhahi · Justin Gilmer · Ankush Garg · Orhan Firat · Chih-Kuan Yeh · Andrew Dai · Behrooz Ghorbani

Great Hall & Hall B1+B2 (level 1) #1009
Tue 12 Dec 8:45 a.m. PST — 10:45 a.m. PST


In this paper, we empirically study the optimization dynamics of multi-task learning, focusing on the dynamics that govern a collection of tasks with significant data imbalance. We present a simple yet effective method: pre-training on high-resource tasks, followed by fine-tuning on a mixture of high- and low-resource tasks. We provide a thorough empirical study and analysis of this method's benefits, showing that it achieves consistent improvements relative to the performance trade-off profile of standard static weighting. We analyze the data regimes in which this method is applicable and demonstrate its improvements empirically in neural machine translation (NMT) and multilingual language modeling.
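The two-phase schedule described in the abstract can be illustrated with a minimal Python sketch of a task sampler. This is not the authors' implementation: the function names, the `pretrain_steps` and `low_weight` parameters, the uniform split within each task group, and the task labels in the usage note are all hypothetical choices made for illustration.

```python
import random
from typing import Dict, List


def task_sampling_probs(
    step: int,
    pretrain_steps: int,
    high_resource: List[str],
    low_resource: List[str],
    low_weight: float = 0.5,
) -> Dict[str, float]:
    """Return a task -> sampling probability map for the given training step.

    Assumption (not from the paper): tasks inside each group are sampled
    uniformly, and `low_weight` is the total probability mass given to the
    low-resource group during the fine-tuning phase.
    """
    if not high_resource or not low_resource:
        raise ValueError("both task groups must be non-empty")
    if not 0.0 <= low_weight <= 1.0:
        raise ValueError("low_weight must be in [0, 1]")

    if step < pretrain_steps:
        # Phase 1: pre-train on high-resource tasks only.
        probs = {t: 1.0 / len(high_resource) for t in high_resource}
        probs.update({t: 0.0 for t in low_resource})
        return probs

    # Phase 2: fine-tune on a mixture of high- and low-resource tasks.
    probs = {t: (1.0 - low_weight) / len(high_resource) for t in high_resource}
    probs.update({t: low_weight / len(low_resource) for t in low_resource})
    return probs


def sample_task(
    step: int,
    pretrain_steps: int,
    high_resource: List[str],
    low_resource: List[str],
    rng: random.Random,
    low_weight: float = 0.5,
) -> str:
    """Draw the task whose batch would be used at this training step."""
    probs = task_sampling_probs(
        step, pretrain_steps, high_resource, low_resource, low_weight
    )
    tasks = list(probs)
    return rng.choices(tasks, weights=[probs[t] for t in tasks], k=1)[0]
```

With hypothetical task labels, `task_sampling_probs(0, 100, ["high"], ["low"])` puts all probability on `"high"`, while the same call at step 100 splits it according to `low_weight`. By contrast, the static weighting baseline mentioned in the abstract would correspond to using one fixed probability map for every step.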
