Sparse Mixture-of-Experts are Domain Generalizable Learners
Bo Li · Yifei Shen · Jingkang Yang · Yezhen Wang · Jiawei Ren · Tong Che · Jun Zhang · Ziwei Liu
Event URL: https://openreview.net/forum?id=GYS5EBMEK6

In domain generalization (DG), most existing methods focus on loss function design. This paper explores an orthogonal direction: the design of the backbone architecture. It is motivated by the empirical finding that transformer-based models trained with empirical risk minimization (ERM) outperform CNN-based models that employ state-of-the-art (SOTA) DG algorithms on multiple DG datasets. We develop a formal framework that characterizes a network's robustness to distribution shifts by studying how well its architecture aligns with the correlations in the dataset. This analysis guides us to propose a novel DG model built upon vision transformers, namely Generalizable Mixture-of-Experts (GMoE). Experiments on DomainBed demonstrate that GMoE trained with ERM outperforms SOTA DG baselines by a large margin.

Author Information

Bo Li (Nanyang Technological University)
Yifei Shen (HKUST)
Jingkang Yang (Nanyang Technological University)
Yezhen Wang (Montreal Institute for Learning Algorithms, Université de Montréal)
Jiawei Ren (Nanyang Technological University)
Tong Che (MILA, Montreal)
Jun Zhang (The Hong Kong University of Science and Technology)

Jun Zhang received his Ph.D. degree in Electrical and Computer Engineering from the University of Texas at Austin. He is an Associate Professor in the Department of Electronic and Computer Engineering at the Hong Kong University of Science and Technology. His research interests include wireless communications and networking, mobile edge computing and edge AI, and cooperative AI. He is a co-recipient of several best paper awards, including the 2021 Best Survey Paper Award of IEEE Communications Society, the 2019 IEEE Communications Society & Information Theory Society Joint Paper Award, and the 2016 Marconi Prize Paper Award in Wireless Communications. Two papers he co-authored received the Young Author Best Paper Award of the IEEE Signal Processing Society in 2016 and 2018, respectively. He also received the 2016 IEEE ComSoc Asia-Pacific Best Young Researcher Award. He is an IEEE Fellow.

Ziwei Liu (Nanyang Technological University)

More from the Same Authors