
SWAD: Domain Generalization by Seeking Flat Minima
Junbum Cha · Sanghyuk Chun · Kyungjae Lee · Han-Cheol Cho · Seunghyun Park · Yunsung Lee · Sungrae Park

Thu Dec 09 12:30 AM -- 02:00 AM (PST)

Domain generalization (DG) methods aim to generalize to an unseen target domain using only training data from the source domains. Although a variety of DG methods have been proposed, a recent study shows that under a fair evaluation protocol, called DomainBed, the simple empirical risk minimization (ERM) approach performs comparably to or even outperforms previous methods. Unfortunately, simply solving ERM on a complex, non-convex loss function can easily lead to sub-optimal generalizability by seeking sharp minima. In this paper, we theoretically show that finding flat minima results in a smaller domain generalization gap. We also propose a simple yet effective method, named Stochastic Weight Averaging Densely (SWAD), to find flat minima. Through a dense and overfit-aware stochastic weight sampling strategy, SWAD finds flatter minima and suffers less from overfitting than vanilla SWA. SWAD achieves state-of-the-art performance on five DG benchmarks, namely PACS, VLCS, OfficeHome, TerraIncognita, and DomainNet, with consistent and large margins of +1.6% on average in out-of-domain accuracy. We also compare SWAD with conventional generalization methods, such as data augmentation and consistency regularization methods, to verify that the remarkable performance improvements originate from seeking flat minima, not from better in-domain generalizability. Last but not least, SWAD is readily adaptable to existing DG methods without modification; combining SWAD with an existing DG method further improves DG performance. Source code is available at https://github.com/khanrc/swad.
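To make the idea of dense weight averaging concrete, the following is a minimal sketch, not the authors' implementation (see the repository linked above for that). It assumes the averaging window is already known: the function name `dense_weight_average` and the `start`/`end` arguments are hypothetical, and the overfit-aware selection of that window is not reproduced here. The toy optimization loop is likewise only an illustration on a quadratic loss with noisy gradients.

```python
import numpy as np


def dense_weight_average(weight_snapshots, start, end):
    """Average per-iteration weight vectors over the inclusive window [start, end].

    Hypothetical helper: the window bounds are taken as given, whereas an
    overfit-aware strategy would choose them from validation behaviour.
    """
    avg = None
    count = 0
    for step, w in enumerate(weight_snapshots):
        if step < start or step > end:
            continue  # only iterations inside the window contribute
        w = np.asarray(w, dtype=float)
        count += 1
        if avg is None:
            avg = w.copy()
        else:
            # incremental mean: avoids storing a separate running sum
            avg += (w - avg) / count
    if avg is None:
        raise ValueError("averaging window contains no snapshots")
    return avg


# Toy usage: noisy gradient descent on f(w) = 0.5 * ||w||^2,
# keeping a snapshot of the weights at every iteration.
rng = np.random.default_rng(0)
w = np.array([2.0, -1.0])
snapshots = []
for _ in range(200):
    noisy_grad = w + rng.normal(scale=0.5, size=2)
    w = w - 0.1 * noisy_grad
    snapshots.append(w.copy())

# Assumed window: iterations 50..199 (an arbitrary choice for this sketch).
averaged = dense_weight_average(snapshots, start=50, end=199)
```

Sampling a snapshot at every iteration, rather than once per epoch, is what "dense" refers to in this sketch; the averaged vector is simply the mean of all snapshots inside the window.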

Author Information

Junbum Cha (Kakao Brain)

AI Researcher @ Kakao Brain

Sanghyuk Chun (NAVER AI Lab)

I'm a research scientist and tech leader at NAVER AI Lab, working on machine learning and its applications. In particular, my research interests focus on bridging the gap between two gigantic topics: reliable machine learning tasks (e.g., robustness [C3, C9, C10, W1, W3], de-biasing or domain generalization [C6, A6], uncertainty estimation [C11, A3], explainability [C5, C11, A2, A4, W2], and fair evaluation [C5, C11]) and learning with limited annotations (e.g., multi-modal learning [C11], weakly-supervised learning [C2, C3, C4, C5, C7, C8, C12, W2, W4, W5, W6, A2, A4], and self-supervised learning). I have contributed large-scale machine learning algorithms [C3, C9, C10, C13] at NAVER AI Lab as well. Prior to working at NAVER, I worked as a research engineer on the advanced recommendation team (ART) at Kakao from 2016 to 2018. I received a master's degree in Electrical Engineering from Korea Advanced Institute of Science and Technology (KAIST) in 2016. During my master's degree, I researched a scalable algorithm for robust subspace clustering (the algorithm is based on robust PCA and k-means clustering). Before my master's study, I worked at IUM-SOCIUS in 2012 as a software engineering intern. I also did research internships at the Networked and Distributed Computing System Lab at KAIST and at NAVER Labs during summer 2013 and fall 2015, respectively.

Kyungjae Lee (Seoul National University)
Han-Cheol Cho (NAVER)

Studied natural language processing in graduate school. Now working on DL for computer vision.

Seunghyun Park (Clova AI Research, Naver Corp.)
Yunsung Lee (Korea University)
Sungrae Park (UPSTAGE)

More from the Same Authors

  • 2022 Poster: A Unified Analysis of Mixed Sample Data Augmentation: A Loss Function Perspective
    Chanwoo Park · Sangdoo Yun · Sanghyuk Chun
  • 2021 Workshop: ImageNet: Past, Present, and Future
    Zeynep Akata · Lucas Beyer · Sanghyuk Chun · A. Sophia Koepke · Diane Larlus · Seong Joon Oh · Rafael Rezende · Sangdoo Yun · Xiaohua Zhai
  • 2021 Poster: CATs: Cost Aggregation Transformers for Visual Correspondence
    Seokju Cho · Sunghwan Hong · Sangryul Jeon · Yunsung Lee · Kwanghoon Sohn · Seungryong Kim
  • 2021 Poster: Neural Hybrid Automata: Learning Dynamics With Multiple Modes and Stochastic Transitions
    Michael Poli · Stefano Massaroli · Luca Scimeca · Sanghyuk Chun · Seong Joon Oh · Atsushi Yamashita · Hajime Asama · Jinkyoo Park · Animesh Garg
  • 2019 : Posters
    Timo Denk · Ioannis Androutsopoulos · Oleg Bakhteev · Mohamed Kane · Petar Stojanov · Seunghyun Park · Bharat Mamidibathula · Kostiantyn Liepieshov · Johannes Höhne · Song Feng · Zikri Bayraktar · Kehinde Aruleba · ALEKSANDR OGALTSOV · Rita Kuznetsova · Paul Bennett · Saghar Hosseini · Kshtij Fadnis · Luis Lastras · Mehrdad Jabbarzadeh Gangeh · Christian Reisswig · Emad Elwany · Ilias Chalkidis · Jonathan DeGange · Kaixuan Zhang · Luke de Oliveira · Muhammed Koçyiğit · Haoyu Dong · Vera Liao · Wonseok Hwang
  • 2019 : Lunch + Poster Session
    Frederik Gerzer · Bill Yang Cai · Pieter-Jan Hoedt · Kelly Kochanski · Soo Kyung Kim · Yunsung Lee · Sunghyun Park · Sharon Zhou · Martin Gauch · Jonathan Wilson · Joyjit Chatterjee · Shamindra Shrotriya · Dimitri Papadimitriou · Christian Schön · Valentina Zantedeschi · Gabriella Baasch · Willem Waegeman · Gautier Cosne · Dara Farrell · Brendan Lucier · Letif Mones · Caleb Robinson · Tafara Chitsiga · Victor Kristof · Hari Prasanna Das · Yimeng Min · Alexandra Puchko · Alexandra Luccioni · Kyle Story · Jason Hickey · Yue Hu · Björn Lütjens · Zhecheng Wang · Renzhi Jing · Genevieve Flaspohler · Jingfan Wang · Saumya Sinha · Qinghu Tang · Armi Tiihonen · Ruben Glatt · Muge Komurcu · Jan Drgona · Juan Gomez-Romero · Ashish Kapoor · Dylan J Fitzpatrick · Alireza Rezvanifar · Adrian Albert · Olya (Olga) Irzak · Kara Lamb · Ankur Mahesh · Kiwan Maeng · Frederik Kratzert · Sorelle Friedler · Niccolo Dalmasso · Alex Robson · Lindiwe Malobola · Lucas Maystre · Yu-wen Lin · Surya Karthik Mukkavili · Brian Hutchinson · Alexandre Lacoste · Yanbing Wang · Zhengcheng Wang · Yinda Zhang · Victoria Preston · Jacob Pettit · Draguna Vrabie · Miguel Molina-Solana · Tonio Buonassisi · Andrew Annex · Tunai P Marques · Catalin Voss · Johannes Rausch · Max Evans
  • 2018 Poster: Maximum Causal Tsallis Entropy Imitation Learning
    Kyungjae Lee · Sungjoon Choi · Songhwai Oh