Image classification models tend to make decisions based on peripheral attributes of data items that have a strong correlation with a target variable (i.e., dataset bias). These biased models suffer from poor generalization when evaluated on unbiased datasets. Existing debiasing approaches often identify and emphasize samples without such correlation (i.e., bias-conflicting samples) without defining the bias type in advance. However, such bias-conflicting samples are significantly scarce in biased datasets, limiting the debiasing capability of these approaches. This paper first presents an empirical analysis revealing that training with "diverse" bias-conflicting samples beyond a given training set is crucial both for debiasing and for generalization. Based on this observation, we propose a novel feature-level data augmentation technique that synthesizes diverse bias-conflicting samples. To this end, our method learns a disentangled representation of (1) intrinsic attributes (i.e., those inherently defining a certain class) and (2) bias attributes (i.e., peripheral attributes causing the bias) from a large number of bias-aligned samples, whose bias attributes have a strong correlation with the target variable. Using this disentangled representation, we synthesize bias-conflicting samples that contain the diverse intrinsic attributes of bias-aligned samples by swapping their latent features. By utilizing these diversified bias-conflicting features during training, our approach achieves superior classification accuracy and debiasing results over existing baselines on both synthetic and real-world datasets.
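The core augmentation step described in the abstract, swapping latent features so that each sample's intrinsic attributes are paired with another sample's bias attributes, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `swap_bias_features` is hypothetical, and the intrinsic/bias feature arrays are assumed to come from two already-trained disentangled encoders.

```python
import numpy as np

def swap_bias_features(z_intrinsic, z_bias, rng=None):
    """Synthesize bias-conflicting features by pairing each sample's
    intrinsic features with a randomly chosen other sample's bias features.

    z_intrinsic: (N, d_i) array from the intrinsic-attribute encoder (assumed).
    z_bias:      (N, d_b) array from the bias-attribute encoder (assumed).
    Returns an (N, d_i + d_b) array of augmented feature vectors.
    """
    rng = rng or np.random.default_rng(0)
    perm = rng.permutation(len(z_bias))
    # Keep each sample's intrinsic features, but attach the bias features
    # of a permuted (i.e., mismatched) sample — a feature-level analogue
    # of a bias-conflicting sample.
    return np.concatenate([z_intrinsic, z_bias[perm]], axis=1)

# Toy example: 4 samples with 2-dim intrinsic and 2-dim bias features.
z_i = np.arange(8, dtype=float).reshape(4, 2)
z_b = -np.arange(8, dtype=float).reshape(4, 2)
augmented = swap_bias_features(z_i, z_b)
```

In training, such augmented vectors would be fed to the classifier alongside the original bias-aligned features, so the model cannot rely on the (now decorrelated) bias attributes to predict the label.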
Author Information
Jungsoo Lee (Korea Advanced Institute of Science and Technology)
Eungyeup Kim (Kakao Enterprise)
Juyoung Lee (Kakao Enterprise)
Jihyeon Lee (Korea Advanced Institute of Science and Technology)
Jaegul Choo (Korea Advanced Institute of Science and Technology)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Oral: Learning Debiased Representation via Disentangled Feature Augmentation
  Tue. Dec 7th, 09:40 -- 09:55 AM