
Poster
Adversarial Auto-Augment with Label Preservation: A Representation Learning Principle Guided Approach
Kaiwen Yang · Yanchao Sun · Jiahao Su · Fengxiang He · Xinmei Tian · Furong Huang · Tianyi Zhou · Dacheng Tao

Tue Nov 29 09:30 AM -- 11:00 AM (PST) @ Hall J #101

Data augmentation is a critical contributing factor to the success of deep learning, but it heavily relies on prior domain knowledge that is not always available. Recent works on automatic data augmentation learn a policy to form a sequence of augmentation operations, which are still pre-defined and restricted to limited options. In this paper, we show that the objective of prior-free autonomous data augmentation can be derived from a representation learning principle that aims to preserve the minimum sufficient information of the labels. Given an example, the objective aims at creating a distant "hard positive example" as the augmentation, while still preserving the original label. We then propose a practical surrogate to the objective that can be optimized efficiently and integrated seamlessly into existing methods for a broad class of machine learning tasks, e.g., supervised, semi-supervised, and noisy-label learning. Unlike previous works, our method does not require training an extra generative model but instead leverages the intermediate layer representations of the end-task model for generating data augmentations. In experiments, we show that our method consistently brings non-trivial improvements to the three aforementioned learning tasks in both efficiency and final performance, whether or not combined with pre-defined augmentations, e.g., on medical images where domain knowledge is unavailable and existing augmentation techniques perform poorly. Code will be released publicly.
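The core idea in the abstract (adversarially pushing a representation toward a "hard positive" while keeping the original label) can be sketched as follows. This is a minimal illustration only: the function name `hard_positive`, the logistic-regression stand-in for the end-task model, and the simple accept/reject label check are assumptions for the sketch, not the paper's actual surrogate objective or architecture.

```python
import numpy as np

def hard_positive(z, y, w, b, step=0.5, n_steps=10):
    """Gradient-ascend the loss in feature space to get a harder
    positive, but only accept steps that keep the model's predicted
    label equal to the original label (illustrative sketch)."""
    z_aug = z.copy()
    for _ in range(n_steps):
        p = 1.0 / (1.0 + np.exp(-(w @ z_aug + b)))  # sigmoid prediction
        grad = (p - y) * w                          # d(BCE loss)/dz
        cand = z_aug + step * grad                  # ascend the loss
        # Label-preservation check: keep the step only if the model
        # still predicts the original label for the candidate.
        p_cand = 1.0 / (1.0 + np.exp(-(w @ cand + b)))
        if (p_cand > 0.5) == (y == 1):
            z_aug = cand
        else:
            break
    return z_aug
```

In the paper's setting, `z` would be an intermediate-layer representation of the end-task model rather than an input to a toy classifier; the sketch only shows the "distant but label-preserving" trade-off.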

#### Author Information

##### Fengxiang He (JD.com Inc)

Fengxiang He received his BSc in statistics from the University of Science and Technology of China, and his MPhil and PhD in computer science from the University of Sydney. He is currently an algorithm scientist at JD Explore Academy, JD.com Inc., leading its trustworthy AI team. His research interest is in the theory and practice of trustworthy AI, including deep learning theory, privacy-preserving ML, decentralized learning, and their applications. He has published in prominent journals and conferences, including TNNLS, TMM, TCSVT, ICML, NeurIPS, ICLR, CVPR, and ICCV. He has served as an area chair of AISTATS, BMVC, and ACML, and is the leading author of several standards on trustworthy AI.

##### Tianyi Zhou (University of Washington, Seattle)

Tianyi Zhou is a Ph.D. student in Computer Science at the University of Washington and a member of the MELODI lab led by Prof. Jeff A. Bilmes. He will be joining the University of Maryland, College Park as a tenure-track assistant professor in the Department of Computer Science, affiliated with UMIACS, in 2022. His research interests are in machine learning, optimization, and natural language processing. He has published ~60 papers at NeurIPS, ICML, ICLR, AISTATS, EMNLP, NAACL, COLING, KDD, ICDM, AAAI, IJCAI, ISIT, Machine Learning (Springer), IEEE TIP/TNNLS/TKDE, etc. He is the recipient of the Best Student Paper Award at ICDM 2013 and the 2020 IEEE TCSC Most Influential Paper Award.