
ORIENT: Submodular Mutual Information Measures for Data Subset Selection under Distribution Shift
Athresh Karanam · Krishnateja Killamsetty · Harsha Kokel · Rishabh Iyer

Wed Nov 30 09:00 AM -- 11:00 AM (PST) @ Hall J #421

Real-world machine-learning applications require robust models that generalize well under distribution shift, a common occurrence in practice. Domain adaptation techniques address distribution shift by minimizing the disparities between domains so that a model trained on the source domain performs well on the target domain. However, existing domain adaptation methods are computationally very expensive. In this work, we aim to improve the efficiency of existing supervised domain adaptation (SDA) methods by training on a subset of source data that is similar to the target data. Specifically, we propose ORIENT, a subset selection framework that uses submodular mutual information (SMI) functions to select a source-data subset similar to the target data for faster training. Additionally, we show how existing robust subset selection strategies, such as GLISTER, GRADMATCH, and CRAIG, when used with a held-out query set, fit within our proposed framework, and we draw the connections between them. Finally, we empirically demonstrate that SDA approaches like d-SNE, CCSA, and standard cross-entropy training, when employed together with ORIENT, achieve a) faster training and b) better performance on the target data.
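The core idea of selecting a source subset by maximizing an SMI function with the target data as the query set can be illustrated with the Graph-Cut SMI instantiation, I(A; Q) = 2 · Σ_{i∈A, j∈Q} s_ij, where s_ij is a similarity between source point i and target point j. The sketch below is a minimal, hypothetical illustration of this idea (not the authors' implementation); `gcmi_select` and the toy features are assumptions for demonstration. Since Graph-Cut SMI is modular in the selected set A, greedy maximization reduces to picking the top-k scoring source points.

```python
import numpy as np

def gcmi_select(source_feats, target_feats, k):
    """Select k source points maximizing the Graph-Cut SMI
    I(A; Q) = 2 * sum_{i in A, j in Q} s_ij, with s_ij the cosine
    similarity between source point i and target (query) point j.
    This objective is modular in A, so the greedy optimum is simply
    the k highest-scoring source points."""
    # Normalize rows so dot products equal cosine similarities.
    S = source_feats / np.linalg.norm(source_feats, axis=1, keepdims=True)
    Q = target_feats / np.linalg.norm(target_feats, axis=1, keepdims=True)
    # Score of each source point: total similarity to the target set.
    scores = 2.0 * (S @ Q.T).sum(axis=1)
    # Return indices of the k highest-scoring source points.
    return np.argsort(-scores)[:k]

# Toy example: source points drawn near the target cluster should win.
rng = np.random.default_rng(0)
target = rng.normal(loc=5.0, size=(20, 8))   # target-domain features
near = rng.normal(loc=5.0, size=(30, 8))     # source points close to target
far = rng.normal(loc=-5.0, size=(30, 8))     # source points far from target
source = np.vstack([near, far])
chosen = gcmi_select(source, target, k=10)   # indices into `source`
```

Other SMI functions (e.g., facility-location SMI) are genuinely submodular rather than modular and require a lazy-greedy loop, but the query-driven selection pattern is the same.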

Author Information

Athresh Karanam (University of Texas, Dallas)
Krishnateja Killamsetty (University of Texas, Dallas)
Harsha Kokel (University of Texas, Dallas)
Rishabh Iyer (University of Texas, Dallas)

Bio: Prof. Rishabh Iyer is currently an Assistant Professor at the University of Texas, Dallas, where he leads the CARAML Lab. He is also a Visiting Assistant Professor at the Indian Institute of Technology, Bombay. He completed his Ph.D. in 2015 at the University of Washington, Seattle. He is excited about making ML more efficient (in both computation and labeling), robust, and fair. He received best paper awards at Neural Information Processing Systems (NeurIPS/NIPS) in 2013 and the International Conference on Machine Learning (ICML) in 2013, and an Honorable Mention at CODS-COMAD in 2021. He has also won a Microsoft Research Ph.D. Fellowship, a Facebook Ph.D. Fellowship, and the Yang Award for Outstanding Graduate Student from the University of Washington.
