A common assumption in theoretical models of learning, such as the standard PAC model [20], as well as in the design of learning algorithms, is that training instances are drawn according to the same distribution as the unseen test examples. In practice, however, there are many cases where this assumption does not hold. There can be no hope for generalization, of course, when the training and test distributions vastly differ, but when they are less dissimilar, learning can be more successful. The main theme of this workshop is the theoretical, algorithmic, and empirical analysis of such cases, where there is a mismatch between the training and test distributions. This includes the crucial scenario of domain adaptation, where the training examples are drawn from a source domain distinct from the target domain from which the test examples are extracted, and the more general scenario of multiple-source adaptation, where training instances may have been collected from several source domains, all distinct from the target [13]. The topic of the workshop also covers other important problems, such as that of sample bias correction, and has tight connections with problems such as active learning, where the active distribution corresponding to the learner's labeling requests differs from the target distribution. Many other intermediate problems and scenarios appear in practice, all of which will be covered by this workshop. These problems are critical and arise in almost all real-world applications of machine learning; ignoring them can lead to dramatically poor results. Straightforward existing solutions based on importance weighting are not always successful [5]. Which algorithms should be used for domain adaptation? Under what theoretical conditions will they be successful? How do these algorithms scale to large domain adaptation problems? These are some of the questions that the workshop aims to address.
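To make the importance-weighting baseline mentioned above concrete, the following minimal sketch (not drawn from the workshop or its references) illustrates it in the simplest covariate-shift setting. The Gaussian source and target distributions and the labeling function sin(x) are purely illustrative assumptions; the density ratio is computed exactly here, whereas in practice it must be estimated from data.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical 1-D covariate-shift setup: the labeling function is fixed,
# but training (source) and test (target) inputs follow different Gaussians.
source = norm(loc=0.0, scale=1.0)   # training distribution
target = norm(loc=1.0, scale=1.0)   # test distribution

x = source.rvs(size=20000, random_state=0)   # training inputs
y = np.sin(x)                                # illustrative labels

# Importance weights w(x) = p_target(x) / p_source(x); exact here because
# both densities are known, which is rarely the case in practice.
w = target.pdf(x) / source.pdf(x)

unweighted = y.mean()                # estimates the source expectation of y
weighted = np.average(y, weights=w)  # estimates the target expectation of y

# Monte Carlo reference value of the target expectation.
true_target = np.sin(target.rvs(size=200000, random_state=1)).mean()
print(abs(weighted - true_target) < abs(unweighted - true_target))
```

When the density ratio is unbounded or heavy-tailed, the weights can have very large variance, which is one reason such straightforward importance-weighting solutions are not always successful [5].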
The problem of domain adaptation and the related problems already mentioned are crucial in practice. They arise in a variety of applications in natural language processing [7, 2, 10, 4, 6], speech processing [11, 8, 17, 19, 9, 18], computer vision [15], and many other areas.
Evaluating the empirical performance of domain adaptation in these applications, designing new and effective algorithms, and building a solid theoretical framework for domain adaptation, as initiated by recent work [1, 13, 12, 14, 5], are all challenging objectives for this workshop. By bringing together current experts in all aspects of this problem, we aim to foster collaborations and successful progress in this field.
Goals:
Despite the recent advances in domain adaptation, many of the most successful practical achievements in domain adaptation [3, 16, 21] have not been robust, in part because no formal assumptions characterize when they can be expected to perform well. At the same time, some of the most influential theoretical work guarantees near-optimal performance in new domains, but under assumptions that may not hold in practice [1, 12, 13].
Our workshop will bridge theory and practice in the following ways:
1. We will have one applied and two theoretical invited talks.
2. We will advertise the workshop to both the applied and theoretical communities.
3. We will hold discussion sessions that emphasize both the formal assumptions underlying successful practical algorithms and new algorithms built on theoretical foundations.
Workshop attendees should come away with an understanding of the domain adaptation problem, how it appears in practical applications, and the theoretical guarantees that can be provided in this more general setting. More importantly, attendees will be exposed to the important open problems of the field, which we expect will encourage new collaborations and results.
References:
[1] S. Ben-David, J. Blitzer, K. Crammer, and F. Pereira. Analysis of representations for domain adaptation. In Proceedings of NIPS 2006, 2007.
[2] J. Blitzer, M. Dredze, and F. Pereira. Biographies, Bollywood, Boomboxes and Blenders: Domain Adaptation for Sentiment Classification. In ACL 2007, 2007.
[3] J. Blitzer, R. McDonald, and F. Pereira. Domain adaptation with structural correspondence learning. In Conference on Empirical Methods in Natural Language Processing, Sydney, Australia, 2006.
[4] C. Chelba and A. Acero. Adaptation of maximum entropy capitalizer: Little data can help a lot. Computer Speech & Language, 20(4):382–399, 2006.
[5] C. Cortes, Y. Mansour, and M. Mohri. Learning bounds for importance weighting. In Advances in Neural Information Processing Systems (NIPS 2010), Vancouver, Canada, 2010. MIT Press.
[6] H. Daumé III and D. Marcu. Domain adaptation for statistical classifiers. Journal of Artificial Intelligence Research, 26:101–126, 2006.
[7] M. Dredze, J. Blitzer, P. P. Talukdar, K. Ganchev, J. Graca, and F. Pereira. Frustratingly Hard Domain Adaptation for Parsing. In CoNLL 2007, 2007.
[8] J.-L. Gauvain and Chin-Hui Lee. Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains. IEEE Transactions on Speech and Audio Processing, 2(2):291–298, 1994.
[9] F. Jelinek. Statistical Methods for Speech Recognition. The MIT Press, 1998.
[10] J. Jiang and C. Zhai. Instance weighting for domain adaptation in NLP. In Proceedings of ACL 2007, pages 264–271. Association for Computational Linguistics, 2007.
[11] C. J. Leggetter and P. C. Woodland. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models. Computer Speech and Language, pages 171–185, 1995.
[12] Y. Mansour, M. Mohri, and A. Rostamizadeh. Domain adaptation: Learning bounds and algorithms. Conference on Learning Theory, 2009.
[13] Y. Mansour, M. Mohri, and A. Rostamizadeh. Domain adaptation with multiple sources. In Advances in Neural Information Processing Systems (NIPS 2008), pages 1041–1048, Vancouver, Canada, 2009. MIT Press.
[14] Y. Mansour, M. Mohri, and A. Rostamizadeh. Multiple source adaptation and the Rényi divergence. In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI 2009), Montréal, Canada, June 2009.
[15] A. M. Martínez. Recognizing imprecisely localized, partially occluded, and expression variant faces from a single sample per class. IEEE Trans. Pattern Anal. Mach. Intell., 24(6):748–763, 2002.
[16] D. McClosky, E. Charniak, and M. Johnson. Reranking and self-training for parser adaptation. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, pages 337–344. Association for Computational Linguistics, 2006.
[17] S. D. Pietra, V. D. Pietra, R. L. Mercer, and S. Roukos. Adaptive language modeling using minimum discriminant estimation. In HLT '91: Proceedings of the Workshop on Speech and Natural Language, pages 103–106, 1992.
[18] B. Roark and M. Bacchiani. Supervised and unsupervised PCFG adaptation to novel domains. In Proceedings of HLT-NAACL, 2003.
[19] R. Rosenfeld. A maximum entropy approach to adaptive statistical language modeling. Computer Speech and Language, 10:187–228, 1996.
[20] L. G. Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134–1142, 1984.
[21] G. Xue, W. Dai, Q. Yang, and Y. Yu. Topic-bridged PLSA for cross-domain text classification. In SIGIR, 2008.
Author Information
John Blitzer (Google Research)
Corinna Cortes (Google Research)
Afshin Rostamizadeh (UC Berkeley)