The focus in machine learning has broadened beyond training classifiers on a single task to investigating how previously acquired knowledge in a source domain can be leveraged to facilitate learning in a related target domain, known as inductive transfer learning. Three active lines of research have independently explored transfer learning using neural networks. In weight transfer, a model trained on the source domain is used as an initialization point for a network to be trained on the target domain. In deep metric learning, the source domain is used to construct an embedding that captures class structure in both the source and target domains. In few-shot learning, the focus is on generalizing well in the target domain from a limited number of labeled examples. We compare state-of-the-art methods from these three paradigms and also explore hybrid adapted-embedding methods that use limited target-domain data to fine-tune embeddings constructed from source-domain data. We conduct a systematic comparison of methods across a variety of domains, varying both the number of labeled instances available in the target domain (k) and the number of target-domain classes. We reach three principal conclusions: (1) deep embeddings are a far superior starting point for inter-domain transfer and model re-use than weight transfer; (2) our hybrid methods robustly outperform every previously proposed few-shot learning and deep metric learning method, with a mean error reduction of 34% over the state of the art; and (3) among loss functions for discovering embeddings, the histogram loss (Ustinova & Lempitsky, 2016) is the most robust. We hope our results will motivate a unification of research on weight transfer, deep metric learning, and few-shot learning.
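To make the hybrid adapted-embedding recipe concrete, here is a minimal PyTorch sketch of its three steps: embed inputs with a network trained on the source domain under a metric loss (here, a compact rendering of the histogram loss), fine-tune that embedding on the k labeled examples per target-domain class, and classify queries by nearest class centroid. The architecture, hyperparameters, and helper names (EmbeddingNet, adapt_embedding, nearest_centroid_predict) are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingNet(nn.Module):
    """Illustrative embedding network; the paper's architectures differ."""
    def __init__(self, in_dim=784, embed_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                 nn.Linear(256, embed_dim))

    def forward(self, x):
        # L2-normalize so cosine similarity is a dot product in [-1, 1].
        return F.normalize(self.net(x), dim=1)

def histogram_loss(emb, labels, num_bins=51):
    """Compact rendering of the histogram loss (Ustinova & Lempitsky, 2016):
    the estimated probability that a random negative pair is more similar
    than a random positive pair, via soft histograms of cosine similarities."""
    sims = emb @ emb.t()
    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    off_diag = ~torch.eye(len(labels), dtype=torch.bool, device=emb.device)
    pos, neg = sims[same & off_diag], sims[~same]
    centers = torch.linspace(-1.0, 1.0, num_bins, device=emb.device)
    width = 2.0 / (num_bins - 1)
    def soft_hist(x):  # linear-kernel histogram, differentiable in x
        w = (1 - (x.unsqueeze(1) - centers).abs() / width).clamp(min=0)
        return w.sum(dim=0) / x.numel()
    h_pos, h_neg = soft_hist(pos), soft_hist(neg)
    # Accumulate P(negative similarity exceeds positive similarity).
    return (h_neg * torch.cumsum(h_pos, dim=0)).sum()

def adapt_embedding(model, support_x, support_y, steps=100, lr=1e-4):
    """Hybrid step: fine-tune the source-trained embedding on the k labeled
    examples per class available in the target domain."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        histogram_loss(model(support_x), support_y).backward()
        opt.step()
    return model

def nearest_centroid_predict(model, support_x, support_y, query_x):
    """Classify target-domain queries by distance to per-class centroids
    of the k support embeddings."""
    with torch.no_grad():
        sup = model(support_x)
        classes = support_y.unique()
        centroids = torch.stack([sup[support_y == c].mean(dim=0) for c in classes])
        d = torch.cdist(model(query_x), centroids)
    return classes[d.argmin(dim=1)]
```

Under these assumptions, source-domain pre-training would apply the same histogram_loss to source batches; the fine-tuning step differs only in the data it sees, which is what makes the adapted-embedding hybrid a small modification of standard deep metric learning.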
Author Information
- Tyler Scott (University of Colorado, Boulder)
- Karl Ridgeway (University of Colorado, Boulder)
- Michael Mozer (Google Brain / University of Colorado)
Related Events (a corresponding poster, oral, or spotlight)
- 2018 Poster: Adapted Deep Embeddings: A Synthesis of Methods for k-Shot Inductive Transfer Learning
  Tue, Dec 4th through Wed, Dec 5th, Room 517 AB #167
More from the Same Authors
- 2022: Neural Network Online Training with Sensitivity to Multiscale Temporal Structure
  Matt Jones · Tyler Scott · Gamaleldin Elsayed · Mengye Ren · Katherine Hermann · David Mayo · Michael Mozer
- 2022: An Empirical Study on Clustering Pretrained Embeddings: Is Deep Strictly Better?
  Tyler Scott · Ting Liu · Michael Mozer · Andrew Gallagher
- 2018 Poster: Learning Deep Disentangled Embeddings With the F-Statistic Loss
  Karl Ridgeway · Michael Mozer
- 2018 Poster: Sparse Attentive Backtracking: Temporal Credit Assignment Through Reminding
  Nan Rosemary Ke · Anirudh Goyal · Olexa Bilaniuk · Jonathan Binas · Michael Mozer · Chris Pal · Yoshua Bengio
- 2018 Spotlight: Sparse Attentive Backtracking: Temporal Credit Assignment Through Reminding
  Nan Rosemary Ke · Anirudh Goyal · Olexa Bilaniuk · Jonathan Binas · Michael Mozer · Chris Pal · Yoshua Bengio
- 2017: Access consciousness and the construction of actionable representations
  Michael C Mozer
- 2017: Workshop overview
  Michael Mozer · Angela Yu · Brenden Lake
- 2017 Workshop: Cognitively Informed Artificial Intelligence: Insights From Natural Intelligence
  Michael Mozer · Brenden Lake · Angela Yu
- 2016: Overcoming temptation: Incentive design for intertemporal choice
  Michael Mozer
- 2016: Opening Remarks, Invited Talk: Michael C. Mozer
  Michael Mozer
- 2014 Workshop: Human Propelled Machine Learning
  Richard Baraniuk · Michael Mozer · Divyanshu Vats · Christoph Studer · Andrew E Waters · Andrew Lan
- 2014 Poster: Automatic Discovery of Cognitive Skills to Improve the Prediction of Student Learning
  Robert Lindsey · Mohammad Khajah · Michael Mozer
- 2013 Poster: Optimizing Instructional Policies
  Robert Lindsey · Michael Mozer · William J Huggins · Harold Pashler
- 2013 Oral: Optimizing Instructional Policies
  Robert Lindsey · Michael Mozer · William J Huggins · Harold Pashler
- 2012 Workshop: Personalizing education with machine learning
  Michael Mozer · Javier R Movellan · Robert Lindsey · Jacob Whitehill
- 2011 Poster: An Unsupervised Decontamination Procedure For Improving The Reliability Of Human Judgments
  Michael Mozer · Benjamin Link · Harold Pashler
- 2010 Spotlight: Improving Human Judgments by Decontaminating Sequential Dependencies
  Michael Mozer · Harold Pashler · Matthew Wilder · Robert Lindsey · Matt Jones · Michael Jones
- 2010 Poster: Improving Human Judgments by Decontaminating Sequential Dependencies
  Michael Mozer · Harold Pashler · Matthew Wilder · Robert Lindsey · Matt Jones · Michael Jones
- 2009 Poster: Predicting the Optimal Spacing of Study: A Multiscale Context Model of Memory
  Michael Mozer · Harold Pashler · Nicholas Cepeda · Robert Lindsey · Edward Vul
- 2009 Spotlight: Predicting the Optimal Spacing of Study: A Multiscale Context Model of Memory
  Michael Mozer · Harold Pashler · Nicholas Cepeda · Robert Lindsey · Edward Vul
- 2009 Poster: Sequential effects reflect parallel learning of multiple environmental regularities
  Matthew Wilder · Matt Jones · Michael Mozer
- 2008 Poster: Optimal Response Initiation: Why Recent Experience Matters
  Matt Jones · Michael Mozer · Sachiko Kinoshita
- 2008 Spotlight: Optimal Response Initiation: Why Recent Experience Matters
  Matt Jones · Michael Mozer · Sachiko Kinoshita
- 2008 Poster: Temporal Dynamics of Cognitive Control
  Jeremy Reynolds · Michael Mozer
- 2007 Spotlight: Experience-Guided Search: A Theory of Attentional Control
  Michael Mozer · David Baldwin
- 2007 Poster: Experience-Guided Search: A Theory of Attentional Control
  Michael Mozer · David Baldwin
- 2006 Poster: Context Effects in Category Learning: An Investigation of Four Probabilistic Models
  Michael Mozer · Michael Jones · Michael Shettel