NeurIPS Poster On the Value of Target Data in Transfer Learning

Steve Hanneke · Samory Kpotufe

East Exhibition Hall B + C #227

Keywords: [ Multitask and Transfer Learning ] [ Algorithms ] [ Theory ] [ Learning Theory ]

Abstract:

We aim to understand the value of additional labeled or unlabeled target data in transfer learning, for any given amount of source data. This is motivated by practical questions around minimizing sampling costs: target data is usually harder or costlier to acquire than source data, but can yield better accuracy.

To this end, we establish the first minimax rates in terms of both source and target sample sizes, and show that performance limits are captured by new notions of discrepancy between source and target, which we refer to as transfer exponents.

Interestingly, we find that attaining minimax performance is akin to ignoring one of the source or target samples, provided distributional parameters were known a priori. Moreover, we show that practical decisions about minimizing sampling costs can be made in a minimax-optimal way without knowledge or estimation of either the distributional parameters or the discrepancy between source and target.
