Shift and Scale is Detrimental To Few-Shot Transfer
Moslem Yazdanpanah · Christian Desrosiers · Mohammad Havaei · Eugene Belilovsky · Samira Ebrahimi Kahou
Event URL: https://openreview.net/forum?id=7OmUSzlgd4a

Batch normalization is a common component of computer vision models, including those typically used for few-shot learning. In convolutional networks, batch normalization consists of a normalization step followed by per-channel trainable affine parameters that shift and scale the normalized features. These affine parameters can speed up model convergence on a source task. However, we demonstrate in this work that, on common few-shot learning benchmarks, training a model on a source task with these affine parameters is detrimental to downstream transfer performance. We study this effect for several methods on well-known benchmarks such as the cross-domain few-shot learning (CD-FSL) benchmark and few-shot image classification on miniImageNet. Removing the affine parameters yields consistent performance gains, particularly in settings with more distant transfer tasks. Improvements from this low-cost and easy-to-implement modification are shown to rival gains obtained by more sophisticated and costly methods.
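To make the two steps described in the abstract concrete, here is a minimal NumPy sketch of per-channel batch normalization with the shift and scale made optional. It is an illustration only, not the authors' implementation: the function name `batch_norm`, the `(N, C, H, W)` input layout, the `eps` value, and the example `gamma`/`beta` values are all assumptions made for this sketch.

```python
import numpy as np


def batch_norm(x, gamma=None, beta=None, eps=1e-5):
    """Per-channel batch normalization for x of shape (N, C, H, W).

    gamma and beta are the optional per-channel scale and shift, each of
    shape (C,). Leaving both as None gives the affine-free variant.
    (Hypothetical helper for illustration; training-mode statistics only.)
    """
    # Normalization step: statistics over batch and spatial axes, per channel.
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)

    # Affine step: per-channel scale, then shift, applied only if provided.
    if gamma is not None:
        x_hat = x_hat * gamma.reshape(1, -1, 1, 1)
    if beta is not None:
        x_hat = x_hat + beta.reshape(1, -1, 1, 1)
    return x_hat


rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(8, 4, 5, 5))

# Affine-free variant: normalization only.
plain = batch_norm(x)

# Standard variant with example (assumed) scale and shift values.
affine = batch_norm(x, gamma=np.full(4, 1.5), beta=np.full(4, 0.5))
```

Without `gamma` and `beta`, every channel of the output has approximately zero mean and unit variance over the batch; with them, each channel is rescaled and offset. In PyTorch, the affine-free variant corresponds to constructing the layer as `torch.nn.BatchNorm2d(num_features, affine=False)`; whether the authors configured their models exactly this way is not stated in the abstract.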

Author Information

Moslem Yazdanpanah (University of Kurdistan)
Christian Desrosiers (Ecole de technologie superieure)
Mohammad Havaei (Imagia)
Eugene Belilovsky (KU Leuven / INRIA)
Samira Ebrahimi Kahou (McGill University)
