
Workshop: Information-Theoretic Principles in Cognitive Systems

On the informativeness of supervision signals

Ilia Sucholutsky · Raja Marjieh · Tom Griffiths


Learning transferable representations by training a classifier is a well-established technique in deep learning (e.g., ImageNet pretraining), but there is a lack of theory to explain why this kind of task-specific pretraining should result in 'good' representations. We conduct an information-theoretic analysis of several commonly used supervision signals to determine how they contribute to representation-learning performance and how the dynamics are affected by training parameters such as the number of labels, classes, and dimensions in the training dataset. We confirm these results empirically in a series of simulations and conduct a cost-benefit analysis to establish a tradeoff curve that allows users to optimize the cost of supervising representation learning.
